
Kind stops responding a few minutes after cluster creation #2221

Closed
georgeslb opened this issue Apr 28, 2021 · 7 comments

Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@georgeslb

What happened:
I created a kind cluster two weeks ago.
It was working as it should for two weeks.
For some reason we had to restart the Docker daemon.
Because of issue #148, we deleted the old cluster and created a new one in exactly the same way.

The new cluster starts well.
The new cluster works fine for a few minutes; it then stops answering.

Here is what the command `kubectl get pods -A | wc -l` prints every 10 seconds after the cluster starts:

Time 14:53:48
pods 10
Time 14:53:58
pods 10
Time 14:54:08
pods 10
Time 14:54:18
pods 10
Time 14:54:28
pods 10
Time 14:54:38
pods 10
Time 14:54:48
pods 10
Time 14:54:58
pods 10
Time 14:55:08
pods 10
Time 14:55:18
pods 10
Time 14:55:28
pods 10
Time 14:55:38
pods 10
Time 14:55:48
pods 10
Time 14:55:59
pods 10
Time 14:56:09
pods 10
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get pods)
Time 14:56:29
pods 0
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get pods)
Time 14:56:49
pods 0
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get pods)
Time 14:57:09
pods 0
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get pods)
Time 14:57:29
pods 0
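
For reference, a minimal sketch of a polling loop that produces output in this shape (assumed; the exact script is not part of the report):

```bash
# Hypothetical polling loop: print a timestamp and the pod-line count every 10 seconds.
# When the API server stops answering, kubectl prints the error to stderr and
# wc -l counts an empty stdout, hence "pods 0".
while true; do
  echo "Time $(date +%H:%M:%S)"
  echo "pods $(kubectl get pods -A | wc -l)"
  sleep 10
done
```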

What you expected to happen:
I expect kind to work as usual.
I tried the following (roughly sketched below):
Delete the .kube folder
Delete and recreate the cluster many times
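
A minimal sketch of those attempts (commands assumed; cluster name taken from the creation script further down):

```bash
# Hypothetical sketch of the recovery attempts listed above.
rm -rf ~/.kube                           # drop the old kubeconfig
kind delete cluster --name=k8s-cluster   # remove the broken cluster
# ...then recreate it with the same creation script shown under "Anything else we need to know?"
```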

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
Script used to create the cluster:

```bash
export HTTP_PROXY=...
export HTTPS_PROXY=...
export http_proxy=...
export https_proxy=...
export NO_PROXY=...
export no_proxy=...
cat <<EOF | kind create cluster --name=k8s-cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry:5555"]
    endpoint = ["http://registry:5555"]
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
  extraMounts:
  - hostPath: /var/opt/data/cluster-k8s
    containerPath: /data
EOF
```

The cluster container is still up:
docker ps

a1eb4ee25e2a kindest/node:v1.20.2 "/usr/local/bin/entr…" 16 minutes ago Up 16 minutes 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 127.0.0.1:40771->6443/tcp k8s-cluster-control-plane

kubectl version --v=7

```
I0428 15:11:30.805532 74379 loader.go:379] Config loaded from file: /root/.kube/config
I0428 15:11:30.823022 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:30.823034 74379 round_trippers.go:429] Request Headers:
I0428 15:11:30.823039 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:30.823044 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:30.824001 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:30.826210 74379 request.go:943] Got a Retry-After 1s response for attempt 1 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:31.826355 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:31.826375 74379 round_trippers.go:429] Request Headers:
I0428 15:11:31.826380 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:31.826385 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:31.826794 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:31.826839 74379 request.go:943] Got a Retry-After 1s response for attempt 2 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:32.827006 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:32.827032 74379 round_trippers.go:429] Request Headers:
I0428 15:11:32.827038 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:32.827042 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:32.827616 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:32.827671 74379 request.go:943] Got a Retry-After 1s response for attempt 3 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:33.827820 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:33.827840 74379 round_trippers.go:429] Request Headers:
I0428 15:11:33.827846 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:33.827851 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:33.828231 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:33.828272 74379 request.go:943] Got a Retry-After 1s response for attempt 4 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:34.828402 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:34.828424 74379 round_trippers.go:429] Request Headers:
I0428 15:11:34.828430 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:34.828435 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:34.828858 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:34.828894 74379 request.go:943] Got a Retry-After 1s response for attempt 5 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:35.829050 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:35.829068 74379 round_trippers.go:429] Request Headers:
I0428 15:11:35.829074 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:35.829079 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:35.829420 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:35.829501 74379 request.go:943] Got a Retry-After 1s response for attempt 6 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:36.829660 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:36.829681 74379 round_trippers.go:429] Request Headers:
I0428 15:11:36.829687 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:36.829693 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:36.830203 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:36.830246 74379 request.go:943] Got a Retry-After 1s response for attempt 7 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:37.830372 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:37.830392 74379 round_trippers.go:429] Request Headers:
I0428 15:11:37.830398 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:37.830403 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:37.830776 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:37.830817 74379 request.go:943] Got a Retry-After 1s response for attempt 8 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:38.830959 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:38.830980 74379 round_trippers.go:429] Request Headers:
I0428 15:11:38.830986 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:38.830991 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:38.831489 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:38.831522 74379 request.go:943] Got a Retry-After 1s response for attempt 9 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:39.831691 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:39.831710 74379 round_trippers.go:429] Request Headers:
I0428 15:11:39.831716 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:39.831721 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:39.832112 74379 round_trippers.go:448] Response Status: in 0 milliseconds
I0428 15:11:39.832142 74379 request.go:943] Got a Retry-After 1s response for attempt 10 to https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:40.832295 74379 round_trippers.go:422] GET https://127.0.0.1:40771/version?timeout=32s
I0428 15:11:40.832314 74379 round_trippers.go:429] Request Headers:
I0428 15:11:40.832320 74379 round_trippers.go:433] Accept: application/json, /
I0428 15:11:40.832325 74379 round_trippers.go:433] User-Agent: kubectl/v1.20.4 (linux/amd64) kubernetes/e87da0b
I0428 15:11:40.832937 74379 round_trippers.go:448] Response Status: in 0 milliseconds
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
I0428 15:11:40.833167 74379 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "an error on the server ("") has prevented the request from succeeding",
"reason": "InternalError",
"details": {
"causes": [
{
"reason": "UnexpectedServerResponse"
}
],
"retryAfterSeconds": 1
},
"code": 500
}]
F0428 15:11:40.833192 74379 helpers.go:115] Error from server (InternalError): an error on the server ("") has prevented the request from succeeding
goroutine 1 [running]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc00000e001, 0xc0007de2d0, 0x97, 0xe8)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x2a74040, 0xc000000003, 0x0, 0x0, 0xc0003fa000, 0x28546e3, 0xa, 0x73, 0x40e200)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:975 +0x19b
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).printDepth(0x2a74040, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0x2, 0xc000070ef0, 0x1, 0x1)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:732 +0x16f
k8s.io/kubernetes/vendor/k8s.io/klog/v2.FatalDepth(...)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1488
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.fatal(0xc0007c4380, 0x68, 0x1)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:93 +0x26e
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.checkErr(0x1cf4220, 0xc00065a780, 0x1bbf138)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:188 +0x945
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.CheckErr(...)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:115
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/version.NewCmdVersion.func1(0xc0008082c0, 0xc00024a620, 0x0, 0x1)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/version/version.go:79 +0x117
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc0008082c0, 0xc00024a610, 0x1, 0x1, 0xc0008082c0, 0xc00024a610)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:854 +0x2c2
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000144b00, 0xc00007a180, 0xc00003a180, 0x3)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:958 +0x375
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(...)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:895
main.main()
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubectl/kubectl.go:49 +0x21d

goroutine 6 [chan receive]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).flushDaemon(0x2a74040)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1169 +0x8b
created by k8s.io/kubernetes/vendor/k8s.io/klog/v2.init.0
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:417 +0xdf

goroutine 33 [select]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x1bbf058, 0x1cf4520, 0xc00053c000, 0x1, 0xc00007eb40)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:167 +0x149
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x1bbf058, 0x12a05f200, 0x0, 0x1, 0xc00007eb40)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0x1bbf058, 0x12a05f200, 0xc00007eb40)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/util/logs.InitLogs
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/util/logs/logs.go:51 +0x96

goroutine 69 [runnable]:
net/http.setRequestCancel.func4(0x0, 0xc00002e420, 0xc00031bae0, 0xc00024fde8, 0xc0007caf00)
/usr/local/go/src/net/http/client.go:398 +0xe5
created by net/http.setRequestCancel
/usr/local/go/src/net/http/client.go:397 +0x337
```
Environment:

  • kind version: (use kind version): kind v0.10.0 go1.15.7 linux/amd64
  • Kubernetes version: (use kubectl version): Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
    Client:
    Context: default
    Debug Mode: false
    Plugins:
    app: Docker App (Docker Inc., v0.9.1-beta3)
    buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
Containers: 17
Running: 8
Paused: 0
Stopped: 9
Images: 185
Server Version: 20.10.4
Storage Driver: devicemapper
Pool Name: infravg-thinpool
Pool Blocksize: 131.1kB
Base Device Size: 64.42GB
Backing Filesystem: xfs
Udev Sync Supported: true
Data Space Used: 217.7GB
Data Space Total: 257.7GB
Data Space Available: 40.03GB
Metadata Space Used: 81.35MB
Metadata Space Total: 125.8MB
Metadata Space Available: 44.48MB
Thin Pool Minimum Free Space: 25.77GB
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.170-RHEL7 (2020-03-24)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive

  • OS (e.g. from /etc/os-release):
    NAME="Red Hat Enterprise Linux Server"
    VERSION="7.3 (Maipo)"
    ID="rhel"
    ID_LIKE="fedora"
    VERSION_ID="7.3"
    PRETTY_NAME="Red Hat"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:redhat:enterprise_linux:7.3:GA:server"
    HOME_URL="https://www.redhat.com/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"

Thanks

@georgeslb georgeslb added the kind/bug Categorizes issue or PR as related to a bug. label Apr 28, 2021
@aojea
Contributor

aojea commented Apr 28, 2021

Is it possible that you are hitting some system resource constraints?

Data Space Used: 217.7GB
Data Space Total: 257.7GB
Data Space Available: 40.03GB
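
A quick way to check (a sketch; generic docker/host checks, not specific to kind):

```bash
# Free space on the host path backing docker, and docker's own usage breakdown.
df -h /var/lib/docker
docker system df
# Devicemapper thin-pool usage as reported by the daemon (matches the figures quoted above).
docker info | grep Space
```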

@georgeslb
Author

Hi @aojea,
I do not think it's a space issue. How should I proceed to confirm this?

I investigated further; here's what I found:

A few seconds after the cluster is ready, curling the exposed kind container port is OK:

docker ps
f461fa6884bd kindest/node:v1.20.2 "/usr/local/bin/entr…" 9 seconds ago Up 5 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 127.0.0.1:34527->6443/tcp k8s-cluster-control-plane

curl --insecure https://127.0.0.1:34527
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "forbidden: User "system:anonymous" cannot get path "/"",
"reason": "Forbidden",
"details": {
},
"code": 403
}

A few minutes later, curling the exposed kind container port gives an error:

curl --insecure https://127.0.0.1:34527
curl: (35) Encountered end of file

However, curling the internal container port is still OK:

docker exec -ti k8s-cluster-control-plane curl --insecure https://127.0.0.1:6443
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "forbidden: User "system:anonymous" cannot get path "/"",
"reason": "Forbidden",
"details": {
},
"code": 403
}

I tried restarting the Docker daemon, but I still have the same issue.

@BenTheElder
Member

The difference between the exposed port and the internal port is purely docker's port forwarding, which kind is not at all responsible for ...
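
A couple of host-side checks on that forwarding layer (a sketch; container name and host port taken from the docker ps and curl output above):

```bash
# Show the port mappings docker has published for the node container.
docker port k8s-cluster-control-plane
# Check whether anything is still listening on the published API server port on the host
# (with the default userland proxy this is a docker-proxy process).
ss -ltnp | grep 34527
```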

@aojea
Contributor

aojea commented Apr 28, 2021

Hmm, are you running firewalld?
The best way to find out is to run `iptables-save > iptables.txt` when it is working and when it is not, then diff the two dumps.
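
For example, a minimal capture-and-diff sketch (file names arbitrary):

```bash
# While the API server is still reachable through the published port:
iptables-save > iptables-working.txt

# After it stops answering:
iptables-save > iptables-broken.txt

# Compare the two rule dumps; look for DOCKER-related chains or rules that changed or disappeared.
diff -u iptables-working.txt iptables-broken.txt
```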

@georgeslb
Author

The difference between the exposed port and the internal port is purely docker's port forwarding, which kind is not at all responsible for ...

This behavior is only happening with my kind container; other containers on the same daemon also have exposed ports and they are working fine.

What is really strange is that the exposed port works fine for a few minutes after the cluster starts, before things go bad.

Do you know if there are any specific logs that may help me?

I'll keep searching for an explanation.

@aojea
Contributor

aojea commented Apr 28, 2021

Hmm, are you running firewalld?
The best way to find out is to run `iptables-save > iptables.txt` when it is working and when it is not, then diff the two dumps.

Attach the iptables dump from when it is working and from when it is not; that can give us the answer.

@georgeslb
Author

I ended up deleting the "kind" docker network (rough sequence sketched below):
docker network rm kind
It seems to be working now... well, it has been working for 30 minutes, a record for today :)
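
For the record, roughly the full sequence (a sketch; cluster name and config as in the creation script above):

```bash
# The network cannot be removed while the node container is still attached,
# so delete the cluster first, then the stale "kind" network, then recreate.
kind delete cluster --name=k8s-cluster
docker network rm kind
# ...then re-run the creation script from "Anything else we need to know?" so kind
# recreates the "kind" network from scratch.
```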
Thanks for your help
