When filtering etcd or api-server pods, the readiness status of the pods was not checked. #3021

Closed
yxxhero opened this issue Feb 12, 2024 · 8 comments · Fixed by kubernetes/kubernetes#123489
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.
Milestone
v1.30

Comments

@yxxhero
Member

yxxhero commented Feb 12, 2024

What keywords did you search in kubeadm issues before filing this one?

If you have found any duplicates, you should instead reply there and close this page.

If you have not found any duplicates, delete this section and continue on.

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): all

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Container runtime (CRI) (e.g. containerd, cri-o):
  • Container networking plugin (CNI) (e.g. Calico, Cilium):
  • Others:

What happened?

When filtering etcd or api-server pods, the readiness status of the pods was not checked.

What you expected to happen?

When filtering etcd or api-server pods, the readiness status of the pods should be checked.

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

@neolit123
Member

You need to provide more information.

@neolit123 neolit123 added the kind/bug and priority/awaiting-more-evidence labels Feb 12, 2024
@neolit123 neolit123 added this to the v1.30 milestone Feb 12, 2024
@yxxhero
Member Author

yxxhero commented Feb 12, 2024

func getRawEtcdEndpointsFromPodAnnotationWithoutRetry(client clientset.Interface) ([]string, int, error) {
	klog.V(3).Infof("retrieving etcd endpoints from %q annotation in etcd Pods", constants.EtcdAdvertiseClientUrlsAnnotationKey)
	podList, err := client.CoreV1().Pods(metav1.NamespaceSystem).List(
		context.TODO(),
		metav1.ListOptions{
			LabelSelector: fmt.Sprintf("component=%s,tier=%s", constants.Etcd, constants.ControlPlaneTier),
		},
	)
	if err != nil {
		return []string{}, 0, err
	}
	etcdEndpoints := []string{}
	for _, pod := range podList.Items {
		etcdEndpoint, ok := pod.ObjectMeta.Annotations[constants.EtcdAdvertiseClientUrlsAnnotationKey]
		if !ok {
			klog.V(3).Infof("etcd Pod %q is missing the %q annotation; cannot infer etcd advertise client URL using the Pod annotation", pod.ObjectMeta.Name, constants.EtcdAdvertiseClientUrlsAnnotationKey)
			continue
		}
		etcdEndpoints = append(etcdEndpoints, etcdEndpoint)
	}
	return etcdEndpoints, len(podList.Items), nil
}

We can see from the code above that it doesn't check the ready status of the pod. @neolit123

@neolit123
Member

We can see from the code above that it doesn't check the ready status of the pod. @neolit123

Why is this a problem that we need to fix?

@yxxhero
Member Author

yxxhero commented Feb 12, 2024

etcd-node001                                      0/1     CrashLoopBackOff    18074 (3m20s ago)   4d16h   1.1.1.1        node001     <none>           <none>
etcd-node002                                      1/1     Running             12 (6d3h ago)       69d     1.1.1.1       node002     <none>           <none>
etcd-node005                                      1/1     Running             10 (8d ago)         69d     1.1.1.1       node005     <none>           <none>
etcd-node003                                      1/1     Running             8 (19d ago)         67d     1.1.1.1       node003     <none>           <none>
etcd-node004                                      1/1     Running             7 (60d ago)         67d     1.1.1.1        node004     <none>           <none>

See the output above. Should we add the unhealthy pod as an etcd node? @neolit123 Thanks so much.

@neolit123
Member

I think getRawEtcdEndpointsFromPodAnnotationWithoutRetry is supposed to get the annotations even if the Pod is not ready. Readiness is checked during upgrade.

are you seeing this as an actual problem on your clusters?

@yxxhero
Member Author

yxxhero commented Feb 13, 2024

@neolit123 Makes sense. Maybe we can log something to notify the user?

@neolit123
Member

We could log with klog.V(3).Infof if the Pod is not ready. To my understanding, that is what you'd like to be changed?
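
For reference, a minimal sketch of that change (illustrative only, not the actual fix that landed via kubernetes/kubernetes#123489). isPodReady is a hypothetical local helper over pod.Status.Conditions, and v1 is assumed to be k8s.io/api/core/v1:

// Inside the Pod loop of getRawEtcdEndpointsFromPodAnnotationWithoutRetry,
// before reading the annotation, a low-verbosity hint could be emitted when
// the Pod is not ready:
//
//	if !isPodReady(&pod) {
//		klog.V(3).Infof("etcd Pod %q is not ready", pod.ObjectMeta.Name)
//	}

// isPodReady reports whether the Pod's standard PodReady condition is True.
// Illustrative helper only; kubeadm may already have an equivalent utility.
func isPodReady(pod *v1.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == v1.PodReady {
			return cond.Status == v1.ConditionTrue
		}
	}
	return false
}

The endpoint collection itself would stay unchanged, since the annotation is still needed from not-ready Pods; only the log line is added.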

@yxxhero
Member Author

yxxhero commented Feb 14, 2024

Yeah, it's OK for now. I will do that, alright? Thanks so much @neolit123
