Node status never becomes Ready

Symptoms

Even after deploying Flannel as the Pod network (Pod-to-Pod communication inside the Kubernetes cluster), the node status stays NotReady and never becomes Ready.

The following shows Flannel being deployed.

[root@kube-master ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds created
[root@kube-master ~]#

The following shows the state of the Master node. The kube-master node is listed, and its STATUS is NotReady.

[root@kube-master ~]# kubectl get nodes
NAME          STATUS     ROLES    AGE     VERSION
kube-master   NotReady   master   3m21s   v1.13.1
[root@kube-master ~]#
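
On a cluster with more than one node, it can help to reduce this listing to the nodes that are not Ready. The following is a minimal sketch rather than part of the original procedure: the function name `not_ready_nodes` is hypothetical, and it assumes the default column layout of `kubectl get nodes` shown above.

```shell
# not_ready_nodes (hypothetical helper): reads `kubectl get nodes` output on
# stdin and prints the NAME of every node whose STATUS does not start with
# "Ready" (so "Ready,SchedulingDisabled" still counts as Ready).
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Usage, assuming kubectl is already configured for the cluster:
#   kubectl get nodes | not_ready_nodes
```

An empty result means every listed node reports Ready.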

The following shows the details of the kube-master node. Under Conditions, the Ready condition is False because the network plugin is not ready: the CNI config is uninitialized.

[root@kube-master ~]# kubectl describe node kube-master
Name:               kube-master
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=kube-master
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 04 Jan 2019 23:47:36 +0900
Taints:             node.kubernetes.io/not-ready:NoExecute
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 04 Jan 2019 23:54:06 +0900   Fri, 04 Jan 2019 23:47:31 +0900   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 04 Jan 2019 23:54:06 +0900   Fri, 04 Jan 2019 23:47:31 +0900   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 04 Jan 2019 23:54:06 +0900   Fri, 04 Jan 2019 23:47:31 +0900   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Fri, 04 Jan 2019 23:54:06 +0900   Fri, 04 Jan 2019 23:47:31 +0900   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:  192.168.25.100
  Hostname:    kube-master
Capacity:
 cpu:                2
 ephemeral-storage:  38770180Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1882188Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  35730597829
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1779788Ki
 pods:               110
System Info:
 Machine ID:                 7c68e8ca9a48f08b0bf8f6ac68465636
 System UUID:                7C68E8CA-9A48-F08B-0BF8-F6AC68465636
 Boot ID:                    7d2bcf50-e9b8-46c9-97d0-a0ce68ad689b
 Kernel Version:             3.10.0-957.1.3.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.9.0
 Kubelet Version:            v1.13.1
 Kube-Proxy Version:         v1.13.1
PodCIDR:                     10.244.0.0/24
Non-terminated Pods:         (5 in total)
  Namespace                  Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                   ------------  ----------  ---------------  -------------  ---
  kube-system                etcd-kube-master                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         5m23s
  kube-system                kube-apiserver-kube-master             250m (12%)    0 (0%)      0 (0%)           0 (0%)         5m20s
  kube-system                kube-controller-manager-kube-master    200m (10%)    0 (0%)      0 (0%)           0 (0%)         5m32s
  kube-system                kube-proxy-586mz                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m16s
  kube-system                kube-scheduler-kube-master             100m (5%)     0 (0%)      0 (0%)           0 (0%)         5m13s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                550m (27%)  0 (0%)
  memory             0 (0%)      0 (0%)
  ephemeral-storage  0 (0%)      0 (0%)
Events:
  Type    Reason                   Age                    From                     Message
  ----    ------                   ----                   ----                     -------
  Normal  Starting                 6m42s                  kubelet, kube-master     Starting kubelet.
  Normal  NodeHasSufficientMemory  6m42s (x8 over 6m42s)  kubelet, kube-master     Node kube-master status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    6m42s (x8 over 6m42s)  kubelet, kube-master     Node kube-master status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     6m42s (x7 over 6m42s)  kubelet, kube-master     Node kube-master status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  6m42s                  kubelet, kube-master     Updated Node Allocatable limit across pods
  Normal  Starting                 6m15s                  kube-proxy, kube-master  Starting kube-proxy.
[root@kube-master ~]#
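
The describe output is long, and the only row that matters here is the Ready condition. As a rough sketch, a small filter can pull out just that row. The function name `ready_condition` is hypothetical, and the filter assumes the Conditions table layout shown above.

```shell
# ready_condition (hypothetical helper): reads `kubectl describe node` output
# on stdin and prints only the "Ready" row of the Conditions table, with the
# leading whitespace removed.
ready_condition() {
  awk '$1 == "Ready" && $2 ~ /^(True|False|Unknown)$/ { sub(/^[ \t]+/, ""); print }'
}

# Usage, assuming kubectl is already configured for the cluster:
#   kubectl describe node kube-master | ready_condition
```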

The following shows the state of all Pods. The coredns Pods have a STATUS of Pending. Note also that no kube-flannel Pod appears in the list.

[root@kube-master ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-86c58d9df4-tr4hm              0/1     Pending   0          2m3s
kube-system   coredns-86c58d9df4-z8k89              0/1     Pending   0          2m3s
kube-system   etcd-kube-master                      1/1     Running   0          6m10s
kube-system   kube-apiserver-kube-master            1/1     Running   0          6m7s
kube-system   kube-controller-manager-kube-master   1/1     Running   0          6m19s
kube-system   kube-proxy-586mz                      1/1     Running   0          7m3s
kube-system   kube-scheduler-kube-master            1/1     Running   0          6m
[root@kube-master ~]#
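
Two things stand out in the listing above: the coredns Pods are Pending, and no kube-flannel Pod is listed even though the DaemonSet was created. The sketch below checks for both conditions from the same output. Both function names are hypothetical, and the helpers assume the default column layout of `kubectl get pods --all-namespaces`.

```shell
# Both helpers are hypothetical and read `kubectl get pods --all-namespaces`
# output on stdin.

# Print "namespace/name status" for every Pod whose STATUS is not Running.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 " " $4 }'
}

# Exit 0 if at least one Pod name starts with "kube-flannel", otherwise exit 1.
has_flannel_pod() {
  awk 'NR > 1 && $2 ~ /^kube-flannel/ { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage, assuming kubectl is already configured for the cluster:
#   kubectl get pods --all-namespaces | pods_not_running
#   kubectl get pods --all-namespaces | has_flannel_pod || echo "no flannel Pod"
```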

Solution

CoreDNS, which Kubernetes uses, was not starting because the Pod network was not ready. Switching to a different Flannel manifest resolved the problem.

First, reset the Kubernetes cluster initialization.

[root@kube-master ~]# kubeadm reset
[reset] WARNING: changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] are you sure you want to proceed? [y/N]: y
[preflight] running pre-flight checks
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[reset] stopping the kubelet service
[reset] unmounting mounted directories in "/var/lib/kubelet"
[reset] deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]
[reset] deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually.
For example: 
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

[root@kube-master ~]#

Initialize the Kubernetes cluster again.

[root@kube-master ~]# kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
	[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.25.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-master localhost] and IPs [192.168.25.100 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-master localhost] and IPs [192.168.25.100 127.0.0.1 ::1]
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 18.502948 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-master" as an annotation
[mark-control-plane] Marking the node kube-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kube-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: br26xc.mn3s5j7a69ky7js0
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.25.100:6443 --token br26xc.mn3s5j7a69ky7js0 --discovery-token-ca-cert-hash sha256:0d30df7dcda908e16a58a77f36a429351f42689bed80ca8858c57f887d9d32a2

[root@kube-master ~]#

Deploy a different Flannel manifest (the one on the master branch instead of v0.10.0).

[root@kube-master ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created
[root@kube-master ~]#

The following shows the state of all Pods. The coredns Pods now have a STATUS of Running.

[root@kube-master ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-86c58d9df4-2nwhp              1/1     Running   0          78s
kube-system   coredns-86c58d9df4-qw5k4              1/1     Running   0          78s
kube-system   etcd-kube-master                      1/1     Running   0          12s
kube-system   kube-apiserver-kube-master            1/1     Running   0          24s
kube-system   kube-controller-manager-kube-master   1/1     Running   0          21s
kube-system   kube-flannel-ds-amd64-chkf8           1/1     Running   0          54s
kube-system   kube-proxy-246mb                      1/1     Running   0          78s
kube-system   kube-scheduler-kube-master            1/1     Running   0          9s
[root@kube-master ~]#
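
To confirm the result without reading every row, the listing can be checked mechanically. The following is a sketch under the same assumptions about the `kubectl get pods --all-namespaces` column layout. The function name `all_pods_ready` is hypothetical.

```shell
# all_pods_ready (hypothetical helper): reads `kubectl get pods --all-namespaces`
# output on stdin. Prints every Pod that is not Running or whose READY column
# is not n/n, and exits 1 if any such Pod exists or if no Pods were listed.
all_pods_ready() {
  awk '
    NR == 1 { next }              # skip the header row
    {
      split($3, r, "/")           # READY column, e.g. "1/1"
      if ($4 != "Running" || r[1] != r[2]) {
        bad = 1
        print $2 " " $3 " " $4
      }
    }
    END { exit ((bad || NR < 2) ? 1 : 0) }
  '
}

# Usage, assuming kubectl is already configured for the cluster:
#   kubectl get pods --all-namespaces | all_pods_ready && echo "all Pods ready"
```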