Scaling the master group

Overview

Worker node groups may be scaled up and down at any time just by changing the number of nodes in the group in the cluster spec.
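
For reference, a minimal sketch of such a change (the group name "group1" and the use of the same initialNodes/minNodes/maxNodes fields as the master group shown later in this article are assumptions for illustration):

  spec:
    ...
    nodes:
      -
        name: group1        # worker node group name (illustrative)
        initialNodes: 5     # change the node counts to scale the group up or down
        minNodes:     5
        maxNodes:     5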

Unlike workers, the master group nodes run the etcd cluster, which stores all Kubernetes cluster data, so scaling the master group up or down is more involved.

This article describes the procedure for scaling the master group up.

Constraints and prerequisites

  1. Currently, the procedure requires cluster downtime.

  2. Only incremental group size increases are supported, e.g. 1->3 and 3->5.

  3. Only scaling the master group up is supported.

  4. The Kubernetes cluster must be in a healthy state, with all master and worker nodes up and running and all conditions green, before the master group can be scaled (see the example check at the end of this list).

  5. A Kublr agent of one of the following versions or higher must be used:

    • 1.17.4-9 (Kubernetes 1.17.4)
    • 1.17.8-2 (Kubernetes 1.17.8)
    • 1.18.5-2 (Kubernetes 1.18.5)
  6. The etcd cluster must be in a healthy state with all members up and running.

    You can validate this by running etcdctl commands in the etcd pods:

    • List etcd pods:

      kubectl get pods -n kube-system -o name | grep etcd
      

      Example output:

      pod/k8s-etcd-3f2598adc74694e693d2efa5733868229158fa22aa0388ed5d89ac53c1464159-ip-172-16-13-136.ec2.internal
      pod/k8s-etcd-bb9ac49174362ec876f2f9be250d14c9279cd8883f8376768dd808337961d2a9-ip-172-16-12-207.ec2.internal
      pod/k8s-etcd-f90d086594e05b363c9636dc21036fbe6aac21b84fa94b2eba3fc521ff681af5-ip-172-16-12-214.ec2.internal
      
    • Check etcd cluster members on each master:

      for P in $(kubectl get pods -n kube-system -o name | grep etcd) ; do \
        echo -e "\nEtcd members on pod ${P}:\n"
        kubectl exec -n kube-system "${P}" -- etcdctl member list 2>/dev/null
      done
      

      Example output

      Etcd members on pod pod/k8s-etcd-bb9ac49174362ec876f2f9be250d14c9279cd8883f8376768dd808337961d2a9-ip-172-16-12-207.ec2.internal:
      
      d91c8b1e8e5a44dc, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.12.207:2379, false
      
    • Check etcd cluster endpoints health on each master:

      for P in $(kubectl get pods -n kube-system -o name | grep etcd) ; do \
        echo -e "\nEtcd endpoints on pod ${P}:\n"
        kubectl exec -n kube-system "${P}" -- etcdctl endpoint health --cluster
      done
      

      Example output

      Etcd endpoints on pod pod/k8s-etcd-bb9ac49174362ec876f2f9be250d14c9279cd8883f8376768dd808337961d2a9-ip-172-16-12-207.ec2.internal:
      
      https://172.16.12.207:2379 is healthy: successfully committed proposal: took = 10.515741ms
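
As a quick check for constraint 4, verify that all nodes are registered and Ready and that their conditions look healthy (a minimal sketch using standard kubectl commands):

  # All master and worker nodes must be in Ready state
  kubectl get nodes

  # Optionally, review the detailed node conditions
  kubectl describe nodes | grep -A 7 'Conditions:'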
      

Scaling up from 1 to 3 masters

NB!!! Please note that if you want to scale up from 1 to 5 members, you MUST first follow the scaling procedure from 1 to 3 members, and then the scaling procedure from 3 to 5 members. Scaling from 1 to 5 members directly is not supported.

Adding the 2nd master

  1. Change cluster specification as follows:

    • Set the etcd initial cluster state parameter to existing and specify 2 etcd members:

      spec:
        ...
        master:
          kublrAgentConfig:
            kublr:
              etcd_flag:
                initial_cluster_state: '--initial-cluster-state=existing'
                initial_cluster: '--initial-cluster=etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380'
      

      NB!!! Use the etcd<N>.kublr.local addresses for the --initial-cluster switch in all cases except when you used DNS names for the masters rather than IP addresses. This is usually only possible for baremetal, vSphere, and/or VCD deployments, and possibly in certain experimental configurations for other infrastructure providers.

    • Change the upgrade strategy to disable node draining and to upgrade all nodes at once.

      The upgrade strategy can be changed back to normal after the master group scaling is complete.

      spec:
        ...
        master:
          ...
          updateStrategy:
            drainStrategy:
              skip: true             # skip draining master nodes
            rollingUpdate:
              maxUnavailable: 100%   # upgrade all nodes at once
            type: RollingUpdate
        ...
        nodes:
          ...
          -
            ...
            updateStrategy:
              drainStrategy:
                skip: true           # skip draining worker nodes
              rollingUpdate:
                maxUnavailable: 100% # upgrade all nodes at once
              type: RollingUpdate
        ...
        updateStrategy:
          rollingUpdate:
            maxUpdatedGroups: 1
          type: ResetUpdate          # upgrade all groups at once
      
    • Increase the number of master nodes from 1 to 3 in the cluster specification (in addition to the common changes described above).

      Don’t forget to adjust availability zones accordingly if they are used (e.g. in AWS, GCP, optionally in Azure and vSphere).

      For on-prem (“bring your own infrastructure”) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 3 # change from 1 to 3
          maxNodes:     3 # change from 1 to 3
          minNodes:     3 # change from 1 to 3
          
          locations: # for on-prem/BYOI clusters only ...
            -
              baremetal:
                hosts:
                  - address: 192.168.56.21
                  - address: 192.168.56.22 # specify additional master nodes' addresses
                  - address: 192.168.56.23 # specify additional master nodes' addresses
      

      For on-prem (vSphere) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 3 # change from 1 to 3
          maxNodes:     3 # change from 1 to 3
          minNodes:     3 # change from 1 to 3
          
          locations: # for on-prem/BYOI clusters only ...
            -
              vSphere:
                ipAddresses:
                  - 192.168.56.21
                  - 192.168.56.22 # specify additional master nodes' addresses
                  - 192.168.56.23 # specify additional master nodes' addresses
      

      For on-prem (vCloud Director) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 3 # change from 1 to 3
          maxNodes:     3 # change from 1 to 3
          minNodes:     3 # change from 1 to 3
      
          locations: # for on-prem/BYOI clusters only ...
            -
              vcd:
                ipAddresses:
                  - 192.168.56.21
                  - 192.168.56.22 # specify additional master nodes' addresses
                  - 192.168.56.23 # specify additional master nodes' addresses
      
  2. Run cluster update and wait until all new masters are available and green in the cluster console, and the cluster is back to a green, healthy state.
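
    You can also verify from the command line that all three master nodes have registered and are Ready (a sketch using the kublr.io/node-group label shown later in this article):

    kubectl get nodes -l kublr.io/node-group=master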

  3. Add the 2nd master’s etcd member; run the following commands from the cluster CLI:

    Set kubectl alias for shorter commands:

    alias k='kubectl -n kube-system'
    

    List etcd pods:

    # List etcd pods
    
    $ k get pods -o name | grep etcd
    
    pod/k8s-etcd-01051b38137ab69ab204db17e586f37abdf968f01a997b568c2e00338fbdf4e8-192.168.56.23
    pod/k8s-etcd-1625baa2258e299969c1fa2c9c7765620957dc118eb300c508fadcc59f45133e-192.168.56.22
    pod/k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Pick the pod corresponding to the first member of the cluster and save its name in an environment variable. You can identify the node corresponding to the first master by running the command k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0.

    # Find the first master
    
    $ k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0
    
    NAME            STATUS   ROLES    AGE   VERSION
    192.168.56.21   Ready    master   45m   v1.19.7
    
    # Pick the pod corresponding to the first member of the cluster and save its name in an environment  variable
    
    $ ETCD0_POD=k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Check that the list of members is available in that pod, and the list only includes one member:

    # Check that the list of members is available in that pod, and the list only includes one member
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    7d9376dfea92b837, started, etcd0, https://etcd0.kublr.local:2380, https://192.168.56.21:2379, false
    

    Check the health status of the member endpoint

    # Check the health status of the member endpoint
    
    $ k exec "${ETCD0_POD}" -- etcdctl endpoint health --cluster
    
    https://192.168.56.21:2379 is healthy: successfully committed proposal: took = 21.564722ms
    

    NB!!! DO NOT continue if any of the previous checks fails.

    Add the second member to the etcd cluster and save its etcd member ID.

    NB! If at this point, or after any of the following commands, the output “Error from server: etcdserver: rpc not supported for learner” is printed, just rerun the command.

    # Add the second member to the cluster
    
    $ k exec "${ETCD0_POD}" -- etcdctl member add etcd1 --learner --peer-urls=https://etcd1.kublr.local:2380
    
    Member e92c7bb13cdf265a added to cluster 4badfb8c1654715a
    
    ETCD_NAME="etcd1"
    ETCD_INITIAL_CLUSTER="etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd1.kublr.local:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
    
    # Save the member ID
    
    $ ETCD1_MID="$(k exec "${ETCD0_POD}" -- etcdctl member list | grep etcd1 | cut -d, -f1)" ; echo ${ETCD1_MID}
    
    e92c7bb13cdf265a
    
    # Check the members status
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    1c0f0abe6ae60baa, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.4:2379, false
    e92c7bb13cdf265a, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.5:2379, true
    

    NB!!! DO NOT continue if not all members are marked as “started”. If this happens, remove the member you just added from the etcd cluster using the etcdctl member remove "${ETCD1_MID}" command.

    Promote the second member to a full member. Promoting a member from learner status is possible only after the learner has caught up with the cluster, which may require some time. If the promotion is not successful on the first attempt, wait for some time and try again.

    # Promoting a member from learner status is possible only after the learner
    # has caught up with the cluster, which may require some time.
    # If the promotion is not successful on the first attempt, wait for some time
    # and try again.
    $ k exec "${ETCD0_POD}" -- etcdctl member promote "${ETCD1_MID}"
    
    Member e92c7bb13cdf265a promoted in cluster 4badfb8c1654715a
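
    If you prefer not to retry manually, a simple wait-and-retry loop can be used instead (a sketch; the 10-second interval is an arbitrary choice):

    # Retry the promotion until the learner has caught up with the cluster
    until k exec "${ETCD0_POD}" -- etcdctl member promote "${ETCD1_MID}"; do
      echo "Learner not caught up yet, retrying in 10 seconds..."
      sleep 10
    done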
    

    Verify that both members are started and not in “learner” status (the last column in the printed table must show “false” for all members).

    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    1c0f0abe6ae60baa, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.4:2379, false
    e92c7bb13cdf265a, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.5:2379, false
    

    Now you are ready to add the 3rd member.

Adding the 3rd master

  1. Change cluster specification as follows:

    • Remove the --initial-cluster etcd switch:

      spec:
        ...
        master:
          kublrAgentConfig:
            kublr:
              etcd_flag:
                initial_cluster_state: '--initial-cluster-state=existing'
                # initial_cluster: '--initial-cluster=etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380'
      
  2. Run cluster update and wait until all masters are available and green in the cluster console, and the cluster is back to a green, healthy state.

  3. Add the 3rd master’s etcd member; run the following commands from the cluster CLI:

    Set kubectl alias for shorter commands:

    alias k='kubectl -n kube-system'
    

    List etcd pods:

    # List etcd pods
    
    $ k get pods -o name | grep etcd
    
    pod/k8s-etcd-01051b38137ab69ab204db17e586f37abdf968f01a997b568c2e00338fbdf4e8-192.168.56.23
    pod/k8s-etcd-1625baa2258e299969c1fa2c9c7765620957dc118eb300c508fadcc59f45133e-192.168.56.22
    pod/k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Pick the pod corresponding to the first member of the cluster and save its name in an environment variable. You can identify the node corresponding to the first master by running the command k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0.

    # Find the first master
    
    $ k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0
    
    NAME            STATUS   ROLES    AGE   VERSION
    192.168.56.21   Ready    master   45m   v1.19.7
    
    # Pick the pod corresponding to the first member of the cluster and save its name in an environment  variable
    
    $ ETCD0_POD=k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Check that the list of members is available in that pod, and the list includes two members.

    Verify also that all members are marked as “started” and are not learners; learner members will contain “true” in the last column.

    # Check that the list of members is available in that pod, and the list includes two members
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    7d9376dfea92b837, started, etcd0, https://etcd0.kublr.local:2380, https://192.168.56.21:2379, false
    e92c7bb13cdf265a, started, etcd1, https://etcd1.kublr.local:2380, https://192.168.56.22:2379, false
    

    Check the health status of the member endpoints

    # Check the health status of the member endpoints
    
    $ k exec "${ETCD0_POD}" -- etcdctl endpoint health --cluster
    
    https://192.168.56.21:2379 is healthy: successfully committed proposal: took = 21.564722ms
    https://192.168.56.22:2379 is healthy: successfully committed proposal: took = 23.559718ms
    

    NB!!! DO NOT continue if any of the previous checks fails.

    Add the third member to the etcd cluster and save its etcd member ID.

    NB! If at this point, or after any of the following commands, the output “Error from server: etcdserver: rpc not supported for learner” is printed, just rerun the command.

    # Add the third member to the cluster
    
    $ k exec "${ETCD0_POD}" -- etcdctl member add etcd2 --peer-urls=https://etcd2.kublr.local:2380
    
    Member 77d313c27e421aa2 added to cluster 4badfb8c1654715a
    

    Verify that all three members are started and not in “learner” status (the last column in the printed table must show “false” for all members).

    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    1c0f0abe6ae60baa, started, etcd0, https://etcd0.kublr.local:2380, https://192.168.56.21:2379, false
    e92c7bb13cdf265a, started, etcd1, https://etcd1.kublr.local:2380, https://192.168.56.22:2379, false
    77d313c27e421aa2, started, etcd2, https://etcd2.kublr.local:2380, https://192.168.56.23:2379, false
    

    NB!!! DO NOT continue if not all members are marked as “started”. If this happens, remove the member you just added from the etcd cluster using the etcdctl member remove command, as shown below.
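
    For example, to remove a member that failed to start, you can look up its ID and remove it (a sketch based on commands already used above; only needed in that failure case):

    # Look up the ID of the member that was just added
    $ ETCD2_MID="$(k exec "${ETCD0_POD}" -- etcdctl member list | grep etcd2 | cut -d, -f1)" ; echo ${ETCD2_MID}

    # Remove the failed member from the etcd cluster
    $ k exec "${ETCD0_POD}" -- etcdctl member remove "${ETCD2_MID}"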

Scaling up from 3 to 5 masters

When scaling up from 3 to 5 master nodes, the same procedure is used with the adjustments described below.

NB!!! Please note that if you want to scale up from 1 to 5 members, you MUST first follow the scaling procedure from 1 to 3 members, and then the scaling procedure from 3 to 5 members. Scaling from 1 to 5 members directly is not supported.

Adding the 4th master

  1. Change cluster specification as follows:

    • Set the etcd initial cluster state parameter to existing and specify 4 etcd members, similar to the procedure for scaling up from 1 to 3 nodes:

      spec:
        ...
        master:
          kublrAgentConfig:
            kublr:
              etcd_flag:
                initial_cluster_state: '--initial-cluster-state=existing'
                initial_cluster: '--initial-cluster=etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380,etcd2=https://etcd2.kublr.local:2380,etcd3=https://etcd3.kublr.local:2380'
      

      NB!!! Use the etcd<N>.kublr.local addresses for the --initial-cluster switch in all cases except when you used DNS names for the masters rather than IP addresses. This is usually only possible for baremetal, vSphere, and/or VCD deployments, and possibly in certain experimental configurations for other infrastructure providers.

    • Change the upgrade strategy to disable node draining and to upgrade all nodes at once, the same as for scaling up from 1 to 3 nodes.

      The upgrade strategy can be changed back to normal after the master group scaling is complete.

    • Increase the number of master nodes from 3 to 5 in the cluster specification (in addition to the common changes described above).

      For on-prem (“bring your own infrastructure”) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 5 # change from 3 to 5
          maxNodes:     5 # change from 3 to 5
          minNodes:     5 # change from 3 to 5
          
          locations: # for on-prem/BYOI clusters only ...
            -
              baremetal:
                hosts:
                  - address: 192.168.56.21
                  - address: 192.168.56.22
                  - address: 192.168.56.23
                  - address: 192.168.56.24 # specify additional master nodes' addresses
                  - address: 192.168.56.25 # specify additional master nodes' addresses
      

      For on-prem (vSphere) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 5 # change from 3 to 5
          maxNodes:     5 # change from 3 to 5
          minNodes:     5 # change from 3 to 5
          
          locations: # for on-prem/BYOI clusters only ...
            -
              vSphere:
                ipAddresses:
                  - 192.168.56.21
                  - 192.168.56.22
                  - 192.168.56.23
                  - 192.168.56.24 # specify additional master nodes' addresses
                  - 192.168.56.25 # specify additional master nodes' addresses
      

      For on-prem (vCloud Director) clusters add new masters’ addresses as follows:

      spec:
        ...
        master:
          ...
          initialNodes: 5 # change from 3 to 5
          maxNodes:     5 # change from 3 to 5
          minNodes:     5 # change from 3 to 5
      
          locations: # for on-prem/BYOI clusters only ...
            -
              vcd:
                ipAddresses:
                  - 192.168.56.21
                  - 192.168.56.22
                  - 192.168.56.23
                  - 192.168.56.24 # specify additional master nodes' addresses
                  - 192.168.56.25 # specify additional master nodes' addresses
      
  2. Run cluster update and wait until the cluster is back to a green, healthy state.

  3. Add the fourth etcd member; run the following commands from the cluster CLI:

    Set kubectl alias for shorter commands:

    alias k='kubectl -n kube-system'
    

    List etcd pods:

    # List etcd pods
    
    $ k get pods -o name | grep etcd
    
    pod/k8s-etcd-6d1c63f3b37bacb0c334bbd9bde50e2f8d30b6e3863ae8779e992f5334820bc1-192.168.56.25
    pod/k8s-etcd-e3e7dbf9be2dff8f154852d7d2a602447049156e791dc5a7d33653436011ea99-192.168.56.24
    pod/k8s-etcd-01051b38137ab69ab204db17e586f37abdf968f01a997b568c2e00338fbdf4e8-192.168.56.23
    pod/k8s-etcd-1625baa2258e299969c1fa2c9c7765620957dc118eb300c508fadcc59f45133e-192.168.56.22
    pod/k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Pick the pod corresponding to the first member of the cluster and save its name in an environment variable. You can identify the node corresponding to the first master by running the command k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0.

    # Find the first master
    
    $ k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0
    
    NAME            STATUS   ROLES    AGE   VERSION
    192.168.56.21   Ready    master   45m   v1.19.7
    
    # Pick the pod corresponding to the first member of the cluster and save its name in an environment  variable
    
    $ ETCD0_POD=k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Check that the list of members is available in that pod, and the list includes three members:

    # Check that the list of members is available in that pod, and the list includes three members
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    7d9376dfea92b837, started, etcd0, https://etcd0.kublr.local:2380, https://192.168.56.21:2379, false
    e92c7bb13cdf265a, started, etcd1, https://etcd1.kublr.local:2380, https://192.168.56.22:2379, false
    77d313c27e421aa2, started, etcd2, https://etcd2.kublr.local:2380, https://192.168.56.23:2379, false
    

    Check the health status of the member endpoints

    # Check the health status of the member endpoints
    
    $ k exec "${ETCD0_POD}" -- etcdctl endpoint health --cluster
    
    https://192.168.56.21:2379 is healthy: successfully committed proposal: took = 21.564722ms
    https://192.168.56.22:2379 is healthy: successfully committed proposal: took = 23.559718ms
    https://192.168.56.23:2379 is healthy: successfully committed proposal: took = 23.559718ms
    

    NB!!! DO NOT continue if any of the previous checks fails.

    Add the fourth member to the etcd cluster and save its etcd member ID.

    NB! If at this point, or after any of the following commands, the output “Error from server: etcdserver: rpc not supported for learner” is printed, just rerun the command.

    # Add the fourth member to the cluster
    
    $ k exec "${ETCD0_POD}" -- etcdctl member add etcd3 --learner --peer-urls=https://etcd3.kublr.local:2380
    
    Member 89454962046c3ac8 added to cluster c7b632003d33187c
    
    ETCD_NAME="etcd3"
    ETCD_INITIAL_CLUSTER="etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380,etcd2=https://etcd2.kublr.local:2380,etcd3=https://etcd3.kublr.local:2380"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd3.kublr.local:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
    
    # Save the member ID
    
    $ ETCD3_MID="$(k exec "${ETCD0_POD}" -- etcdctl member list | grep etcd3 | cut -d, -f1)" ; echo ${ETCD3_MID}
    
    89454962046c3ac8
    
    # Check the members status
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    ec1b7e0bd0183a1, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.5:2379, false
    1dc9df69ddcd6ddd, started, etcd2, https://etcd2.kublr.local:2380, https://172.16.0.7:2379, false
    89454962046c3ac8, started, etcd3, https://etcd3.kublr.local:2380, https://172.16.0.9:2379, true
    db6964a4933a602d, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.8:2379, false
    

    NB!!! DO NOT continue if not all members are marked as “started”. If this happens, remove the member you just added from the etcd cluster using the etcdctl member remove "${ETCD3_MID}" command.

    Promote the fourth member to a full member. Promoting a member from learner status is possible only after the learner has caught up with the cluster, which may require some time. If the promotion is not successful on the first attempt, wait for some time and try again.

    # Promoting a member from learner status is possible only after the learner
    # has caught up with the cluster, which may require some time.
    # If the promotion is not successful on the first attempt, wait for some time
    # and try again.
    $ k exec "${ETCD0_POD}" -- etcdctl member promote "${ETCD3_MID}"
    
    Member 89454962046c3ac8 promoted in cluster c7b632003d33187c
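
    The same wait-and-retry loop shown in the “Scaling up from 1 to 3 masters” procedure can be used here, substituting "${ETCD3_MID}" for "${ETCD1_MID}".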
    

    Verify that all members are started and not in “learner” status (the last column in the printed table must show “false” for all members).

    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    ec1b7e0bd0183a1, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.5:2379, false
    1dc9df69ddcd6ddd, started, etcd2, https://etcd2.kublr.local:2380, https://172.16.0.7:2379, false
    89454962046c3ac8, started, etcd3, https://etcd3.kublr.local:2380, https://172.16.0.9:2379, false
    db6964a4933a602d, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.8:2379, false
    

    Now you are ready to add the 5th member.

Adding the 5th master

  1. Change cluster specification as follows:

    • Remove the --initial-cluster etcd switch:

      spec:
        ...
        master:
          kublrAgentConfig:
            kublr:
              etcd_flag:
                initial_cluster_state: '--initial-cluster-state=existing'
                # initial_cluster: '--initial-cluster=etcd0=https://etcd0.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380,etcd2=https://etcd2.kublr.local:2380,etcd3=https://etcd3.kublr.local:2380'
      
  2. Run cluster update and wait until all masters are available and green in the cluster console, and the cluster is back to a green, healthy state.

  3. Add the 5th master’s etcd member; run the following commands from the cluster CLI:

    Set kubectl alias for shorter commands:

    alias k='kubectl -n kube-system'
    

    List etcd pods:

    # List etcd pods
    
    $ k get pods -o name | grep etcd
    
    pod/k8s-etcd-6d1c63f3b37bacb0c334bbd9bde50e2f8d30b6e3863ae8779e992f5334820bc1-192.168.56.25
    pod/k8s-etcd-e3e7dbf9be2dff8f154852d7d2a602447049156e791dc5a7d33653436011ea99-192.168.56.24
    pod/k8s-etcd-01051b38137ab69ab204db17e586f37abdf968f01a997b568c2e00338fbdf4e8-192.168.56.23
    pod/k8s-etcd-1625baa2258e299969c1fa2c9c7765620957dc118eb300c508fadcc59f45133e-192.168.56.22
    pod/k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Pick the pod corresponding to the first member of the cluster and save its name in an environment variable. You can identify the node corresponding to the first master by running the command k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0.

    # Find the first master
    
    $ k get nodes -l kublr.io/node-group=master -l kublr.io/node-ordinal=0
    
    NAME            STATUS   ROLES    AGE   VERSION
    192.168.56.21   Ready    master   45m   v1.19.7
    
    # Pick the pod corresponding to the first member of the cluster and save its name in an environment  variable
    
    $ ETCD0_POD=k8s-etcd-c6c3f19ea1c3126621056d9fc1906064f547fb62d22c89fd3f8d5bb372b287de-192.168.56.21
    

    Check that the list of members is available in that pod, and the list includes four members.

    Verify also that all members are marked as “started” and are not learners; learner members will contain “true” in the last column.

    # Check that the list of members is available in that pod, and the list includes four members
    
    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    ec1b7e0bd0183a1, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.5:2379, false
    1dc9df69ddcd6ddd, started, etcd2, https://etcd2.kublr.local:2380, https://172.16.0.7:2379, false
    89454962046c3ac8, started, etcd3, https://etcd3.kublr.local:2380, https://172.16.0.9:2379, false
    db6964a4933a602d, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.8:2379, false
    

    Check the health status of the member endpoints

    # Check the health status of the member endpoints
    
    $ k exec "${ETCD0_POD}" -- etcdctl endpoint health --cluster
    
    https://172.16.0.9:2379 is healthy: successfully committed proposal: took = 39.3228ms
    https://172.16.0.5:2379 is healthy: successfully committed proposal: took = 39.0407ms
    https://172.16.0.8:2379 is healthy: successfully committed proposal: took = 48.8998ms
    https://172.16.0.7:2379 is healthy: successfully committed proposal: took = 50.9285ms
    

    NB!!! DO NOT continue if any of the previous checks fails.

    Add the fifth member to the etcd cluster.

    NB! If at this point, or after any of the following commands, the output “Error from server: etcdserver: rpc not supported for learner” is printed, just rerun the command.

    # Add the fifth member to the cluster
    
    $ k exec "${ETCD0_POD}" -- etcdctl member add etcd4 --peer-urls=https://etcd4.kublr.local:2380
    
    Member 38fdfd4b05062243 added to cluster c7b632003d33187c
    
    ETCD_NAME="etcd4"
    ETCD_INITIAL_CLUSTER="etcd0=https://etcd0.kublr.local:2380,etcd2=https://etcd2.kublr.local:2380,etcd4=https://etcd4.kublr.local:2380,etcd3=https://etcd3.kublr.local:2380,etcd1=https://etcd1.kublr.local:2380"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd4.kublr.local:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
    

    Verify that all five members are started and not in “learner” status (the last column in the printed table must show “false” for all members).

    $ k exec "${ETCD0_POD}" -- etcdctl member list
    
    ec1b7e0bd0183a1, started, etcd0, https://etcd0.kublr.local:2380, https://172.16.0.5:2379, false
    1dc9df69ddcd6ddd, started, etcd2, https://etcd2.kublr.local:2380, https://172.16.0.7:2379, false
    38fdfd4b05062243, started, etcd4, https://etcd4.kublr.local:2380, https://172.16.0.10:2379, false
    89454962046c3ac8, started, etcd3, https://etcd3.kublr.local:2380, https://172.16.0.9:2379, false
    db6964a4933a602d, started, etcd1, https://etcd1.kublr.local:2380, https://172.16.0.8:2379, false
    

    Scaling the master group is complete.
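
    As a final check, you can rerun the health commands used earlier in this article to confirm that all five members and master nodes are healthy, and then revert the updateStrategy changes made at the beginning of the procedure:

    # All five endpoints should report healthy
    $ k exec "${ETCD0_POD}" -- etcdctl endpoint health --cluster

    # All five master nodes should be Ready
    $ kubectl get nodes -l kublr.io/node-group=master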