Kublr Custom Metric Server

Overview

The Kubernetes Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a replication controller, deployment, or replica set based on custom metrics. The Kublr monitoring system provides the metrics services used by HPA.

Horizontal Pod Autoscaler

Scale Up

The autoscaling algorithm is described in the Kubernetes design proposal: https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/horizontal-pod-autoscaler.md#autoscaling-algorithm
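In short, the autoscaler derives the desired replica count from the ratio of the current metric value to the target value:

desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)

For example, in the walkthrough below a single replica running at 305% CPU against a 50% target gives ceil(1 * 305 / 50) = 7 replicas.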

Scale Down

After the new replica count is calculated, the surplus pods are marked for deletion.

Pod termination process (https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods):

  1. The user sends a command to delete the Pod, with the default grace period (30s).
  2. The Pod in the API server is updated with the time beyond which the Pod is considered “dead” along with the grace period.
  3. The Pod shows up as “Terminating” when listed in client commands.
  4. (simultaneous with 3) When the Kubelet sees that a Pod has been marked as terminating because the time in 2 has been set, it begins the pod shutdown process.
  • If the pod has defined a preStop hook, it is invoked inside of the pod. If the preStop hook is still running after the grace period expires, step 2 is then invoked with a small (2 second) extended grace period.
  • The processes in the Pod are sent the TERM signal.
  5. (simultaneous with 3) The Pod is removed from the endpoints list for the service and is no longer considered part of the set of running pods for replication controllers. Pods that shut down slowly can continue to serve traffic while load balancers (like the service proxy) remove them from their rotations.
  6. When the grace period expires, any processes still running in the Pod are killed with SIGKILL.
  7. The Kubelet finishes deleting the Pod on the API server by setting grace period 0 (immediate deletion). The Pod disappears from the API and is no longer visible from the client.
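To make scale-down graceful, a workload can declare a preStop hook and an explicit termination grace period in its pod template. Below is a minimal sketch; the deployment name, image, and sleep duration are illustrative and not specific to Kublr:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app                    # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      terminationGracePeriodSeconds: 30   # the default grace period from step 1
      containers:
      - name: app
        image: nginx                      # illustrative image
        lifecycle:
          preStop:
            exec:
              # allow load balancers time to drop the pod from rotation
              command: ["sh", "-c", "sleep 10"]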

Kubernetes parameters for HPA features

These parameters are defined in the Kublr custom specification during the cluster creation procedure, or in the /etc/kublr/daemon.yaml file on the master node and in the CloudFormation template.

  • horizontal-pod-autoscaler-sync-period: 30s - how often the controller manager queries the resource usage metrics specified in each HorizontalPodAutoscaler definition.
  • horizontal-pod-autoscaler-downscale-delay: 5m30s - a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed.
  • horizontal-pod-autoscaler-upscale-delay: 30s - a duration that specifies how long the autoscaler has to wait before another upscale operation can be performed after the current one has completed.

Custom Metrics Server for HPA

The Kublr monitoring system implements a custom metrics server for HPA.

  • The Custom Metrics Server supplies metrics from the Prometheus server (e.g. PHP-FPM pool monitoring metrics); a sketch of an HPA consuming such a metric follows.
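A minimal sketch of an HPA object that scales on such a Prometheus-sourced pod metric is shown below. The deployment name and target value are illustrative; the metric name phpfpm_active_processes is the one used in the custom metrics API examples later on this page, and depending on the Kubernetes version the API group may be autoscaling/v2beta1 or autoscaling/v2beta2 instead of autoscaling/v2:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: phpfpm-hpa                      # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-fpm                       # illustrative deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: phpfpm_active_processes   # served by the custom metrics server
      target:
        type: AverageValue
        averageValue: "10"              # illustrative target per pod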

How do I create a Cluster with Custom Metrics server?

Go to the Cluster’s page and click the Add Cluster button.

  1. Select your Provider.
  2. Fill in the required fields (e.g. Credentials).
  3. Select an Instance Type for the Nodes that is larger than the default, for example t2.2xlarge for AWS or Standard_A8_v2 for Azure.
  4. Click the “Customize cluster specification” button.
  5. Change Kubernetes parameters for HPA features if needed:

Parameters for HPA features

kind: Cluster
metadata:
  ownerReferences: []
  name: cluster-with-hpa
  space: default
spec:
  master:
    kublrAgentConfig:
      kublr:
        kube_controller_manager_flag:
          horizontal_pod_autoscaler_downscale_delay: '--horizontal-pod-autoscaler-downscale-delay=5m30s'
          horizontal_pod_autoscaler_sync_period: '--horizontal-pod-autoscaler-sync-period=30s'
          horizontal_pod_autoscaler_upscale_delay: '--horizontal-pod-autoscaler-upscale-delay=30s'
...
  6. Enable the Custom Metrics server in the features > monitoring section:

Add custom metrics

kind: Cluster
metadata:
  ownerReferences: []
  name: cluster-with-hpa
  space: default
spec:
...
  features:
    monitoring:
      enabled: true
      values:
        customMetricsServer:
          enabled: true
...
  7. Then click Create cluster.
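Once the cluster is up, an optional sanity check (assuming kubectl access to the new cluster) is to confirm that the custom metrics API is registered:

$ kubectl get apiservice v1beta1.custom.metrics.k8s.io

The API service should report Available=True once the monitoring feature has been deployed.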

Play with HPA

Metrics server API

To demonstrate a HorizontalPodAutoscaler, you will first start a Deployment that runs a container using the hpa-example image and expose it as a Service. To do so, run the following command:

$ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
  • Warning: Compatibility Issue with hpa-example Image on ARM Instances. The hpa-example image provided by Docker Hub is not compatible with ARM instances. As a result, launching php-apache for demonstration on ARM instances using the hpa-example image will not be successful. Please ensure that you are using a compatible image for ARM instances if you intend to demonstrate the Horizontal Pod Autoscaler functionality.
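For reference, the manifest behind that URL is, approximately, a Deployment running the hpa-example image with an explicit CPU request, plus a matching Service; a condensed sketch (the image registry path and resource values may differ between Kubernetes releases):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m        # the 50% CPU target below is relative to this request
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache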

Now that the server is running, we will create the autoscaler using kubectl autoscale.

$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
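The same autoscaler can also be expressed declaratively; a rough manifest equivalent of the command above (using the autoscaling/v1 API):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50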

We can check the current status of the autoscaler by running:

$ kubectl get hpa
NAME         REFERENCE                     TARGET    MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%  1         10        1          18s

Increase load

Now, we will see how the autoscaler reacts to increased load. We will start a container, and send an infinite loop of queries to the php-apache service (please run it in a different terminal):

$ kubectl run -i --tty load-generator --image=busybox /bin/sh

Hit enter for the command prompt, then run:

$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

Within a minute or so, we should see the higher CPU load by executing:

$ kubectl get hpa
NAME         REFERENCE                     TARGET       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   305% / 50%   1         10        1          3m

Here, CPU consumption has increased to 305% of the request. As a result, the deployment was resized to 7 replicas:

$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   7         7         7            7           19m

Stop load

We will finish our example by stopping the user load.

In the terminal where we created the container with the busybox image, terminate the load generation by typing Ctrl + C.

Then we will verify the result state (after a minute or so):

$ kubectl get hpa
NAME         REFERENCE                     TARGET       MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%     1         10        1          11m

$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   1         1         1            1           27m

Here CPU utilization dropped to 0, and so HPA autoscaled the number of replicas back down to 1.

Custom metrics server API

In some cases, manual checks of the custom metrics API are required. To do that, use the following rules and examples:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/phpfpm_active_processes/"

or

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/rabbitmq_queue_messages/" | jq .

To retrieve the given metric for the given non-namespaced object (e.g. Node, PersistentVolume):

/{object-type}/{object-name}/{metric-name...}

To retrieve the given metric for all non-namespaced objects of the given type:

/{object-type}/*/{metric-name...}

To retrieve the given metric for all non-namespaced objects of the given type matching the given label selector:

/{object-type}/*/{metric-name...}?labelSelector=foo

To retrieve the given metric for the given namespaced object:

/namespaces/{namespace-name}/{object-type}/{object-name}/{metric-name...}

To retrieve the given metric for all namespaced objects of the given type:

/namespaces/{namespace-name}/{object-type}/*/{metric-name...}

To retrieve the given metric for all namespaced objects of the given type matching the given label selector:

/namespaces/{namespace-name}/{object-type}/*/{metric-name...}?labelSelector=foo

To retrieve the given metric which describes the given namespace:

/namespaces/{namespace-name}/metrics/{metric-name}
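Combining one of these patterns with kubectl get --raw, for example to query phpfpm_active_processes for a single pod (the pod name php-fpm-0 is illustrative):

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/php-fpm-0/phpfpm_active_processes" | jq .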

References:

  • https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
  • https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
  • https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/metrics.md
  • https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
  • https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/