How to lower the cost of Kubernetes on GKE by shutting down nodes in specific time windows, without using any external tool or API.
Our problem arises when we have a test environment (QA) that is only used during business hours; keeping it running 24x7 generates unnecessary cost.
The idea is to shut down the test environment outside office hours by removing its nodes from the cluster. Shortly before business hours begin, those nodes are recreated and the test environment comes back online.
For this to be possible we need to have three configuration points:
1- A separate node pool that can be removed without affecting production. To make this configuration more effective you can include Taints and Tolerations.
2- A common label on all deploys/sts that may be shut down during this window. Example: “scheduler=comercial” (see the labeling command after this list).
3- The list of namespaces that contain the deploys/sts that will be deactivated.
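If a deploy does not have the label yet, it can be added in place. A minimal sketch, using the gole-nodejs-app deploy from the listings further below; adjust the name and namespace:
> kubectl -n namespace-a label deployment gole-nodejs-app scheduler=comercial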
We need to make sure the nodes and deploys are configured correctly.
Let’s first list our node pools and validate that our environment is prepared:
> gcloud container node-pools list --cluster us-gole-01
NAME MACHINE_TYPE DISK_SIZE_GB NODE_VERSION
pool-main c2-standard-8 100 1.20.10-gke.1600
pool-qa e2-standard-8 80 1.20.10-gke.1600
If the node pool does not exist yet, it must be created with the corresponding number of nodes (which can be as few as one) and with the taint already applied:
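A minimal sketch of that creation command; the machine type, disk size and node count are assumptions taken from the listing above, adjust them to your environment:
> gcloud container node-pools create pool-qa --cluster us-gole-01 --machine-type e2-standard-8 --disk-size 80 --num-nodes 1 --node-taints='scheduler=comercial:PreferNoSchedule'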
If the node pool already exists but does not have the necessary taint, use the gcloud feature (still in beta) to add it. Note that the effect must match the toleration used by the deploys, which in this example is PreferNoSchedule:
> gcloud beta container node-pools update pool-qa --cluster us-gole-01 --node-taints='scheduler=comercial:PreferNoSchedule'
Now let’s validate that all the deploys we want contain the necessary labels.
In this example we won’t have statefulsets in our list, but the same logic applies to them.
> kubectl -n namespace-a get deploy,sts -l scheduler=comercial
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
namespace-a deployment.apps/gole-sonarcube 1/1 1 1 34d
namespace-a deployment.apps/gole-redis 1/1 1 1 34d
namespace-a deployment.apps/gole-nodejs-app 1/1 1 1 34d
> kubectl -n namespace-b get deploy,sts -l scheduler=comercial
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
namespace-b deployment.apps/gole-netbox-worker 1/1 1 1 34d
namespace-b deployment.apps/gole-netbox 1/1 1 1 7d18h
We also need to make sure the deploys have the required tolerations:
> kubectl -n namespace-a get deploy -l scheduler=comercial -o=jsonpath='{.items[*].spec.template.spec.tolerations}'
The output should contain one object like the one below for each existing deploy. They are all returned in the same list, so it is easy to spot any deploy that is missing the toleration.
{"effect":"PreferNoSchedule","key":"scheduler","operator":"Equal","value":"comercial"}
Here is an example of where tolerations should be configured:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gole-app
  labels:
    scheduler: comercial
    app: gole-app
    tier: api-extended
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      scheduler: comercial
      app: gole-app
      tier: api-extended
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        scheduler: comercial
        app: gole-app
        tier: api-extended
    spec:
      tolerations:
      - key: "scheduler"
        operator: "Equal"
        value: "comercial"
        effect: "PreferNoSchedule"
      containers:
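Keep in mind that the PreferNoSchedule taint only discourages other workloads from landing on pool-qa; it does not force these deploys onto that pool. If you also want that, a nodeSelector on GKE’s built-in node-pool label can be added to the same pod spec. A sketch, placed at the same level as tolerations:
      nodeSelector:
        cloud.google.com/gke-nodepool: pool-qa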
So now we can get started.
For the implementation of the solution we will need to create some resources:
Our secret will be the key file, renamed to key.json, generated under Service Accounts > Keys inside IAM.
Naturally, this service account must have permission to administer the Kubernetes cluster using the gcloud SDK.
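A sketch of generating that key with the gcloud CLI, assuming the service-account e-mail shown in the example file below:
> gcloud iam service-accounts keys create key.json --iam-account=infra@golesuite.com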
> cat key.json
{
"type": "service_account",
"project_id": "gole-autoscaling-sample",
"private_key_id": "asdfasdfasdfasdasdfasdasdf....",
"private_key": "-----BEGIN PRIVATE KEY-----\ndasdfasdf393993..asdfsf.aksdflksjdflksjfkjafljasdjkf....Ijr7ZCBgpbQrDH\nvNUw/JxaVbLtpvy3KSmYpjGfKnHFs+wPQi+NFmwrdOZHvKjdtRNxHvPqgWNxCSAh\nMwEB8cKs0dzif1Hbg7EtYrZOR8g7LZrTD3c4lTsahMyI9x3kN0aCe5QXDXvtPEJ1\n3s5XFBriQc1tmHwMEV4VW8s=CONTINUA....\n-----END PRIVATE KEY-----\n",
"client_email": "infra@golesuite.com",
"client_id": "11223344556677889900",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/blablabla"
}
Let’s then add this file as a secret in a specific namespace for our automation.
> kubectl -n devops create secret generic gcloud-key --from-file=./key.json
Checking the created key:
> kubectl -n devops get secret gcloud-key -o=jsonpath='{.data.key\.json}' | base64 -d
Note: Ensure that only administrators can view the secrets of the namespace used. In this case we will use the devops namespace.
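One way to enforce that is with RBAC, granting secret access in devops only to the admin group. A minimal sketch; the role name and the cluster-admins group are assumptions, adapt them to how access is granted in your cluster:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: devops-secret-reader
  namespace: devops
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: devops-secret-reader
  namespace: devops
subjects:
- kind: Group
  name: cluster-admins
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: devops-secret-reader
  apiGroup: rbac.authorization.k8s.io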
Let’s now create a YAML file that uses kind CronJob.
Our two CronJobs are configured as follows:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    role: devops
    owner: gole
  name: qa-env-shutdown
  namespace: devops
spec:
  schedule: "00 22 * * 1,2,3,4,5"
  concurrencyPolicy: Forbid
  jobTemplate:
    metadata:
      labels:
        role: devops
        owner: gole
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: gke-operator
            image: alpine
            command:
            - "/bin/sh"
            args:
            - -c
            - "apk add --no-cache curl bash ; \
               curl -O https://raw.githubusercontent.com/golesuite/gcloud-gke-scheduling/main/alpine-gcloud.sh ; \
               chmod +x ./alpine-gcloud.sh ; \
               ./alpine-gcloud.sh"
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - mountPath: /etc/localtime
              name: tz-config
            - name: gcloud-key
              mountPath: /etc/gcloud/
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            env:
            - name: SCALE_DEPLOY_NUMBER
              value: "0"
            - name: SCALE_STS_NUMBER
              value: "0"
            - name: SCALE_NODES_NUMBER
              value: "0"
            - name: GCLOUD_ZONE
              value: "southamerica-east1-a"
            - name: CLUSTER_NAME
              value: "br-gole-01"
            - name: SCHEDULER_LABEL
              value: "scheduler=comercial"
            - name: SCHEDULER_POOL
              value: "pool-qa"
            - name: PROJECT_ID
              value: "gole"
          volumes:
          - name: tz-config
            hostPath:
              path: /usr/share/zoneinfo/America/Sao_Paulo
          - name: gcloud-key
            secret:
              secretName: gcloud-key
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    role: devops
    owner: gole
  name: qa-env-startup
  namespace: devops
spec:
  schedule: "30 10 * * 1,2,3,4,5"
  concurrencyPolicy: Forbid
  jobTemplate:
    metadata:
      labels:
        role: devops
        owner: gole
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: gke-operator
            image: alpine
            command:
            - "/bin/sh"
            args:
            - -c
            - "apk add --no-cache curl bash ; \
               curl -O https://raw.githubusercontent.com/golesuite/gcloud-gke-scheduling/main/alpine-gcloud.sh ; \
               chmod +x ./alpine-gcloud.sh ; \
               ./alpine-gcloud.sh"
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - mountPath: /etc/localtime
              name: tz-config
            - name: gcloud-key
              mountPath: /etc/gcloud/
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            env:
            - name: SCALE_DEPLOY_NUMBER
              value: "1"
            - name: SCALE_STS_NUMBER
              value: "1"
            - name: SCALE_NODES_NUMBER
              value: "1"
            - name: GCLOUD_ZONE
              value: "southamerica-east1-a"
            - name: CLUSTER_NAME
              value: "br-gole-01"
            - name: SCHEDULER_LABEL
              value: "scheduler=comercial"
            - name: SCHEDULER_POOL
              value: "pool-qa"
            - name: PROJECT_ID
              value: "gole"
          volumes:
          - name: tz-config
            hostPath:
              path: /usr/share/zoneinfo/America/Sao_Paulo
          - name: gcloud-key
            secret:
              secretName: gcloud-key
You can see that our CronJobs use two external mount points:
Our tz-config is a pointer to the zoneinfo file found on any Linux OS; with it we are telling our cronjob that its time zone is America/Sao_Paulo.
Our gcloud-key uses the secret volume type, and this is where we link to the secret created in the previous step.
ENV variable | Required | Example value |
---|---|---|
SCALE_DEPLOY_NUMBER | X | 0 |
SCALE_STS_NUMBER | X | 0 |
SCALE_NODES_NUMBER | X | 0 |
PROJECT_ID | X | gole |
CLUSTER_NAME | X | br-gole-01 |
GCLOUD_ZONE | X | southamerica-east1-a |
SCHEDULER_LABEL | X | scheduler=comercial |
SCHEDULER_POOL | X | pool-qa |
Before creating and applying the cronjob file we must change the values of the variables. These are located within the env section:
env:
- name: SCALE_DEPLOY_NUMBER
  value: "1"
- name: SCALE_STS_NUMBER
  value: "1"
- name: SCALE_NODES_NUMBER
  value: "1"
- name: GCLOUD_ZONE
  value: "southamerica-east1-a"
- name: CLUSTER_NAME
  value: "br-gole-01"
- name: SCHEDULER_LABEL
  value: "scheduler=comercial"
- name: SCHEDULER_POOL
  value: "pool-qa"
- name: PROJECT_ID
  value: "gole"
Once all the steps have been followed, we are ready to apply our cronjob.
I would just like to point out where the commands that actually shut down and bring up the environment live, since everything we have done so far is just preparing the infrastructure.
The commands that perform the scaling task live in the same git project referenced by the CronJob (golesuite/gcloud-gke-scheduling):
The alpine-gcloud.sh file is called within the CronJob in the command/args section.
The content of the file is the sequence of commands that does our job.
There are two main steps:
1- Shutdown
2- Startup
Everything else we created provides the structure needed so that these commands can be executed from within our cluster.
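For illustration only, a rough sketch of what such a script boils down to. This is not the actual content of alpine-gcloud.sh (the real script also has to install the gcloud SDK and kubectl inside the alpine container), and the namespace list is an assumption taken from the examples above:
#!/bin/bash
# Authenticate with the mounted service-account key and point kubectl at the cluster
gcloud auth activate-service-account --key-file=/etc/gcloud/key.json
gcloud container clusters get-credentials "$CLUSTER_NAME" --zone "$GCLOUD_ZONE" --project "$PROJECT_ID"
# Scale every labeled deploy/sts in the QA namespaces down (shutdown) or up (startup)
for ns in namespace-a namespace-b; do
  kubectl -n "$ns" scale deploy -l "$SCHEDULER_LABEL" --replicas="$SCALE_DEPLOY_NUMBER"
  kubectl -n "$ns" scale sts -l "$SCHEDULER_LABEL" --replicas="$SCALE_STS_NUMBER"
done
# Resize the QA node pool: 0 nodes for shutdown, 1 (or more) for startup
gcloud container clusters resize "$CLUSTER_NAME" --node-pool "$SCHEDULER_POOL" --num-nodes "$SCALE_NODES_NUMBER" --zone "$GCLOUD_ZONE" --quiet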
Since we’ve already saved our cronjob file locally, and we’ve already modified the variables, times and namespaces, let’s apply them.
Note that the cronjob schedules are set 3 hours ahead; this is because Kubernetes evaluates CronJob schedules in UTC. Since native time-zone support for CronJob schedules is still in beta, the safest approach is to add the UTC offset yourself to arrive at the correct local time.
That is, if you want 8h and you are in a GMT-3 time zone, then 8 + 3 = 11.
For startup (07:30 local time):
schedule: "30 10 * * 1,2,3,4,5"
For shutdown (19:00 local time):
schedule: "00 22 * * 1,2,3,4,5"
> kubectl apply -f cronjob.yaml
> kubectl -n devops get cronjob
When it runs we will have a job running:
> kubectl -n devops get job
The job starts a pod, so we can follow its logs.
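For example, assuming the job name printed by the previous command:
> kubectl -n devops logs -f job/<job-name>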
Adjust the schedules so that they fire while you are at work, in order to validate that the process runs correctly.
There are many ways to create and administer a Kubernetes cluster, so don’t limit yourself: try different solutions and different ways of applying them.
Make the most of what the APIs have to offer.
Access our page Contact and chat with us. We will be happy to serve you.
Success!