5. Scaling
In this lab, we are going to show you how to scale applications on OpenShift. Furthermore, we show you how OpenShift makes sure that the requested number of Pods is up and running and how an application can tell the platform that it is ready to receive requests.
Note
This lab does not depend on previous labs. You can start with an empty Namespace.
Task 5.1: Scale the example application
Create a new Deployment in your Namespace. So again, let's define the Deployment using YAML in a file deployment_example-web-app.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: example-web-app
  name: example-web-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-web-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: example-web-app
    spec:
      containers:
      - image: quay.io/acend/example-web-python:latest
        name: example-web-app
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 128Mi
and then apply with:
oc apply -f deployment_example-web-app.yaml --namespace <namespace>
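To quickly verify that the Deployment was created and its single replica became ready, you can optionally wait for the rollout to finish (the commands assume the Deployment name from the file above):

# Wait until the rollout has completed
oc rollout status deployment/example-web-app --namespace <namespace>

# Show the Deployment and its ready replicas
oc get deployment example-web-app --namespace <namespace>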
If we want to scale our example application, we have to tell the Deployment that we want to have three running replicas instead of one. Let’s have a closer look at the existing ReplicaSet:
oc get replicasets --namespace <namespace>
Which will give you an output similar to this:
NAME                         DESIRED   CURRENT   READY   AGE
example-web-app-86d9d584f8   1         1         1       110s
Or for even more details:
oc get replicaset <replicaset> -o yaml --namespace <namespace>
The ReplicaSet shows how many instances of a Pod are desired, current and ready.
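If you are only interested in those three numbers rather than the full YAML, a jsonpath query is one option (a sketch; replace <replicaset> with the name from the output above):

# Print the desired, current and ready replica counts of the ReplicaSet
oc get replicaset <replicaset> --namespace <namespace> \
  -o jsonpath='desired={.spec.replicas} current={.status.replicas} ready={.status.readyReplicas}{"\n"}'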
Now we scale our application to three replicas:
oc scale deployment example-web-app --replicas=3 --namespace <namespace>
Check the number of desired, current and ready replicas:
oc get replicasets --namespace <namespace>
NAME                         DESIRED   CURRENT   READY   AGE
example-web-app-86d9d584f8   3         3         3       4m33s
Look at how many Pods there are:
oc get pods --namespace <namespace>
Which gives you an output similar to this:
NAME                               READY   STATUS    RESTARTS   AGE
example-web-app-86d9d584f8-7vjcj   1/1     Running   0          5m2s
example-web-app-86d9d584f8-hbvlv   1/1     Running   0          31s
example-web-app-86d9d584f8-qg499   1/1     Running   0          31s
Note
OpenShift supports horizontal and vertical autoscaling.
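For illustration only (not part of this lab), horizontal autoscaling could be set up imperatively; a minimal sketch, assuming the example-web-app Deployment from above (delete the autoscaler again afterwards, as it would fight the manual scaling below):

# Create a HorizontalPodAutoscaler targeting 80% CPU utilization
oc autoscale deployment example-web-app --min=1 --max=5 --cpu-percent=80 --namespace <namespace>

# Remove it again so that manual scaling keeps working as described in this lab
oc delete hpa example-web-app --namespace <namespace>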
As we changed the number of replicas with the oc scale deployment command, the example-web-app Deployment now differs from your local deployment_example-web-app.yaml file. Change your local deployment_example-web-app.yaml file to match the current state and update the value of replicas to 3:
[...]
metadata:
  labels:
    app: example-web-app
  name: example-web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-web-app
[...]
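To confirm that your local file and the live Deployment are back in sync, you can compare and re-apply the file (a minimal sketch using the same file name as above):

# Show any remaining differences between the local file and the live object
oc diff -f deployment_example-web-app.yaml --namespace <namespace>

# Re-applying the unchanged file should not modify anything
oc apply -f deployment_example-web-app.yaml --namespace <namespace>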
Check for uninterruptible Deployments
Now we expose our application to the internet by creating a service and a route.
First the service:
oc expose deployment example-web-app --name="example-web-app" --port=5000 --namespace <namespace>
Then the route:
oc create route edge example-web-app --port 5000 --service example-web-app --namespace <namespace>
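You can verify that the Route was created and note its hostname for later:

oc get route example-web-app --namespace <namespace>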
Let’s look at our Service. We should see all three corresponding Endpoints:
oc describe service example-web-app --namespace <namespace>
Name:              example-web-app
Namespace:         acend-test
Labels:            app=example-web-app
Annotations:       <none>
Selector:          app=example-web-app
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                172.30.89.44
IPs:               172.30.89.44
Port:              <unset>  5000/TCP
TargetPort:        5000/TCP
Endpoints:         10.125.4.70:5000,10.126.4.137:5000,10.126.4.138:5000
Session Affinity:  None
Events:            <none>
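The same Endpoints can also be listed directly, which is handy while scaling up and down:

oc get endpoints example-web-app --namespace <namespace>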
Scaling of Pods is fast as OpenShift simply creates new containers.
You can check the availability of your Service in your browser while you scale the number of replicas up and down: https://<route hostname>
Note
You can find out the route's hostname by looking at the output of oc get route.
Now, execute the corresponding loop command for your operating system in another console.
Linux:
URL=$(oc get routes example-web-app -o go-template="{{ .spec.host }}" --namespace <namespace>)
while true; do sleep 1; curl -s https://${URL}/pod/; date "+ TIME: %H:%M:%S,%3N"; done
Windows PowerShell:
while(1) {
  Start-Sleep -s 1
  Invoke-RestMethod https://<URL>/pod/
  Get-Date -Uformat "+ TIME: %H:%M:%S,%3N"
}
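The scale-down itself is done with the same oc scale command as before, for example:

oc scale deployment example-web-app --replicas=1 --namespace <namespace>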
Scale from 3 replicas down to 1. The output shows which Pod is still alive and responding to requests:
example-web-app-86d9d584f8-7vjcj TIME: 17:33:07,289
example-web-app-86d9d584f8-7vjcj TIME: 17:33:08,357
example-web-app-86d9d584f8-hbvlv TIME: 17:33:09,423
example-web-app-86d9d584f8-7vjcj TIME: 17:33:10,494
example-web-app-86d9d584f8-qg499 TIME: 17:33:11,559
example-web-app-86d9d584f8-hbvlv TIME: 17:33:12,629
example-web-app-86d9d584f8-qg499 TIME: 17:33:13,695
example-web-app-86d9d584f8-hbvlv TIME: 17:33:14,771
example-web-app-86d9d584f8-hbvlv TIME: 17:33:15,840
example-web-app-86d9d584f8-7vjcj TIME: 17:33:16,912
example-web-app-86d9d584f8-7vjcj TIME: 17:33:17,980
example-web-app-86d9d584f8-7vjcj TIME: 17:33:19,051
example-web-app-86d9d584f8-7vjcj TIME: 17:33:20,119
example-web-app-86d9d584f8-7vjcj TIME: 17:33:21,182
example-web-app-86d9d584f8-7vjcj TIME: 17:33:22,248
example-web-app-86d9d584f8-7vjcj TIME: 17:33:23,313
example-web-app-86d9d584f8-7vjcj TIME: 17:33:24,377
example-web-app-86d9d584f8-7vjcj TIME: 17:33:25,445
example-web-app-86d9d584f8-7vjcj TIME: 17:33:26,513
The requests get distributed amongst the three Pods. As soon as you scale down to one Pod, there should be only one remaining Pod that responds.
Let’s make another test: What happens if you start a new Deployment while our request generator is still running?
oc rollout restart deployment example-web-app --namespace <namespace>
During a short period we won’t get a response:
example-web-app-86d9d584f8-7vjcj TIME: 17:37:24,121
example-web-app-86d9d584f8-7vjcj TIME: 17:37:25,189
example-web-app-86d9d584f8-7vjcj TIME: 17:37:26,262
example-web-app-86d9d584f8-7vjcj TIME: 17:37:27,328
example-web-app-86d9d584f8-7vjcj TIME: 17:37:28,395
example-web-app-86d9d584f8-7vjcj TIME: 17:37:29,459
example-web-app-86d9d584f8-7vjcj TIME: 17:37:30,531
example-web-app-86d9d584f8-7vjcj TIME: 17:37:31,596
example-web-app-86d9d584f8-7vjcj TIME: 17:37:32,662
# no answer
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:33,729
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:34,794
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:35,862
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:36,929
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:37,995
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:39,060
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:40,118
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:41,187
In our example, we use a very lightweight Pod. If we had used a more heavyweight Pod that needed a longer time to respond to requests, we would of course see a larger gap. An example for this would be a Java application with a startup time of 30 seconds:
example-spring-boot-2-73aln TIME: 16:48:25,251
example-spring-boot-2-73aln TIME: 16:48:26,305
example-spring-boot-2-73aln TIME: 16:48:27,400
example-spring-boot-2-73aln TIME: 16:48:28,463
example-spring-boot-2-73aln TIME: 16:48:29,507
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
TIME: 16:48:33,562
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
TIME: 16:48:34,601
...
example-spring-boot-3-tjdkj TIME: 16:49:20,114
example-spring-boot-3-tjdkj TIME: 16:49:21,181
example-spring-boot-3-tjdkj TIME: 16:49:22,231
It is even possible that the Service goes down entirely and the routing layer responds with status code 503, as can be seen in the example output above.
In the following chapter we are going to look at how a Service can be configured to be highly available.
Uninterruptible Deployments
The rolling update strategy makes it possible to deploy Pods without interruption: the new version of the application is deployed and started, and as soon as it reports that it is ready, OpenShift forwards requests to the new Pod instead of the old one and terminates the old Pod.
Additionally, container health checks help OpenShift to precisely determine what state the application is in.
Basically, there are two different kinds of checks that can be implemented:
- Liveness probes are used to find out if an application is still running
- Readiness probes tell us if the application is ready to receive requests (which is especially relevant for the above-mentioned rolling updates)
These probes can be implemented as HTTP checks, container execution checks (the execution of a command or script inside a container) or TCP socket checks.
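For illustration (these snippets are not applied in this lab), the same kinds of probes could also be expressed as a container execution check or a TCP socket check in a container spec:

containers:
- name: example                        # hypothetical container, for illustration only
  image: quay.io/acend/example-web-python:latest
  livenessProbe:
    exec:                              # container execution check
      command: ["cat", "/tmp/healthy"] # hypothetical file created by the application
    initialDelaySeconds: 10
  readinessProbe:
    tcpSocket:                         # TCP socket check
      port: 5000
    initialDelaySeconds: 10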
In our example, we want the application to tell OpenShift that it is ready for requests with an appropriate readiness probe.
Our example application provides a health check endpoint at /health: https://${URL}/health
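You can call this endpoint manually before wiring it into the Deployment (assuming the $URL variable from the loop above is still set; the exact response body depends on the application):

curl -s https://${URL}/health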
Task 5.2: Availability during deployment
Define the readiness probe on the Deployment using the following command:
oc set probe deploy/example-web-app --readiness --get-url=http://:5000/health --initial-delay-seconds=10 --timeout-seconds=1 --namespace <namespace>
The command above results in the following readinessProbe snippet being inserted into the Deployment:
...
containers:
- image: quay.io/acend/example-web-python:latest
  imagePullPolicy: Always
  name: example-web-app
  readinessProbe:
    httpGet:
      path: /health
      port: 5000
      scheme: HTTP
    initialDelaySeconds: 10
    timeoutSeconds: 1
...
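A liveness probe could be added in the same way if desired; this is optional and not needed for the rest of this lab:

oc set probe deploy/example-web-app --liveness --get-url=http://:5000/health --initial-delay-seconds=10 --timeout-seconds=1 --namespace <namespace>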
We are now going to verify that a redeployment of the application does not lead to an interruption.
Set up the loop again to periodically check the application's response (you don't have to set the $URL variable again if it is still defined):
Linux:
URL=$(oc get routes example-web-app -o go-template="{{ .spec.host }}" --namespace <namespace>)
while true; do sleep 1; curl -s https://${URL}/pod/; date "+ TIME: %H:%M:%S,%3N"; done
Windows PowerShell:
while(1) {
  Start-Sleep -s 1
  Invoke-RestMethod https://<URL>/pod/
  Get-Date -Uformat "+ TIME: %H:%M:%S,%3N"
}
Restart your Deployment with:
oc rollout restart deployment example-web-app --namespace <namespace>
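In an additional terminal you can follow the rollout while the request loop keeps running; this time there should be no gap in the responses:

oc rollout status deployment/example-web-app --namespace <namespace>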
Self-healing
Via the Deployment definition we told OpenShift how many replicas we want. So what happens if we simply delete a Pod?
Look for a running Pod (status Running) that you can bear to kill via oc get pods.
Show all Pods and watch for changes:
oc get pods -w --namespace <namespace>
Now delete a Pod (in another terminal) with the following command:
oc delete pod <pod> --namespace <namespace>
Observe how OpenShift instantly creates a new Pod in order to fulfill the desired number of running instances.
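To see how the ReplicaSet reacted, you can also look at the recent events in your Namespace (the exact event messages depend on your cluster):

oc get events --sort-by=.lastTimestamp --namespace <namespace>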