5. Scaling
In this lab, we are going to show you how to scale applications on OpenShift. Furthermore, we show you how OpenShift makes sure that the requested number of Pods is up and running and how an application can tell the platform that it is ready to receive requests.
Note
This lab does not depend on previous labs. You can start with an empty Namespace.
Task 5.1: Scale the example application
Create a new Deployment in your Namespace. So again, let's define the Deployment using YAML in a file deployment_example-web-app.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: example-web-app
  name: example-web-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-web-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: example-web-app
    spec:
      containers:
      - image: quay.io/acend/example-web-python:latest
        name: example-web-app
        resources:
          limits:
            cpu: 100m
            memory: 128Mi
          requests:
            cpu: 50m
            memory: 128Mi
and then apply with:
oc apply -f deployment_example-web-app.yaml --namespace <namespace>
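To quickly verify that the Deployment was created and its single replica became ready, you can optionally wait for the rollout to finish (the commands assume the Deployment name from the file above):

# Wait until the rollout has completed
oc rollout status deployment/example-web-app --namespace <namespace>

# Show the Deployment and its ready replicas
oc get deployment example-web-app --namespace <namespace>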
If we want to scale our example application, we have to tell the Deployment that we want to have three running replicas instead of one. Let’s have a closer look at the existing ReplicaSet:
oc get replicasets --namespace <namespace>
Which will give you an output similar to this:
NAME                         DESIRED   CURRENT   READY   AGE
example-web-app-86d9d584f8   1         1         1       110s
Or for even more details:
oc get replicaset <replicaset> -o yaml --namespace <namespace>
The ReplicaSet shows how many instances of a Pod are desired, current and ready.
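If you are only interested in those three numbers rather than the full YAML, a jsonpath query is one option (a sketch; replace <replicaset> with the name from the output above):

# Print the desired, current and ready replica counts of the ReplicaSet
oc get replicaset <replicaset> --namespace <namespace> \
  -o jsonpath='desired={.spec.replicas} current={.status.replicas} ready={.status.readyReplicas}{"\n"}'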
Now we scale our application to three replicas:
oc scale deployment example-web-app --replicas=3 --namespace <namespace>
Check the number of desired, current and ready replicas:
oc get replicasets --namespace <namespace>
NAME                         DESIRED   CURRENT   READY   AGE
example-web-app-86d9d584f8   3         3         3       4m33s
Look at how many Pods there are:
oc get pods --namespace <namespace>
Which gives you an output similar to this:
NAME                               READY   STATUS    RESTARTS   AGE
example-web-app-86d9d584f8-7vjcj   1/1     Running   0          5m2s
example-web-app-86d9d584f8-hbvlv   1/1     Running   0          31s
example-web-app-86d9d584f8-qg499   1/1     Running   0          31s
Note
OpenShift supports horizontal and vertical autoscaling.
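For illustration only (not part of this lab), horizontal autoscaling could be set up imperatively; a minimal sketch, assuming the example-web-app Deployment from above (delete the autoscaler again afterwards, as it would fight the manual scaling below):

# Create a HorizontalPodAutoscaler targeting 80% CPU utilization
oc autoscale deployment example-web-app --min=1 --max=5 --cpu-percent=80 --namespace <namespace>

# Remove it again so that manual scaling keeps working as described in this lab
oc delete hpa example-web-app --namespace <namespace>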
As we changed the number of replicas with the oc scale deployment command, the example-web-app Deployment now differs from your local deployment_example-web-app.yaml file. Change your local deployment_example-web-app.yaml file to match the current state and update the value of replicas to 3:
[...]
metadata:
  labels:
    app: example-web-app
  name: example-web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-web-app
[...]
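To confirm that your local file and the live Deployment are back in sync, you can compare and re-apply the file (a minimal sketch using the same file name as above):

# Show any remaining differences between the local file and the live object
oc diff -f deployment_example-web-app.yaml --namespace <namespace>

# Re-applying the unchanged file should not modify anything
oc apply -f deployment_example-web-app.yaml --namespace <namespace>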
Check for uninterruptible Deployments
Now we expose our application to the internet by creating a service and a route.
First the service:
oc expose deployment example-web-app --name="example-web-app" --port=5000 --namespace <namespace>
Then the route:
oc create route edge example-web-app --port 5000 --service example-web-app --namespace <namespace>
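You can verify that the Route was created and note its hostname for later:

oc get route example-web-app --namespace <namespace>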
Let’s look at our Service. We should see all three corresponding Endpoints:
oc describe service example-web-app --namespace <namespace>
Name:              example-web-app
Namespace:         acend-test
Labels:            app=example-web-app
Annotations:       <none>
Selector:          app=example-web-app
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                172.30.89.44
IPs:               172.30.89.44
Port:              <unset>  5000/TCP
TargetPort:        5000/TCP
Endpoints:         10.125.4.70:5000,10.126.4.137:5000,10.126.4.138:5000
Session Affinity:  None
Events:            <none>
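The same Endpoints can also be listed directly, which is handy while scaling up and down:

oc get endpoints example-web-app --namespace <namespace>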
Scaling of Pods is fast as OpenShift simply creates new containers.
You can check the availability of your Service in your browser while you scale the number of replicas up and down: https://<route hostname>
Note
You can find out the route's hostname by looking at the output of oc get route.
Now, execute the corresponding loop command for your operating system in another console.
Linux:
URL=$(oc get routes example-web-app -o go-template="{{ .spec.host }}" --namespace <namespace>)
while true; do sleep 1; curl -s https://${URL}/pod/; date "+ TIME: %H:%M:%S,%3N"; done
Windows PowerShell:
while(1) {
  Start-Sleep -s 1
  Invoke-RestMethod https://<URL>/pod/
  Get-Date -Uformat "+ TIME: %H:%M:%S,%3N"
}
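The scale-down itself is done with the same oc scale command as before, for example:

oc scale deployment example-web-app --replicas=1 --namespace <namespace>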
Scale from 3 replicas down to 1. The output shows which Pod is still alive and responding to requests:
example-web-app-86d9d584f8-7vjcj TIME: 17:33:07,289
example-web-app-86d9d584f8-7vjcj TIME: 17:33:08,357
example-web-app-86d9d584f8-hbvlv TIME: 17:33:09,423
example-web-app-86d9d584f8-7vjcj TIME: 17:33:10,494
example-web-app-86d9d584f8-qg499 TIME: 17:33:11,559
example-web-app-86d9d584f8-hbvlv TIME: 17:33:12,629
example-web-app-86d9d584f8-qg499 TIME: 17:33:13,695
example-web-app-86d9d584f8-hbvlv TIME: 17:33:14,771
example-web-app-86d9d584f8-hbvlv TIME: 17:33:15,840
example-web-app-86d9d584f8-7vjcj TIME: 17:33:16,912
example-web-app-86d9d584f8-7vjcj TIME: 17:33:17,980
example-web-app-86d9d584f8-7vjcj TIME: 17:33:19,051
example-web-app-86d9d584f8-7vjcj TIME: 17:33:20,119
example-web-app-86d9d584f8-7vjcj TIME: 17:33:21,182
example-web-app-86d9d584f8-7vjcj TIME: 17:33:22,248
example-web-app-86d9d584f8-7vjcj TIME: 17:33:23,313
example-web-app-86d9d584f8-7vjcj TIME: 17:33:24,377
example-web-app-86d9d584f8-7vjcj TIME: 17:33:25,445
example-web-app-86d9d584f8-7vjcj TIME: 17:33:26,513
The requests get distributed amongst the three Pods. As soon as you scale down to one Pod, there should be only one remaining Pod that responds.
Let’s make another test: What happens if you start a new Deployment while our request generator is still running?
oc rollout restart deployment example-web-app --namespace <namespace>
During a short period we won’t get a response:
example-web-app-86d9d584f8-7vjcj TIME: 17:37:24,121
example-web-app-86d9d584f8-7vjcj TIME: 17:37:25,189
example-web-app-86d9d584f8-7vjcj TIME: 17:37:26,262
example-web-app-86d9d584f8-7vjcj TIME: 17:37:27,328
example-web-app-86d9d584f8-7vjcj TIME: 17:37:28,395
example-web-app-86d9d584f8-7vjcj TIME: 17:37:29,459
example-web-app-86d9d584f8-7vjcj TIME: 17:37:30,531
example-web-app-86d9d584f8-7vjcj TIME: 17:37:31,596
example-web-app-86d9d584f8-7vjcj TIME: 17:37:32,662
# no answer
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:33,729
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:34,794
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:35,862
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:36,929
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:37,995
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:39,060
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:40,118
example-web-app-f4c5dd8fc-4nx2t TIME: 17:37:41,187
In our example, we use a very lightweight Pod. If we had used a more heavyweight Pod that needed a longer time to respond to requests, we would of course see a larger gap. An example for this would be a Java application with a startup time of 30 seconds:
example-spring-boot-2-73aln TIME: 16:48:25,251
example-spring-boot-2-73aln TIME: 16:48:26,305
example-spring-boot-2-73aln TIME: 16:48:27,400
example-spring-boot-2-73aln TIME: 16:48:28,463
example-spring-boot-2-73aln TIME: 16:48:29,507
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
TIME: 16:48:33,562
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
TIME: 16:48:34,601
...
example-spring-boot-3-tjdkj TIME: 16:49:20,114
example-spring-boot-3-tjdkj TIME: 16:49:21,181
example-spring-boot-3-tjdkj TIME: 16:49:22,231
It is even possible that the Service goes down entirely and the routing layer responds with status code 503, as can be seen in the example output above.
In the following chapter we are going to look at how a Service can be configured to be highly available.
Uninterruptible Deployments
The rolling update strategy makes it possible to deploy Pods without interruption: the new version of the application is deployed and started, and as soon as it reports that it is ready, OpenShift forwards requests to the new Pod instead of the old one and terminates the old Pod.
Additionally, container health checks help OpenShift to precisely determine what state the application is in.
Basically, there are two different kinds of checks that can be implemented:
- Liveness probes are used to find out if an application is still running
- Readiness probes tell us if the application is ready to receive requests (which is especially relevant for the above-mentioned rolling updates)
These probes can be implemented as HTTP checks, container execution checks (the execution of a command or script inside a container) or TCP socket checks.
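For illustration (these snippets are not applied in this lab), the same kinds of probes could also be expressed as a container execution check or a TCP socket check in a container spec:

containers:
- name: example                        # hypothetical container, for illustration only
  image: quay.io/acend/example-web-python:latest
  livenessProbe:
    exec:                              # container execution check
      command: ["cat", "/tmp/healthy"] # hypothetical file created by the application
    initialDelaySeconds: 10
  readinessProbe:
    tcpSocket:                         # TCP socket check
      port: 5000
    initialDelaySeconds: 10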
In our example, we want the application to tell OpenShift that it is ready for requests with an appropriate readiness probe.
Our example application provides a health check endpoint at /health: https://${URL}/health
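You can call this endpoint manually before wiring it into the Deployment (assuming the $URL variable from the loop above is still set; the exact response body depends on the application):

curl -s https://${URL}/health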
Task 5.2: Availability during deployment
Define the readiness probe on the Deployment using the following command:
oc set probe deploy/example-web-app --readiness --get-url=http://:5000/health --initial-delay-seconds=10 --timeout-seconds=1 --namespace <namespace>
The command above results in the following readinessProbe snippet being inserted into the Deployment:
...
containers:
- image: quay.io/acend/example-web-python:latest
  imagePullPolicy: Always
  name: example-web-app
  readinessProbe:
    httpGet:
      path: /health
      port: 5000
      scheme: HTTP
    initialDelaySeconds: 10
    timeoutSeconds: 1
...
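A liveness probe could be added in the same way if desired; this is optional and not needed for the rest of this lab:

oc set probe deploy/example-web-app --liveness --get-url=http://:5000/health --initial-delay-seconds=10 --timeout-seconds=1 --namespace <namespace>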
We are now going to verify that a redeployment of the application does not lead to an interruption.
Set up the loop again to periodically check the application's response (you don't have to set the $URL variable again if it is still defined):
Linux:
URL=$(oc get routes example-web-app -o go-template="{{ .spec.host }}" --namespace <namespace>)
while true; do sleep 1; curl -s https://${URL}/pod/; date "+ TIME: %H:%M:%S,%3N"; done
Windows PowerShell:
while(1) {
  Start-Sleep -s 1
  Invoke-RestMethod https://<URL>/pod/
  Get-Date -Uformat "+ TIME: %H:%M:%S,%3N"
}
Restart your Deployment with:
oc rollout restart deployment example-web-app --namespace <namespace>
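In an additional terminal you can follow the rollout while the request loop keeps running; this time there should be no gap in the responses:

oc rollout status deployment/example-web-app --namespace <namespace>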
Self-healing
Via the Deployment definition we told OpenShift how many replicas we want. So what happens if we simply delete a Pod?
Look for a running Pod (status Running) that you can bear to kill via oc get pods.
Show all Pods and watch for changes:
oc get pods -w --namespace <namespace>
Now delete a Pod (in another terminal) with the following command:
oc delete pod <pod> --namespace <namespace>
Observe how OpenShift instantly creates a new Pod in order to fulfill the desired number of running instances.
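To see how the ReplicaSet reacted, you can also look at the recent events in your Namespace (the exact event messages depend on your cluster):

oc get events --sort-by=.lastTimestamp --namespace <namespace>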