Scaling Dedicated Game Servers With Kubernetes: Part 4 - Scaling Down


This is part four of a four-part series on scaling game servers with Kubernetes.



In the previous three posts, we hosted our game servers on Kubernetes, measured and limited their resource usage, and scaled up the nodes in our cluster based on that usage. We now need to address the more difficult problem: scaling down nodes in our cluster when resources are no longer being used, while ensuring that in-progress games are not interrupted if a node gets deleted.



Scaling down nodes within a cluster may seem complicated on the surface. Each game server stores the current state of a game in memory, and multiple game clients can be connected to the specific game server running that game. Removing arbitrary nodes could therefore disconnect active players - and that tends to make them angry. So we can only remove a node from the cluster when it is empty of dedicated game servers.



This means that managed autoscaling cannot be used on Google Kubernetes Engine or similar. To quote the documentation for the GKE autoscaler "Cluster autoscaler assumes that all replicated Pods can be restarted on some other node..." - which in our case is definitely not going to work, since it could easily delete nodes that have active players on them.



That being said, when we look at this situation more closely, we discover that we can break it down into three separate strategies that, when combined, make scaling down a manageable problem we can implement ourselves:



1. Group game servers together to avoid fragmentation across the cluster
2. Cordon nodes when the available CPU capacity exceeds the configured buffer
3. Delete a cordoned node from the cluster once all the games on it have exited



Let's look at each of these in detail.



Grouping Game Servers Together in the Cluster



We want to prevent fragmentation of the game servers across the cluster, so that we don't end up with a small but still functioning set of game servers spread across multiple nodes. That would stop those nodes from being shut down and their resources reclaimed.



This means that we don't want a scheduling pattern that places game server Pods on random nodes across our cluster.



Instead, we want our game server Pods to be packed as tightly as possible.



To group our game servers together, we can take advantage of Kubernetes' PodAffinity configuration with the PreferredDuringSchedulingIgnoredDuringExecution option. This lets us tell the scheduler that we prefer to group Pods by the hostname of the node they run on. In other words, Kubernetes will prefer to place a dedicated game server Pod on a node that already has a dedicated game server Pod on it.



In an ideal world, we would want a dedicated game server Pod to be scheduled on the node with the most dedicated game server Pods, provided that node has sufficient CPU resources. While we could certainly do this if needed, we will use the PodAffinity approach to simplify our demo. This is sufficient for our purposes, given the short length of the games and the fact that we will soon be adding (and explaining!) node cordoning.



When we add PodAffinity to the previous post's setup, we tell Kubernetes that, whenever possible, it should place Pods with the label sessions: game together on the same node, as shown below.
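Here is a minimal sketch of what that affinity could look like, expressed with the Kubernetes Go API types rather than the original manifest; the sessions: game label matches the convention used earlier in the series, but the function and package names are illustrative.

```go
package scaling

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// gameServerAffinity returns a PodAffinity that asks the scheduler to prefer
// placing a game server Pod on a node that already runs Pods labelled
// sessions: game, grouped by the node's hostname.
func gameServerAffinity() *corev1.Affinity {
	return &corev1.Affinity{
		PodAffinity: &corev1.PodAffinity{
			// "Preferred" rather than "Required", so scheduling still succeeds
			// on an empty node when no game servers are running yet.
			PreferredDuringSchedulingIgnoredDuringExecution: []corev1.WeightedPodAffinityTerm{
				{
					Weight: 100,
					PodAffinityTerm: corev1.PodAffinityTerm{
						LabelSelector: &metav1.LabelSelector{
							MatchLabels: map[string]string{"sessions": "game"},
						},
						// Group by hostname, i.e. pack Pods onto the same node.
						TopologyKey: "kubernetes.io/hostname",
					},
				},
			},
		},
	}
}
```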



Cordoning Nodes



Now that we have all our game servers packed together in the cluster, we can talk about "cordoning nodes". What does cordoning nodes really mean? Kubernetes lets us tell the scheduler: "Hey scheduler, please don't schedule anything new on this node." This ensures that no new Pods are scheduled on that node. The Kubernetes documentation refers to this as marking a node as unschedulable.
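As a rough illustration (not the post's actual code), cordoning a node through the Kubernetes API amounts to flipping its Unschedulable flag. The client here is an assumed, already-configured client-go clientset.

```go
package scaling

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// setUnschedulable cordons (true) or uncordons (false) a node by name.
// A cordoned node keeps its existing Pods, but the scheduler will not
// place anything new on it.
func setUnschedulable(ctx context.Context, c kubernetes.Interface, name string, cordon bool) error {
	node, err := c.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	node.Spec.Unschedulable = cordon
	_, err = c.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}
```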



If you focus on the section "s.bufferCount" in the code below, you will see that we request to cordon nodes whenever the CPU capacity we actually have available is greater than the buffer we have configured as our need. We've stripped some parts out for brevity, but you can see the original here.
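The original source isn't reproduced here; the following is only a simplified sketch of that decision, covering both the cordon and the uncordon direction. The name bufferCount mirrors the post's description, but the function itself is an assumption, not Paddle Soccer's actual implementation.

```go
package scaling

// Action is the scaling decision for the current control-loop tick.
type Action int

const (
	NoOp Action = iota
	CordonNodes
	UncordonOrAddNodes
)

// decide compares the spare CPU capacity the cluster actually has available
// (measured in game-server-sized units) against the configured buffer.
func decide(available, bufferCount int64) Action {
	switch {
	case available > bufferCount:
		// More spare capacity than we need: start cordoning nodes so they
		// can drain and eventually be deleted.
		return CordonNodes
	case available < bufferCount:
		// Not enough spare capacity: uncordon any cordoned nodes first
		// (faster than creating new ones), then add nodes if still short.
		return UncordonOrAddNodes
	default:
		return NoOp
	}
}
```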



You can also see in the code above that we uncordon any available cordoned nodes in the cluster if we fall below the configured CPU buffer, since this is faster than adding a whole new node - so it's important to check for cordoned nodes before creating new ones. We also have a delay that controls how long a node must be cordoned before it is deleted (you can see the source here), which helps limit unnecessary thrashing when creating and deleting cluster nodes.



This is a good place to start, but we only want to cordon the nodes that have the fewest game server Pods on them, since they are the most likely to empty first as their game sessions end.



Thanks to the Kubernetes API, it's relatively straightforward to count the number of game server Pods on each node and sort the nodes in ascending order. We can then do the arithmetic to check whether we would still be above the CPU buffer if each of those nodes were cordoned. If so, we can safely cordon them.
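A sketch of that counting and sorting might look like the following, assuming the game server Pods carry the sessions=game label from earlier in the series and live in the default namespace; the helper and type names are illustrative.

```go
package scaling

import (
	"context"
	"sort"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// nodeCount pairs a node name with how many game server Pods it is running.
type nodeCount struct {
	name  string
	count int
}

// gameServerCounts lists every game server Pod, tallies them per node, and
// returns the nodes sorted emptiest-first, so cordoning can start with the
// nodes most likely to drain soonest.
func gameServerCounts(ctx context.Context, c kubernetes.Interface) ([]nodeCount, error) {
	pods, err := c.CoreV1().Pods(metav1.NamespaceDefault).List(ctx, metav1.ListOptions{
		LabelSelector: "sessions=game",
	})
	if err != nil {
		return nil, err
	}

	counts := map[string]int{}
	for _, p := range pods.Items {
		counts[p.Spec.NodeName]++
	}

	result := make([]nodeCount, 0, len(counts))
	for name, count := range counts {
		result = append(result, nodeCount{name: name, count: count})
	}
	sort.Slice(result, func(i, j int) bool { return result[i].count < result[j].count })
	return result, nil
}
```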



Removing Nodes



Now that we have nodes in our cluster being cordoned, it is just a matter of waiting until a cordoned node is empty of game server Pods before deleting it. The code below also ensures that the node count never drops below a pre-defined minimum, to maintain a baseline of capacity within the cluster.



You can see this in the code in its original context; a simplified sketch of the idea follows.
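This sketch is an assumption about the shape of that clean-up, not Paddle Soccer's actual code: minNodeCount and the injected deleteNode function (which in practice would call the cloud provider, e.g. a GCE instance-group API, to remove the machine) are illustrative, and the sessions=game label and default namespace follow the conventions used earlier in the series.

```go
package scaling

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteEmptyCordonedNodes removes cordoned nodes that no longer run any game
// server Pods, while never letting the cluster shrink below minNodeCount.
func deleteEmptyCordonedNodes(ctx context.Context, c kubernetes.Interface,
	minNodeCount int, deleteNode func(ctx context.Context, name string) error) error {

	nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	remaining := len(nodes.Items)

	for _, n := range nodes.Items {
		// Never shrink below the configured baseline capacity.
		if remaining <= minNodeCount {
			return nil
		}
		if !n.Spec.Unschedulable {
			continue // only cordoned nodes are candidates for deletion
		}

		// Only delete the node once it no longer runs any game server Pods.
		pods, err := c.CoreV1().Pods(metav1.NamespaceDefault).List(ctx, metav1.ListOptions{
			LabelSelector: "sessions=game",
			FieldSelector: "spec.nodeName=" + n.Name,
		})
		if err != nil {
			return err
		}
		if len(pods.Items) > 0 {
			continue
		}

		if err := deleteNode(ctx, n.Name); err != nil {
			return err
		}
		remaining--
	}
	return nil
}
```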



We've successfully containerised our game servers, scaled them up as demand increases, and now scaled our Kubernetes cluster down, so we don't have to pay for underutilised machines - all powered by the APIs and capabilities that Kubernetes makes available out of the box. Although it would take more work to turn this into a production system, you can already take advantage of the many building blocks.



Before we get to the end, I want to apologise for the delay in producing this fourth installment of the series. You may have seen the announcement and realised that I spent a lot of my time developing and publishing Agones, an open source, productised version of the ideas in this series for running dedicated game servers on Kubernetes.



This will be the final installment of this series. I had already done the work to implement scaling down before I started on Agones, but instead of building new functionality such as global cluster management into Paddle Soccer, I'm going to focus my efforts on Agones and bringing it up to its full, production-ready 1.0 milestone.



I'm excited about Agones' future. Please visit the GitHub repository and join the Slack. Follow us on Twitter and sign up for the mailing list. We are actively looking for more contributors and would love to have your help.



You can reach me via Twitter if you have any questions or comments. You can also see my presentations on this topic at GDC and GCAP from 2017, as well as check out the code on GitHub.

