Explore Kubernetes and Azure Low-priority VMs on Virtual Machine Scale Sets with kubeadm | Aaron|MSFT

Update: I recently contributed support for Low-priority VMs to Azure Container Service Engine (acs-engine) (0.18+ with k8s 1.10+), which is a great option for production clusters. You can find an example here. Blog post to follow!

There are many great ways to run Kubernetes on Azure. One could choose the fully managed Azure Container Service (AKS) or the open source Azure Container Service Engine (acs-engine) that powers it, either of which can be complemented by the Virtual Kubelet and serverless Azure Container Instances (ACI), which I covered in my previous post.

Another great way to explore Kubernetes is via its built-in kubeadm toolkit which will help you bootstrap a cluster on Azure Virtual Machines, or indeed anywhere. This method for bootstrapping a cluster is great for test/dev and also very portable to on-premises environments including virtual machines, bare metal, or even a Raspberry Pi!

After creating a VM and initializing the master node using kubeadm init, it would be possible to create individual VMs for each agent node and join them using the kubeadm join command. However, we can also use Azure Virtual Machine Scale Sets for our agent nodes, which lets us deploy and manage a set of identical, auto-scaling virtual machines and configure them via the same cloud-init workflow we would use for a stand-alone Virtual Machine.

The launch of Low-priority VMs on Virtual Machine Scale Sets (Preview) opens up an exciting opportunity to pair Kubernetes with a pool of Low-priority nodes at a discount of up to 80%. Low-priority VMs are especially powerful when paired with Kubernetes Jobs for run-to-completion workloads. Previously only available via Azure Batch, they allow us to “…take advantage of our unutilized capacity at a significant cost savings. At any point in time when Azure needs the capacity back, the Azure infrastructure will evict low-priority VMs. Therefore, low-priority VMs are great for workloads that can handle interruptions like batch processing jobs, dev/test environments, large compute workloads, and more”.

For this example we will need an Azure Account, bash, and the Azure CLI which is available via Cloud Shell, Docker, Linux, Mac or Windows (where bash is also available via the Windows Subsystem for Linux).

We’ll start with two cloud-init files, cloud-init-master.sh, and cloud-init-node.sh. In our case these are plain bash scripts. In these files we will:

  • [master & agent] Use apt-get to install docker, kubelet, kubeadm and kubectl.
  • [master] Run kubeadm init with a pre-generated token, which we’ll use later for kubeadm join.
  • [master] Copy the kubeconfig from /etc/kubernetes/admin.conf to /home/kubeconfig so we can access it later from any user.
  • [master] Install the Calico pod network.
  • [agent] Run kubeadm join with the pre-generated token we used for kubeadm init.
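
The token handling in the steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of the actual cloud-init files: gen_bootstrap_token is a hypothetical helper, MASTER_IP is an assumed private address for the master VM, and the kubeadm commands are only printed rather than executed. kubeadm expects tokens of the form [a-z0-9]{6}.[a-z0-9]{16}, and hex digits are a subset of that alphabet.

```shell
# Hypothetical helper: build a token in kubeadm's [a-z0-9]{6}.[a-z0-9]{16} format.
# od emits hex bytes; 3 bytes give 6 characters and 8 bytes give 16 characters.
gen_bootstrap_token() {
  token_id=$(od -An -N3 -tx1 /dev/urandom | tr -d ' \n')
  token_secret=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
  printf '%s.%s\n' "$token_id" "$token_secret"
}

TOKEN=$(gen_bootstrap_token)

# Assumption: the private IP the master VM receives in subnet1 (check your own deployment).
MASTER_IP='10.0.0.4'

# Print the commands each script would run, instead of running them here.
echo "master: kubeadm init --token $TOKEN"
echo "agent:  kubeadm join --token $TOKEN $MASTER_IP:6443 --discovery-token-unsafe-skip-ca-verification"
```

Because the same token has to appear in both scripts, generating it once up front and substituting it into each file is one way to keep the master and agents in sync. Skipping CA verification on join is convenient for test/dev but weakens the trust check, so treat it as a shortcut.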

Download cloud-init-master.sh and cloud-init-node.sh locally so they can be passed via the --custom-data flag when we create the Virtual Machine for our master node as well as the Virtual Machine Scale Set for our agent nodes using this bash snippet:

# 1. azure
# --------

RESOURCE_GROUP='180300-k8s-vmss'
LOCATION='eastus'
IMAGE='UbuntuLTS'
MASTER_SKU='Standard_D1_v2'
AGENT_SKU='Standard_D1_v2'

az group create -g $RESOURCE_GROUP -l $LOCATION

az vm create -g $RESOURCE_GROUP -n 'linux1' \
  --size $MASTER_SKU \
  --image $IMAGE \
  --public-ip-address-dns-name 'ip1-'$RESOURCE_GROUP \
  --vnet-name vnet1 \
  --subnet subnet1 \
  --custom-data cloud-init-master.sh \
  --generate-ssh-keys

az vmss create -g $RESOURCE_GROUP -n vmss0 \
  --vm-sku $AGENT_SKU \
  --image $IMAGE \
  --public-ip-address-dns-name 'lb1-'$RESOURCE_GROUP \
  --upgrade-policy-mode automatic \
  --instance-count 2 \
  --vnet-name vnet1 \
  --subnet subnet1 \
  --custom-data cloud-init-node.sh \
  --priority Low \
  --generate-ssh-keys
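
Since both az commands read the cloud-init files from the current directory, it may be worth checking that the files are present before creating anything. The helper below is a hypothetical pre-flight check (require_files is not part of the Azure CLI), shown as a sketch:

```shell
# Hypothetical pre-flight helper: fail early if any given file is missing or empty.
require_files() {
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
}

# Usage, placed before the az commands:
# require_files cloud-init-master.sh cloud-init-node.sh || exit 1
```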

Once that is up and running we can SSH into our master node where we can use kubectl and the kubeconfig on the master node itself:

# 2. master node (vm)
# -------------------

# ssh into master node
ssh $USER'@ip1-'$RESOURCE_GROUP'.'$LOCATION'.cloudapp.azure.com'

sudo chown $(id -u):$(id -g) /home/kubeconfig
export KUBECONFIG=/home/kubeconfig

kubectl get nodes

We can use the az vmss get-instance-view command to see the status of our Agent Nodes. Alternatively we can SSH into one of the instances (in this case the first one) in the scale set. Each Virtual Machine in a Virtual Machine Scale Set is optionally exposed to the outside world via a NAT rule that routes a high port (50000 + our Instance ID) to port 22 of a Linux instance.

Finally, as bootstrapping the agent node can take some time, we can confirm that the cloud-init process has completed by looking for the /tmp/hello.txt file that gets created upon completion and/or take a look at the cloud-init.log and cloud-init-output.log files.

# 3. agent nodes (vmss)
# ---------------------

# check the status of your instances
az vmss get-instance-view -g $RESOURCE_GROUP -n vmss0 --instance-id '*'

# ssh into vmss node
INSTANCE_ID=$(az vmss list-instances -g $RESOURCE_GROUP -n vmss0 | jq -r '.[0].instanceId')
# the NAT port is 50000 + instance ID, so add the numbers rather than joining strings
ssh $USER'@lb1-'$RESOURCE_GROUP'.'$LOCATION'.cloudapp.azure.com' -p $((50000 + INSTANCE_ID))

# confirm cloud-init is complete
cat /tmp/hello.txt
tail -f /var/log/cloud-init.log
tail -f /var/log/cloud-init-output.log
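
The NAT rule described above maps port 50000 + Instance ID to port 22, which is arithmetic rather than string concatenation. A small sketch (vmss_ssh_port is a hypothetical helper, and the base of 50000 is taken from the description above) shows why this matters once instance IDs reach two digits:

```shell
# Hypothetical helper: compute the SSH NAT port for a scale set instance.
# Assumes the NAT pool starts at 50000, as described in this post.
vmss_ssh_port() {
  echo $((50000 + $1))
}

vmss_ssh_port 3    # prints 50003 (joining '5000' and '3' happens to give the same result)
vmss_ssh_port 12   # prints 50012 (joining '5000' and '12' would give 500012, not a valid port)

# The Azure CLI can also report the actual address and port of each instance:
# az vmss list-instance-connection-info -g $RESOURCE_GROUP -n vmss0
```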

When we return to our master node, we should be able to see our agent nodes have registered and are visible when we run kubectl get nodes.
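
To check registration without reading the table by eye, we could count the nodes whose status is Ready. The sketch below assumes the column layout of kubectl get nodes --no-headers (name, status, roles, age, version); count_ready_nodes is a hypothetical helper and the sample rows are made up purely to exercise it.

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Reads `kubectl get nodes --no-headers` style output on stdin.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Illustrative input only (made-up node names and versions), not real cluster output.
sample='linux1 Ready master 10m v1.10.0
vmss0000000 Ready <none> 5m v1.10.0
vmss0000001 NotReady <none> 1m v1.10.0'

printf '%s\n' "$sample" | count_ready_nodes   # prints 2

# On the master node:
# kubectl get nodes --no-headers | count_ready_nodes
```

Note that a cordoned node reports a status such as Ready,SchedulingDisabled, which this exact match deliberately does not count.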

Next, we’ll run a couple of Jobs. These examples are from the excellent Kubernetes Up and Running (web | sample chapters | examples) by Kelsey Hightower (Google), Brendan Burns (Microsoft) and Joe Beda (Heptio).

We start with a “one shot” job, which runs a single pod once until it terminates successfully. Then we run a parallel job that keeps 5 pods running at a time until 10 pods have completed.

# 4. kubectl job (on master node)
# -------------------------------

kubectl create -f https://raw.githubusercontent.com/kubernetes-up-and-running/examples/master/10-1-job-oneshot.yaml

kubectl describe jobs/oneshot

kubectl create -f https://raw.githubusercontent.com/kubernetes-up-and-running/examples/master/10-3-job-parallel.yaml

kubectl describe jobs/parallel

kubectl get pods
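
For readers who want to see what drives the parallel behaviour, here is a minimal sketch of a Job manifest written from bash. It is not the manifest from the book: the job name, the busybox image and the command are placeholders chosen for illustration. The two fields that matter are parallelism (how many pods run at once) and completions (how many successful pods are required).

```shell
# Write a hypothetical parallel Job manifest to a temporary file.
JOB_FILE=$(mktemp)

cat > "$JOB_FILE" <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-sketch
spec:
  parallelism: 5    # run up to 5 pods at the same time
  completions: 10   # finish once 10 pods have succeeded
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo working; sleep 5"]
EOF

echo "wrote $JOB_FILE"

# On the master node:
# kubectl create -f "$JOB_FILE"
```

If a Low-priority node is evicted mid-run, the Job controller should schedule replacement pods on the remaining nodes until the completions count is reached, which is what makes Jobs a good fit here.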

Kubernetes is a fantastic way to reliably run workloads across a pool of compute, whether you are optimizing for uptime (an application that must stay online through node or more widespread failures) or for cost (running atop cheaper compute that can be “evicted” at any time).

kubeadm is a great way to explore Kubernetes and all of its components in any environment and with the operating system of your choice.

A few points to consider:

  • Though you may find this example useful if optimizing for compute cost rather than uptime, kubeadm is designed to be “a simple way for new users to start trying Kubernetes out” and is not recommended for production deployments with high uptime requirements. For example, we only have one master node, so our control plane is not highly available.
  • We are using 100% Low-Priority VMs in this instance. If we wish to ensure our jobs will continue running whether or not Low-Priority nodes are available, we could create multiple pools, including one that includes Standard VMs that are guaranteed to be always running.
  • kubeadm doesn’t automatically configure the Azure Cloud Provider in Kubernetes like AKS and acs-engine do (which I will cover in an upcoming post). This is fine if we’re using it for Jobs and we do not need a Service that exposes pods via an Azure Load Balancer, or Persistent Volumes backed by Azure Disks or Azure Files, etc.
  • If we wanted to take our cost-savings further, we could even use the B-Series burstable VM types for our Master VM (running on a Standard vs Low-Priority node). This also works very well for a single-node Kubernetes instance for test/dev where you could schedule Pods on the master node.
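
The multiple-pool idea from the list above could look roughly like this. It is a dry-run sketch: pool_cmd is a hypothetical helper that only prints the az vmss create command for each pool, vmss1 is an assumed name for the second scale set, and the load balancer DNS flag from the earlier snippet is omitted for brevity. It relies on --priority also accepting Regular for standard VMs.

```shell
# Defaults mirror the variables used earlier in this post.
RESOURCE_GROUP=${RESOURCE_GROUP:-180300-k8s-vmss}
AGENT_SKU=${AGENT_SKU:-Standard_D1_v2}
IMAGE=${IMAGE:-UbuntuLTS}

# Hypothetical helper: print (not run) the command that would create one agent pool.
# $1 = scale set name, $2 = priority (Low or Regular), $3 = instance count
pool_cmd() {
  echo "az vmss create -g $RESOURCE_GROUP -n $1" \
    "--vm-sku $AGENT_SKU --image $IMAGE" \
    "--instance-count $3 --vnet-name vnet1 --subnet subnet1" \
    "--custom-data cloud-init-node.sh --priority $2 --generate-ssh-keys"
}

pool_cmd vmss0 Low 2       # the existing Low-priority pool
pool_cmd vmss1 Regular 1   # an additional pool that is not subject to eviction
```

Both pools reuse cloud-init-node.sh, so nodes from either scale set would join the same cluster with the same token.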

We’ll explore more of the above in a future post. Questions or feedback welcome! You can reach out to me any time on twitter via @as_w, and my DMs are always open.

© Aaron|MSFT ~ "My Software Fixes Things" ~ @as_w ~ aka.ms/aaronw