EC2 Spot deployments

In the previous sections we’ve already used EC2 Spot instances. We have learned so far that when using EC2 Spot Karpenter selects a diversified list of instances and uses the allocation strategy capacity-optimized-prioritized to select the EC2 Spot pools that are optimal to reduce the frequency of Spot terminations while still being ideal for reducing the waste of the pending pods to place.

In this section we will check how to set up the node termination handler to gracefully handle Spot terminations and rebalancing recommendation signals. To support this configuration we will use the default Provisioner.

Before moving to the exercises. Let’s apply Spot Best practices and make sure we handle Spot instances properly from now on.

Deploy Node-Termination-Handler on EC2 Spot Instances

When users requests On-Demand Instances from a pool to the point that the pool is depleted, the system will select a set of Spot Instances from the pool to be terminated. A Spot Instance pool is a set of unused EC2 instances with the same instance type (for example, m5.large), operating system, Availability Zone, and network platform. The Spot Instance is sent an interruption notice two minutes ahead to gracefully wrap up things.

Amazon EC2 terminates, your Spot Instance when Amazon EC2 needs the capacity back (or the Spot price exceeds the maximum price for your request). More recently Spot instances support also instance rebalancing recommendation. Amazon EC2 emits an instance rebalance recommendation signal to notify you that a Spot Instance is at an elevated risk of interruption. This signal gives you the opportunity to proactively rebalance your workloads across existing or new Spot Instances without having to wait for the two-minute Spot Instance interruption notice.

To capture this signals and handle graceful termination of our nodes, we will deploy a project called AWS Node Termination Handler. Node termination handler operates in two different modes Queue Mode and Instance Metadata Mode. We will use Instance Metadata mode. In this mode the aws-node-termination-handler Instance Metadata Service Monitor will run a small pod (DaemonSet) on each host to perform monitoring of IMDS paths like /spot or /events and react accordingly to drain and/or cordon the corresponding node.

To deploy the Node Termination Handler run the following command:

helm repo add eks https://aws.github.io/eks-charts
helm install aws-node-termination-handler \
             --namespace kube-system \
             --version 0.16.0 \
             --set nodeSelector."karpenter\\.sh/capacity-type"=spot \
             eks/aws-node-termination-handler

The helm command above does make use of the nodeSelector pointing to kubernetes.sh/capacity-type: spot. This way will only install the node-termination-handler in Spot nodes.

Create a Spot Deployment

Let’s create a deployment that uses Spot instances.

cat <<EOF > inflate-spot.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate-spot
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate-spot
  template:
    metadata:
      labels:
        app: inflate-spot
    spec:
      nodeSelector:
        intent: apps
        karpenter.sh/capacity-type: spot
      containers:
      - image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
        name: inflate-spot
        resources:
          requests:
            cpu: "1"
            memory: 256M
EOF
kubectl apply -f inflate-spot.yaml

Challenge

You can use Kube-ops-view or just plain kubectl cli to visualize the changes and answer the questions below. In the answers we will provide the CLI commands that will help you check the resposnes. Remember: to get the url of kube-ops-view you can run the following command kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'

Answer the following questions. You can expand each question to get a detailed answer and validate your understanding.

1) Scale up the number of replicas to 2. How can you confirm only the Spot instance has the Node-Termination-Handler installed ?

Click here to show the answer

2) (Optional) Is that all for EC2 Spot Best practices ?

Click here to show the answer

3) Scale both deployments to 0 replicas ?

Click here to show the answer

What Have we learned in this section :

In this section we have learned:

  • How to apply Spot best practices and deploy the AWS Node Termination handler to manage Spot terminations and rebalancing recommendations.

  • How future versions of Karpenter will enable a better integration of Spot Best practices by proactively managing rebalancing signals.