Scheduling

Manual Scheduling

To schedule a pod manually at creation time, set nodeName in the pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 8080
  nodeName: node02

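For a pod that already exists, nodeName can't be changed. Create a Binding object instead and send it as a POST request to the pod's binding API: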

apiVersion: v1
kind: Binding
metadata:
  name: nginx
target:
  apiVersion: v1
  kind: Node
  name: node02

Labels and Selectors

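Labels are key/value pairs set under metadata.labels.

Selectors filter objects by their labels, for example: kubectl get pods --selector <key>=<value>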
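As a rough sketch of how labels and selectors pair up, the pod below carries a label and the Service selects it. The names and the app=web label are hypothetical, chosen only for illustration:

```yaml
# Pod with a label under metadata.labels (label value is an assumption)
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx
---
# Service that selects every pod labelled app=web
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

With these objects, kubectl get pods --selector app=web would list the pod.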

Taints and Tolerations

  • Taint: tells pods "don't schedule here"

    • Taints are set on nodes

  • Toleration: "you can schedule here even with the taint"

    • Tolerations are set on pods, e.g. tolerate taint=xyz

The taint effect defines what happens to pods that do not tolerate the taint:

  • NoSchedule: new pods are not scheduled on the node

  • PreferNoSchedule: best effort, the scheduler tries to avoid the node

  • NoExecute: also applies to pods already running on the node

    • Once the taint takes effect, the node evicts existing pods that do not tolerate it

Control plane (master) nodes are tainted NoSchedule by default.
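A toleration for a tainted node could be written roughly as follows. The node name node01 and the taint app=blue are assumptions made up for this sketch:

```yaml
# Assumes the node was tainted beforehand with a hypothetical key/value:
#   kubectl taint nodes node01 app=blue:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "blue"
      effect: "NoSchedule"   # must match the effect of the taint
```

A toleration only allows the pod onto the tainted node. It does not force the pod to land there.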

Node Selectors

Adding a nodeSelector to a pod spec restricts the pod to nodes whose labels match.

To label nodes: kubectl label nodes <node-name> <key>=<value>
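A pod using nodeSelector might look like this sketch. The label size=Large is a hypothetical example and is assumed to have been applied to at least one node already:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
  nodeSelector:
    size: Large   # hypothetical node label; must match exactly
```

nodeSelector only supports exact matches on key and value, which is why node affinity exists for more expressive rules.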

Node Affinity

Match expressions support operators such as In, NotIn, Exists, and DoesNotExist.

Available affinity types:

  • requiredDuringSchedulingIgnoredDuringExecution

  • preferredDuringSchedulingIgnoredDuringExecution
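As an illustration of the required type, the sketch below only schedules the pod on nodes whose size label is Large or Medium. The label key and values are assumptions for the example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: size          # hypothetical node label key
                operator: In
                values:
                  - Large
                  - Medium
```

"IgnoredDuringExecution" means a pod that is already running stays put if the node's labels change later.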

Resource Requirements

  • Specify requests with resources.requests

  • Specify limits with resources.limits

By default there are no requests and no limits.

  • If a limit is set but no request, the request defaults to the limit

  • Set at least requests, so a pod isn't starved of resources

If a container uses more memory than its limit, it is OOM-killed. CPU use above the limit is throttled instead.
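Requests and limits are set per container. A minimal sketch, with values picked arbitrarily for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      resources:
        requests:          # what the scheduler reserves on the node
          memory: "256Mi"
          cpu: "250m"
        limits:            # hard ceiling enforced at runtime
          memory: "512Mi"
          cpu: "500m"
```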

We can set defaults for a namespace with LimitRange:
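A LimitRange along these lines is one possibility. The object name, the dev namespace, and all values are hypothetical:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits     # hypothetical name
  namespace: dev           # hypothetical namespace
spec:
  limits:
    - type: Container
      default:             # limit applied when a container sets none
        cpu: 500m
        memory: 512Mi
      defaultRequest:      # request applied when a container sets none
        cpu: 250m
        memory: 256Mi
      max:                 # largest limit a container may ask for
        cpu: "1"
        memory: 1Gi
      min:                 # smallest request a container may ask for
        cpu: 100m
        memory: 128Mi
```

The defaults only apply to pods created after the LimitRange exists. Pods that are already running are not changed.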

We can also set ResourceQuota request and limit for a namespace.
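A ResourceQuota caps the total across all pods in the namespace. The sketch below assumes a hypothetical dev namespace and arbitrary values:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota          # hypothetical name
  namespace: dev           # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"      # sum of all CPU requests in the namespace
    requests.memory: 4Gi
    limits.cpu: "10"       # sum of all CPU limits in the namespace
    limits.memory: 10Gi
```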

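Limits on an existing pod generally can't be changed without deleting and recreating the pod. On a Deployment you can edit the pod template, and the Deployment recreates the pods with the new values.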

DaemonSets

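Runs one copy of a pod on every node in the cluster.

The manifest is very similar to a ReplicaSet's, with kind: DaemonSet and no replicas field.

Under the hood, the default scheduler places DaemonSet pods using node affinity.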
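A DaemonSet manifest could look roughly like this. The monitoring-agent name, label, and image are hypothetical placeholders:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: monitoring-agent
spec:
  selector:
    matchLabels:
      app: monitoring-agent
  template:
    metadata:
      labels:
        app: monitoring-agent    # must match spec.selector.matchLabels
    spec:
      containers:
        - name: monitoring-agent
          image: monitoring-agent   # hypothetical image name
```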

Static Pods

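The kubelet can read pod manifests from a local directory (commonly /etc/kubernetes/manifests) instead of getting them from the kube-apiserver.

Only pods can be created this way, not higher-level objects such as Deployments.

To find the directory, check the kubelet's --pod-manifest-path flag, or the staticPodPath field in the config file passed with --config.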
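In the kubelet config file the setting appears as a top-level field. A minimal sketch, assuming the common /etc/kubernetes/manifests location:

```yaml
# Fragment of the file passed to the kubelet with --config
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
staticPodPath: /etc/kubernetes/manifests   # assumed path; check your node
```

Any pod manifest dropped into that directory is started by the kubelet, and removing the file stops the pod.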

We can view these by listing containers:

  • crictl ps

  • nerdctl ps

  • docker ps

The cluster is aware of static pods (they show up as read-only mirror pods), but they can only be changed by editing the manifest files on the node.

kubeadm sets up the control plane components this way.

Multiple Schedulers

We can add custom schedulers.

If the scheduler runs as a process, give it its own systemd service that points at a YAML config with --config. The scheduler's name is set in that config.
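The config file for an additional scheduler can be as small as the sketch below. The name my-custom-scheduler is hypothetical, and a real setup usually needs more settings (for example a kubeconfig and leader election):

```yaml
# File passed to kube-scheduler with --config
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-custom-scheduler   # hypothetical; must be unique in the cluster
```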

If the scheduler runs in a pod, deploy it like any other pod or Deployment, passing the config file to the kube-scheduler container with --config.

Configure Multiple Schedulers

On pod creation, set schedulerName in the pod spec to direct the pod to the custom scheduler:
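A pod that asks for a specific scheduler might look like this. The scheduler name my-custom-scheduler is a hypothetical example and must match the name the custom scheduler was configured with:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  schedulerName: my-custom-scheduler   # hypothetical scheduler name
  containers:
    - name: nginx
      image: nginx
```

If no scheduler with that name is running, the pod stays in Pending.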

Scheduler Profiles

Scheduling has several stages, each of which can have associated plugins:

  • Scheduling queue

  • Filtering

  • Scoring

  • Binding

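To customize the plugins for each stage, the scheduler exposes extension points (for example queueSort, filter, score, and bind) where plugins can be enabled or disabled.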

We can set multiple profiles for one scheduler binary:
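One way to define two profiles in a single config is sketched below. The second profile's name is hypothetical, and it disables all scoring plugins purely as an illustration:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
  - schedulerName: no-scoring-scheduler   # hypothetical profile name
    plugins:
      preScore:
        disabled:
          - name: '*'    # disable every plugin at this extension point
      score:
        disabled:
          - name: '*'
```

Pods pick a profile the same way they pick a scheduler, through spec.schedulerName. Running profiles in one binary avoids the coordination problems of separate scheduler processes.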
