Scheduling

Manual Scheduling

To manually schedule a pod at creation time, set nodeName in the spec:

apiVersion: v1
kind: Pod
metadata:
 name: nginx
 labels:
  name: nginx
spec:
 containers:
 - name: nginx
   image: nginx
   ports:
   - containerPort: 8080
 nodeName: node02

Alternatively, for a pod that already exists, create a Binding object:

apiVersion: v1
kind: Binding
metadata:
  name: nginx
target:
  apiVersion: v1
  kind: Node
  name: node02
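The Binding object has to be sent as a POST request (with a JSON body) to the pod's binding API. A sketch, assuming `kubectl proxy` is running locally on port 8001:

```shell
# Hypothetical setup: `kubectl proxy` exposes the API server on localhost:8001
curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"apiVersion":"v1","kind":"Binding","metadata":{"name":"nginx"},"target":{"apiVersion":"v1","kind":"Node","name":"node02"}}' \
  http://localhost:8001/api/v1/namespaces/default/pods/nginx/binding
```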

Labels and Selectors

Filter via selectors

Labels in metadata

Can use:

kubectl get pods --selector app=nginx
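Selectors can combine multiple labels (matched with AND), and labels can be displayed alongside resources. The label keys below are examples:

```shell
# Pods matching both labels (example keys/values)
kubectl get pods --selector env=prod,tier=frontend

# Show all labels on listed pods
kubectl get pods --show-labels
```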

Taints and Tolerations

  • Taint: tells pods "don't schedule here"

    • We taint nodes

  • Toleration: "you can schedule here even with the taint"

    • Pods tolerate a taint key=value

kubectl taint nodes <node-name> key=value:taint-effect
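For example, to taint a node and later remove that taint (a trailing `-` removes it):

```shell
kubectl taint nodes node01 app=blue:NoSchedule
kubectl taint nodes node01 app=blue:NoSchedule-   # remove the taint
```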

The taint effect defines what happens to pods that do not tolerate the taint.

  • NoSchedule

  • PreferNoSchedule: Best effort

  • NoExecute: Also applies to pods already running on the node

    • Once the taint takes effect, the node evicts existing pods unless they tolerate the taint

apiVersion: v1
kind: Pod
metadata:
 name: myapp-pod
spec:
 containers:
 - name: nginx-container
   image: nginx
 tolerations:
 - key: "app"
   operator: "Equal"
   value: "blue"
   effect: "NoSchedule"

Control-plane (master) nodes have a NoSchedule taint by default.
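To inspect a node's taints (node name is an example):

```shell
kubectl describe node controlplane | grep -i taint
```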

Node Selectors

We can add nodeSelectors to a pod, which will help with scheduling:

apiVersion: v1
kind: Pod
metadata:
 name: myapp-pod
spec:
 containers:
 - name: data-processor
   image: data-processor
 nodeSelector:
  size: Large

To label nodes:

kubectl label nodes <node-name> <label-key>=<label-value>
kubectl label nodes node-1 size=Large

Node Affinity

apiVersion: v1
kind: Pod
metadata:
 name: myapp-pod
spec:
 containers:
 - name: data-processor
   image: data-processor
 affinity:
   nodeAffinity:
     requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: size
            operator: In
            values:
            - Large
            - Medium

Other options:

   nodeAffinity:
     requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: size
            operator: NotIn
            values:
            - Small
   nodeAffinity:
     requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: size
            operator: Exists

Available node affinity types:

  • requiredDuringSchedulingIgnoredDuringExecution

  • preferredDuringSchedulingIgnoredDuringExecution
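The preferred variant takes weighted preference terms instead of nodeSelectorTerms; a sketch reusing the size label from above:

```yaml
 affinity:
   nodeAffinity:
     preferredDuringSchedulingIgnoredDuringExecution:
     - weight: 1          # 1-100; higher weight wins when scoring nodes
       preference:
         matchExpressions:
         - key: size
           operator: In
           values:
           - Large
```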

Resource Requirements

  • Can specify requirements with resources.requests

  • Can specify limits with resources.limits

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
 containers:
 - name: simple-webapp-color
   image: simple-webapp-color
   ports:
    - containerPort: 8080
   resources:
     requests:
      memory: "1Gi"
      cpu: "1"
     limits:
       memory: "2Gi"
       cpu: "2"

The default is no requests and no limits.

  • If no request is set but a limit is, request = limit

  • Should at least set requests, to avoid pods starving each other of resources

If a pod exceeds its memory limit at runtime, it is OOM-killed.

We can set defaults for a namespace with LimitRange:
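A sketch of a LimitRange applying default requests and limits to containers in its namespace (names and values are examples):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-mem-defaults
spec:
  limits:
  - type: Container
    default:            # default limits
      cpu: 500m
      memory: 512Mi
    defaultRequest:     # default requests
      cpu: 250m
      memory: 256Mi
```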

We can also set ResourceQuota request and limit for a namespace.
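A sketch of a ResourceQuota capping the total requests and limits across a namespace (values are examples):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "10"
    limits.memory: 10Gi
```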

You can't adjust limits on a running pod without deleting it; on a deployment you can, and the deployment will re-create its pods.

DaemonSets

Run one copy of pod on every node in cluster.

Manifest is very similar to a ReplicaSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: monitoring-daemon
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: monitoring-agent
  template:
    metadata:
     labels:
       app: monitoring-agent
    spec:
      containers:
      - name: monitoring-agent
        image: monitoring-agent

Under the hood, DaemonSets use node affinity to pin one pod to each node.

Static Pods

The kubelet can read pod manifests from /etc/kubernetes/manifests instead of getting them from the kube-apiserver.

We can only create pods this way, not higher-level objects like Deployments.

Check the kubelet's --pod-manifest-path flag, or its --config flag pointing to a kubelet config file that sets staticPodPath.
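In the kubelet config file the setting looks like this (the file path is the common kubeadm default, but may differ):

```yaml
# e.g. /var/lib/kubelet/config.yaml
staticPodPath: /etc/kubernetes/manifests
```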

We can view these by listing containers:

  • crictl ps

  • nerdctl ps

  • docker ps

The cluster is aware of static pods (the API server shows read-only mirror pods), but we can only edit them through the manifest files.

kubeadm sets up the control-plane components this way.

Multiple Schedulers

We can add custom schedulers.

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-scheduler

If running the scheduler as a standalone process (e.g. a systemd service), point it at the YAML config with --config; the schedulerName in that config identifies it.

If the scheduler runs in the cluster, simply deploy it as a normal pod or deployment.
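A minimal sketch of running a second scheduler as a pod, assuming its config file is available in the container at the path shown (image tag and paths are examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-custom-scheduler
  namespace: kube-system
spec:
  containers:
  - name: kube-scheduler
    image: registry.k8s.io/kube-scheduler:v1.29.0   # example version
    command:
    - kube-scheduler
    - --config=/etc/kubernetes/my-scheduler-config.yaml
```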

Configure Multiple Schedulers

On pod creation, direct pod to use custom scheduler:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  schedulerName: my-custom-scheduler
To verify which scheduler picked up the pod, check events and the scheduler's logs:

kubectl get events -o wide
kubectl logs my-custom-scheduler -n kube-system

Scheduler Profiles

Scheduling has various stages, each can have associated plugins:

  • Scheduling queue

  • Filtering

  • Scoring

  • Binding

To customize the plugins used at each stage, the scheduler exposes extension points.

We can set multiple profiles for one scheduler binary:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-scheduler
    plugins:
      score:
        disabled: []
        enabled: []
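For example, a profile can disable a default scoring plugin and enable a custom one; the custom plugin name below is illustrative:

```yaml
profiles:
  - schedulerName: my-scheduler-2
    plugins:
      score:
        disabled:
        - name: TaintToleration
        enabled:
        - name: MyCustomPluginA   # hypothetical custom plugin
```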
