Quick start: Priority Core Turbo (PCT) in Kubernetes with Balloons
Balloons policy is an NRI plugin to container runtimes, containerd and CRI-O. The policy creates balloons that associate sets of containers with sets of CPUs. Containers belonging to a balloon are allowed to run on the CPUs belonging to the same balloon and not on CPUs belonging to other balloons. Balloons policy allows different CPU tunings on different balloons. This guide shows how to run high-priority containers with maximum turbo frequencies in Kubernetes by scheduling them on nodes with free PCT capacity, and by tuning their CPUs.
This short guide shows the minimum required to:
install the balloons NRI policy in a Kubernetes cluster,
configure it so that nodes with Intel Priority Core Turbo (PCT) hardware publish a
cpuclass.balloons.nri.io/hp-pctextended resource the scheduler can bin-pack on, andrun two Burstable pods on the same node – one that asks for HP cores and runs at the platform’s top turbo frequency, and one that does not and runs at the base frequency – and observe the performance difference with one
kubectl execper pod.
This guide uses balloons plugin’s managed PCT mode, that is, the plugin owns the SST-CP/SST-TF configuration, overriding whatever existing configuration from BIOS settings and/or the intel-speed-select tool. Balloons supports also assoc-only mode that uses pre-defined and only associates balloons’ CPUs to existing CLOSes.
This document does not cover manual configuration of underlying SST technology, step-by-step validation of real CPU frequencies, CPUs-to-containers mapping, benchmarking. These are covered in longer balloons PCT examples written separately for managed-mode and assoc-only-mode.
Prerequisites
A Kubernetes cluster (1.27 or newer) with NRI enabled in every node’s container runtime (containerd >= 1.7 or CRI-O >= 1.26).
At least one node that supports Intel SST-CP and SST-TF, e.g. Intel Xeon 6700P/6900P. Nodes without PCT-capable hardware will simply not publish the
cpuclass.balloons.nri.io/hp-pctextended resource, so HP pods naturally land on nodes that do.kubectlconfigured to talk to the cluster.
1. Install balloons with PCT enabled
helm install nri-resource-policy-balloons nri-plugins/nri-resource-policy-balloons --namespace kube-system --set allowPCT=true
--set allowPCT=true gives the plugin pod the privileged: true
security context and /dev mount it needs to talk to
/dev/isst_interface.
Wait for the daemonset to be Ready (one Pod per node):
kubectl -n kube-system rollout status ds/nri-resource-policy-balloons
2. Apply the minimal PCT policy
The policy below defines two cpuClasses and four balloonTypes.
cpuClasses:
default– the implicit fallback class. It carriespctPriority: low, which makes it the LP class: idle CPUs and every balloon whosecpuClassis unset (i.e. usesdefault) run on cores capped at base frequency. Defining an LP class is required: balloons routes idle CPUs to the LP CLOS so they do not inflate the active-HP-core count on each PCT power domain (punit).hp-pct–pctPriority: high. Containers in this class run on cores programmed for top turbo frequency. ThepublishExtendedResource: trueflag is what makes the scheduler see the per-node HP capacity.
balloonTypes (this part mirrors a typical Kubernetes CPU-manager-style split, with one extra balloon for HP pods):
reserved– the implicit kube-system balloon that runs onreservedResources.cpu.hp-bln– picked by theballoon.balloons.resource-policy.nri.io: hp-blnpod annotation. Uses thehp-pctcpuClass, so its containers get top-turbo HP cores.preferNewBalloons: trueputs each HP pod in a fresh balloon on a separate PCT power domain, andmaxCPUs: 8keeps the balloon within one bucket-0 HP budget per punit on Xeon 6.guaranteed– picked by Kubernetes pod QoS classGuaranteed. Containers get exclusive CPUs in the same spirit as Kubernetes’ built-in CPU manager.burstable– picked by Kubernetes pod QoS classBurstable. Containers share CPUs with other burstables.shareIdleCPUsInSame: packagelets a burstable container burst onto every otherwise-idle CPU in the same CPU package, giving the largest pool of burst CPUs that still preserves data locality (good balance between memory latency and bandwidth).
cat > balloons-pct.yaml <<EOF
apiVersion: config.nri/v1alpha1
kind: BalloonsPolicy
metadata:
name: default
namespace: kube-system
spec:
pinCPU: true
pinMemory: false
allocatorTopologyBalancing: true
balloonTypes:
- name: reserved
- name: hp-bln
cpuClass: hp-pct
maxCPUs: 8
preferNewBalloons: true
- name: guaranteed
matchExpressions:
- key: qosclass
operator: Equals
values: ["Guaranteed"]
preferNewBalloons: true
- name: burstable
matchExpressions:
- key: qosclass
operator: In
values: ["Burstable", "BestEffort"]
shareIdleCPUsInSame: package
minBalloons: 2
cpuClasses:
- name: default
pctPriority: low
pctMinFreq: min
pctMaxFreq: base
- name: hp-pct
pctPriority: high
pctMinFreq: base
pctMaxFreq: turbo
publishExtendedResource: true
EOF
kubectl apply -f balloons-pct.yaml
Verify that the cluster now advertises HP-core capacity per node. PCT-capable nodes show a non-zero number; other nodes omit the resource entirely.
kubectl get nodes -o json | jq -r '
.items[] | [.metadata.name,
(.status.capacity["cpuclass.balloons.nri.io/hp-pct"] // "-")]
| @tsv'
Example output (one PCT-capable node, two regular nodes):
node-pct-1 32
node-2 -
node-3 -
The HP capacity is the platform’s guaranteed top-turbo HP CPU count – the number of CPUs across the node that can simultaneously run at the highest turbo bucket. On a dual-socket Xeon 6776P with eight HP cores per punit and four active punits that is 32.
3. Deploy one HP pod and one normal pod
Both pods use the nginx:stable Docker Official image (which
ships with openssl) and stay idle waiting for kubectl exec.
hp-app– Guaranteed (CPU request equals limit, memory request equals limit). Requestscpuclass.balloons.nri.io/hp-pct: "2". That extended-resource request forces the scheduler to place the pod on a node that has at least two free HP CPUs, and theballoon.balloons.resource-policy.nri.io: hp-blnannotation routes it into the HP balloon.plain-app– Guaranteed (same shape, no extended resource, no annotation). It lands in theguaranteedballoon type by Kubernetes QoS class. It runs on LP cores.
Both pods request the same number of CPUs (2), so the
performance comparison reflects only the HP vs. LP cpuClass
difference, not the CPU count.
Bookkeeping rule. When using
publishExtendedResource, the number of HP CPUs requested viacpuclass.balloons.nri.io/hp-pctmust equal the pod’scpurequest. Each HP CPU consumed by the container has to be counted once in the scheduler’s extended-resource bookkeeping and once in normal CPU bookkeeping; mismatched counts let the scheduler oversubscribe HP CPUs on the node or, conversely, leave them stranded.
cat > pods.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
name: hp-app
annotations:
balloon.balloons.resource-policy.nri.io: hp-bln
spec:
containers:
- name: app
image: docker.io/library/nginx:stable
command: ["sleep", "3600"]
resources:
requests:
cpu: "2"
memory: "64Mi"
cpuclass.balloons.nri.io/hp-pct: "2"
limits:
cpu: "2"
memory: "64Mi"
cpuclass.balloons.nri.io/hp-pct: "2"
---
apiVersion: v1
kind: Pod
metadata:
name: plain-app
spec:
containers:
- name: app
image: docker.io/library/nginx:stable
command: ["sleep", "3600"]
resources:
requests:
cpu: "2"
memory: "64Mi"
limits:
cpu: "2"
memory: "64Mi"
EOF
kubectl apply -f pods.yaml
kubectl wait --for=condition=Ready --timeout=60s pod/hp-app pod/plain-app
If hp-app stays Pending with
FailedScheduling ... Insufficient cpuclass.balloons.nri.io/hp-pct,
either no node in the cluster has HP CPUs available (see step
2’s capacity listing) or every HP CPU is currently taken by
other HP pods. Drop the request to fit, or wait.
To confirm which balloon each pod landed in, ask the plugin:
kubectl -n kube-system logs ds/nri-resource-policy-balloons \
| grep -E 'assigning container default/(hp-app|plain-app)' | tail -2
Example:
assigning container default/hp-app/app to balloon hp-bln[0]{cpus:"32,160", mems:"1"}
assigning container default/plain-app/app to balloon guaranteed[0]{cpus:"64,192", mems:"2"}
hp-app is in the HP balloon; plain-app is in a fresh
guaranteed balloon with its own exclusive CPUs. Both pods
hold the same number of CPUs, so the throughput difference
in the next step reflects only the HP vs. LP frequency.
4. Observe the performance difference
nginx:stable ships with openssl, which has a built-in
self-benchmark (openssl speed) that measures cipher
throughput on the local CPU. Pure CPU work, no I/O, no
warm-up needed. Run it in both pods:
kubectl exec hp-app -- openssl speed -seconds 5 -evp aes-128-cbc 2>&1 | tail -2
kubectl exec plain-app -- openssl speed -seconds 5 -evp aes-128-cbc 2>&1 | tail -2
Example output (Xeon 6776P, 4.6 GHz HP vs. 2.3 GHz LP):
# hp-app
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 1788838.33k 2156328.92k 2205063.68k 2217644.85k 2221249.33k 2221617.97k
# plain-app
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-CBC 893471.74k 1075806.28k 1100445.44k 1106751.28k 1108534.89k 1108836.35k
The HP pod processes AES-128-CBC at roughly 2x the throughput
of the plain pod (2221617.97k vs. 1108836.35k bytes/s on
the 16 KB block size). The ratio mirrors the HP/LP frequency
ratio of the node and is reproducible from one invocation to
the next.
5. Clean up
kubectl delete -f pods.yaml
kubectl delete -f balloons-pct.yaml
helm -n kube-system uninstall balloons
rm -f balloons-pct.yaml pods.yaml
What next
For a deeper, fully verified walk-through of managed mode – including
intel-speed-selectinspection,NodeResourceTopologyverification, and per-podsysbench/turbostatreporting – see balloons-pct-example-auto.md.For the assoc-only PCT mode, where the operator owns the SST-CP/SST-TF configuration and balloons only associates CPUs to operator-programmed CLOSes, see balloons-pct-example-manual.md.
For the full balloons policy reference (other cpuClass fields, C-state control, cpufreq, scheduling-class integration), see balloons documentation.