Memtierd NRI plugin
This plugin enables managing workloads with Memtierd in Kubernetes.
The plugin’s configuration defines a set of workload classes and their attributes. If a class has a memtierd configuration, the plugin launches memtierd with that configuration to track and manage the memory of each workload that belongs to the class.
The class of a workload is specified in pod annotations.
Workload configuration
The class of a pod or a container is defined using pod annotations:
annotations:
  # Set the default class for all containers in this pod.
  class.memtierd.nri.io: swap-idle-data
  # Override the default class for the c0 container.
  class.memtierd.nri.io/c0: track-working-set-size
  # Do not associate any class with the c1 container.
  class.memtierd.nri.io/c1: ""
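The precedence between the pod-wide and the container-specific annotation can be sketched in Go as follows. This is a minimal illustration, not the plugin's actual code: the helper name effectiveClass is hypothetical, and the sketch assumes that a container-specific key always wins over the pod default and that an empty value means "no class".

```go
package main

import "fmt"

// classAnnotation is the annotation key used in the pod example.
const classAnnotation = "class.memtierd.nri.io"

// effectiveClass is a hypothetical helper (not part of the plugin API).
// It returns the class of a container and whether the container belongs
// to any class. A container-specific annotation (KEY/CONTAINER) overrides
// the pod-wide default, and an empty value means "no class".
func effectiveClass(annotations map[string]string, container string) (string, bool) {
	if class, ok := annotations[classAnnotation+"/"+container]; ok {
		return class, class != ""
	}
	class := annotations[classAnnotation]
	return class, class != ""
}

func main() {
	annotations := map[string]string{
		"class.memtierd.nri.io":    "swap-idle-data",
		"class.memtierd.nri.io/c0": "track-working-set-size",
		"class.memtierd.nri.io/c1": "",
	}
	// c2 has no annotation of its own, so it falls back to the pod default.
	for _, name := range []string{"c0", "c1", "c2"} {
		class, ok := effectiveClass(annotations, name)
		fmt.Printf("%s: class=%q managed=%v\n", name, class, ok)
	}
}
```

With the annotations above, c0 resolves to track-working-set-size, c1 to no class, and c2 (a container without its own annotation) to the pod default swap-idle-data.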
Plugin configuration
Classes
Plugin configuration lists workload classes and their attributes. classes: is followed by a list of maps with the following keys and values:
- name (string): name of the class, matches class.memtierd.nri.io annotations.
- allowswap (true or false): if true, allow the OS to swap the workload. If false, disallow swapping. If not set, the plugin will not affect what will be written to memory.swap.max in cgroups v2.
- memtierdconfig (string): configuration template with which memtierd will be launched to manage workloads in this class. Variables that will be replaced with container-specific values in this template:
  - $CGROUP2_ABS_PATH: absolute path to the cgroups v2 directory to which the container’s processes belong.
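To illustrate how a memtierdconfig template turns into a container-specific configuration, here is a minimal Go sketch of the variable substitution. The helper name expandTemplate and the cgroup path in main are hypothetical, and plain string replacement is an assumption about the mechanism, not a description of the plugin's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTemplate is a hypothetical helper (not the plugin's code).
// It replaces every $CGROUP2_ABS_PATH in a memtierdconfig template
// with the absolute cgroups v2 path of one container.
func expandTemplate(template, cgroupAbsPath string) string {
	return strings.ReplaceAll(template, "$CGROUP2_ABS_PATH", cgroupAbsPath)
}

func main() {
	template := "pidwatcher:\n  name: cgroups\n  config: |\n    cgroups:\n      - $CGROUP2_ABS_PATH\n"
	// Made-up cgroup path, for illustration only.
	fmt.Print(expandTemplate(template, "/sys/fs/cgroup/kubepods.slice/example-container"))
}
```

Each container in the class would get its own expanded copy of the template, which is why the variable has to be resolved per container rather than once per class.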
Example
classes:
- name: swap-idle-data
  allowswap: true
  memtierdconfig: |
    policy:
      name: age
      config: |
        intervalms: 10000
        pidwatcher:
          name: cgroups
          config: |
            cgroups:
              - $CGROUP2_ABS_PATH
        swapoutms: 10000
        tracker:
          name: idlepage
          config: |
            pagesinregion: 512
            maxcountperregion: 1
            scanintervalms: 10000
        mover:
          intervalms: 20
          bandwidth: 50
The configuration defines the swap-idle-data workload class.
allowswap: true makes sure that the OS will allow swapping when memtierd decides that data should be swapped out from memory.
memtierdconfig: ... means that memtierd will manage the memory of a workload in this class. The age policy uses the idlepage tracker to find data that has not been accessed in 10 seconds, and swaps out that data (swapoutms: 10000). Swapping is done at 20 ms intervals (mover.intervalms), at no more than 50 MB/s (mover.bandwidth). Refer to the memtierd documentation for more configuration options.
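As a rough worked example of what the mover settings imply, the Go sketch below computes how much data fits into one mover round. It assumes that the bandwidth budget is spread evenly over the rounds, which is an illustration of the arithmetic and not a statement about memtierd's internal scheduling. The function name mbPerRound is hypothetical.

```go
package main

import "fmt"

// mbPerRound is a hypothetical helper. Assuming the bandwidth limit
// (MB/s) is spread evenly over mover rounds that run every intervalMs
// milliseconds, it returns the data budget of a single round in MB.
func mbPerRound(bandwidthMBps, intervalMs int) float64 {
	return float64(bandwidthMBps) * float64(intervalMs) / 1000.0
}

func main() {
	// Values from the example: mover.bandwidth: 50, mover.intervalms: 20.
	fmt.Printf("rounds per second: %d\n", 1000/20)
	fmt.Printf("budget per round: %.2f MB\n", mbPerRound(50, 20))
}
```

Under this assumption, the example configuration gives 50 rounds per second with a budget of 1 MB each.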
Developer’s guide
Prerequisites
Containerd v1.7+
Enable NRI in /etc/containerd/config.toml:
[plugins."io.containerd.nri.v1.nri"]
  disable = false
  disable_connections = false
  plugin_config_path = "/etc/nri/conf.d"
  plugin_path = "/opt/nri/plugins"
  plugin_registration_timeout = "5s"
  plugin_request_timeout = "2s"
  socket_path = "/var/run/nri/nri.sock"
To run the nri-memtierd plugin on a host, install memtierd on the host.
GOBIN=/usr/local/bin go install github.com/intel/memtierd/cmd/memtierd@latest
Build
cd cmd/plugins/memtierd && go build .
Run
cmd/plugins/memtierd/memtierd -config sample-configs/nri-memtierd.yaml -idx 40 -vv
Manual test
kubectl create -f test/e2e/files/nri-memtierd-test-pod.yaml
See swap status of dd processes, each allocating the same amount of memory:
for pid in $(pidof dd); do
    grep VmSwap /proc/$pid/status
done
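If the swap status needs to be checked programmatically, for example in a test, the VmSwap line can be parsed as in the following Go sketch. The helper name vmSwapKB is hypothetical and the sample status text is made up. The sketch relies only on the "VmSwap: N kB" line format of /proc/PID/status.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// vmSwapKB is a hypothetical helper. It extracts the VmSwap value
// (in kB) from the contents of a /proc/PID/status file.
func vmSwapKB(status string) (int64, error) {
	scanner := bufio.NewScanner(strings.NewReader(status))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) >= 2 && fields[0] == "VmSwap:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("VmSwap line not found")
}

func main() {
	// Made-up excerpt of a status file, for illustration only.
	sample := "Name:\tdd\nVmRSS:\t    1024 kB\nVmSwap:\t  204800 kB\n"
	kb, err := vmSwapKB(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("swapped out: %d kB\n", kb)
}
```

To inspect a live process, the contents of /proc/PID/status could be read with os.ReadFile and passed to the same function.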
Debug
-v enables debug output from the plugin. -vv makes it even more verbose.
The plugin stores memtierd config and output under /tmp/memtierd/NAMESPACE/POD/CONTAINER/.
Debugging the plugin with dlv:
go install github.com/go-delve/delve/cmd/dlv@latest
dlv exec ./memtierd -- -config memtierd.conf -idx 40
(dlv) break plugin.CreateContainer
(dlv) continue
Deploy
Build an image, import it on the node, and deploy the plugin by
running the following in nri-plugins:
rm -rf build
make PLUGINS=nri-memtierd IMAGE_VERSION=devel images
ctr -n k8s.io images import build/images/nri-memtierd-image-*.tar
kubectl create -f build/images/nri-memtierd-deployment-e2e.yaml
The e2e deployment variant gives more debug output from both the
nri-memtierd plugin (see kubectl logs -n kube-system nri-memtierd-*)
and memtierd (see /tmp/memtierd/**/*.output).
Security
memtierd needs privileged access in order to find pids in other
containers, track memory activity, move pages and swap workload data
out and in. Therefore only privileged users must be allowed to create
and modify memtierd configuration files and ConfigMaps. Commands in
memtierd configurations will be executed by memtierd in privileged
mode.