Disclaimer: this document was generated by ChatGPT.
cgroups — short for control groups — are a Linux kernel feature that lets you limit, prioritize, account for, and isolate resource usage (CPU, memory, I/O, etc.) of processes.
They are one of the core building blocks of modern container systems like Docker and Kubernetes.
Before cgroups, Linux had:
- nice for CPU priority
- ulimit for per-process limits
- setrlimit() syscall
- cpusets for CPU affinity
But there was no:
- Hierarchical resource control
- Way to manage groups of processes
- Unified interface for memory + CPU + I/O together
Google engineers developed cgroups in 2006 to solve this. It was merged into Linux 2.6.24 (2008).
A cgroup is a group of processes.
You can:
- Limit their CPU time
- Restrict memory usage
- Limit disk I/O
- Control network traffic
- Track usage statistics
Unlike ulimit, cgroups:
- Work hierarchically
- Apply to groups
- Are dynamic
- Work well with containers
A cgroup itself is just a set of processes. Controllers do the actual work; each controller manages one type of resource:
- CPU
- Memory
- I/O
- PIDs
- etc.
cgroups are arranged in a tree.
Child groups inherit restrictions from parents.
There are two versions.
cgroups v1 was released around 2008.
Features:
- Each controller mounted separately
- Flexible but messy
- Controllers could be attached independently
Problems:
- Complex
- Controllers not unified
- Hard to reason about
- Inconsistent behavior
Modern Linux uses cgroups v2, introduced gradually during the 4.x series (the unified hierarchy was declared stable in Linux 4.5, 2016).
Major improvements:
- Single unified hierarchy
- Better delegation model
- Cleaner resource model
- Improved security
- More predictable behavior
If you use modern systemd (like Ubuntu 22+, Debian 12+, etc.), you are likely using v2.
Check:
mount | grep cgroup
If you see cgroup2, you're on v2.
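The same check can be scripted. A small sketch: `stat -fc %T` reports the filesystem type of a mount point, and `cgroup2fs` is the type of the unified v2 hierarchy.

```shell
# Report which cgroup version /sys/fs/cgroup is mounted with.
# "cgroup2fs" means the unified v2 hierarchy; a v1 or hybrid setup
# mounts a tmpfs with per-controller subdirectories instead.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null)
if [ "$fstype" = "cgroup2fs" ]; then
  echo "cgroups v2"
else
  echo "cgroups v1 (or hybrid)"
fi
```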
Mounted at:
/sys/fs/cgroup
Everything is a file.
Example:
/sys/fs/cgroup/mygroup/
Files inside:
cpu.max
memory.max
memory.current
pids.max
io.max
You control behavior by writing to these files.
Controls CPU bandwidth and distribution.
Key files:
cpu.max
cpu.weight
cpu.stat
Example:
echo "20000 100000" > cpu.max
Meaning:
- 20ms CPU time every 100ms
- So max 20% CPU
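The arithmetic, made explicit (both numbers in cpu.max are microseconds):

```shell
# cpu.max is "<quota> <period>": the group may consume <quota> us of CPU
# time in every <period> us window. 20000/100000 -> 20% of one CPU.
quota=20000
period=100000
echo "cap: $(( quota * 100 / period ))% of one CPU"
```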
Controls RAM usage.
Files:
memory.max
memory.current
memory.high
memory.swap.max
Example:
echo 500M > memory.max
If exceeded:
- OOM kill
- Or throttling (if using memory.high)
Important difference:
memory.high → soft limit (throttle)
memory.max → hard limit (OOM kill)
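A sketch combining the two limits (hypothetical group path; requires root on a cgroups v2 host):

```shell
G=/sys/fs/cgroup/mygroup
echo 400M > "$G/memory.high"   # past this: reclaim pressure and throttling
echo 500M > "$G/memory.max"    # past this: the group's OOM killer runs
```

Setting memory.high somewhat below memory.max gives a workload a chance to slow down before it is killed.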
Controls disk bandwidth and IOPS.
File:
io.max
Example:
echo "8:0 rbps=1048576" > io.max
Limits device 8:0 (e.g., /dev/sda) to 1MB/s read.
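io.max wants the device's major:minor numbers rather than a path. A sketch of finding them (assumes /dev/sda exists; writing the limit requires root):

```shell
lsblk -dno MAJ:MIN /dev/sda          # prints the "8:0"-style pair io.max expects
# Cap reads at 1 MB/s and writes at 2 MB/s on that device:
echo "8:0 rbps=1048576 wbps=2097152" > io.max
```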
Limits number of processes.
echo 100 > pids.max
Prevents fork bombs.
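A sketch (hypothetical group path /sys/fs/cgroup/test; requires root on a v2 host):

```shell
echo 100 > /sys/fs/cgroup/test/pids.max     # hard cap on tasks in the group
cat /sys/fs/cgroup/test/pids.current        # how many it holds right now
```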
Limits which CPUs or NUMA nodes are allowed.
cpuset.cpus
cpuset.mems
Used heavily in high-performance systems.
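A sketch (same hypothetical group; the cpuset controller must be enabled in the parent's cgroup.subtree_control):

```shell
echo "0-3" > /sys/fs/cgroup/test/cpuset.cpus   # allow CPUs 0 through 3 only
echo "0"   > /sys/fs/cgroup/test/cpuset.mems   # allow NUMA node 0 only
```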
When you run:
docker run -m 512m --cpus=1 nginx
Docker:
- Creates a new cgroup
- Writes limits into memory.max and cpu.max
- Adds container processes to the group
That’s it.
Containers are basically:
- Namespaces (isolation)
- cgroups (resource control)
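You can see this placement from any process: its cgroup membership is listed in /proc. This runs on any Linux host; inside a container, the path reflects the container's subtree.

```shell
# On a cgroups v2 host this prints a single line, e.g. "0::/user.slice/...".
# On v1 it prints one line per controller ("N:controller:path").
cat /proc/self/cgroup
```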
Modern Linux systems use systemd.
systemd:
- Manages services via cgroups
- Each service runs in its own cgroup
- You can set limits in unit files:
Example:
[Service]
MemoryMax=500M
CPUQuota=50%
systemd translates this into cgroup settings.
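Those unit settings end up as plain cgroup files. A sketch of where to look, assuming cgroups v2 with systemd as cgroup manager and a hypothetical unit named myservice:

```shell
# CPUQuota=50% becomes cpu.max = "50000 100000" (50000us per 100000us window);
# MemoryMax=500M becomes memory.max in bytes.
systemctl show myservice.service -p MemoryMax
cat /sys/fs/cgroup/system.slice/myservice.service/cpu.max
cat /sys/fs/cgroup/system.slice/myservice.service/memory.max
```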
cgroups v2 supports safe delegation.
Example:
- systemd owns root
- It delegates a subtree to Docker
- Docker manages containers inside that subtree
Security rule:
- A process can only control its subtree.
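In the filesystem this is explicit (a sketch; requires root on a v2 host): a parent must opt controllers into its children via cgroup.subtree_control before the children can use them.

```shell
# Which controllers are available at this level:
cat /sys/fs/cgroup/cgroup.controllers        # e.g. "cpuset cpu io memory pids"
# Enable cpu and memory for direct children -- this is what a delegating
# manager like systemd does for the subtrees it hands out:
echo "+cpu +memory" > /sys/fs/cgroup/cgroup.subtree_control
```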
Modern kernels (4.20+) expose PSI (Pressure Stall Information) files:
/proc/pressure/cpu
/proc/pressure/memory
/proc/pressure/io
This shows resource contention metrics.
Extremely useful for:
- Performance tuning
- Autoscaling systems
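Each PSI file contains `some` and `full` lines with rolling averages. A quick way to pull one value out; the sample line is hard-coded so the snippet runs anywhere, but on a real host you would read /proc/pressure/memory.

```shell
# A PSI line looks like:
#   some avg10=1.23 avg60=0.50 avg300=0.10 total=123456
# avg10 is the share of time (%) tasks were stalled over the last 10s.
line='some avg10=1.23 avg60=0.50 avg300=0.10 total=123456'
avg10=$(printf '%s\n' "$line" | sed -n 's/.*avg10=\([0-9.]*\).*/\1/p')
echo "memory pressure (10s avg): ${avg10}%"
```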
In v2:
- OOM is per-cgroup
- Not system-wide
Meaning:
- Only that group gets killed
- Not random system processes
cgroup namespaces are used by containers so they:
- Only see their own subtree
- Cannot see host hierarchy
v2 also supports thread-level resource distribution (threaded cgroups).
Rarely used directly but powerful.
Create group:
mkdir /sys/fs/cgroup/test
Limit memory:
echo 100M > /sys/fs/cgroup/test/memory.max
Add process:
echo <PID> > /sys/fs/cgroup/test/cgroup.procs
Now that process cannot exceed 100MB.
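The same walkthrough as one copy-pasteable script, with cleanup (requires root on a cgroups v2 host; uses a throwaway sleep process rather than a real workload):

```shell
#!/bin/sh
set -e
G=/sys/fs/cgroup/test
mkdir "$G"
echo 100M > "$G/memory.max"      # hard memory cap for the group
sleep 60 &                       # a throwaway process to confine
pid=$!
echo "$pid" > "$G/cgroup.procs"  # move it into the group
cat "$G/memory.current"          # bytes the group is using right now
kill "$pid"
wait "$pid" 2>/dev/null || true
rmdir "$G"                       # an empty cgroup dir can be removed
```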
Internally:
- Each task_struct has pointer to cgroup
- Scheduler checks cpu controller
- Memory allocator checks memory controller
- I/O layer checks io controller
Hooks exist in:
- Scheduler
- Memory allocator
- VFS layer
- Block layer
It is deeply integrated into kernel subsystems.
cgroups prevent:
- Resource exhaustion attacks
- Fork bombs
- Memory DoS
- Container escape via resource abuse
But they are not isolation alone — namespaces + seccomp + capabilities are also required.
Common use cases:
- Containers
- Cloud platforms
- Multi-tenant hosting
- CI/CD runners
- Serverless runtimes
- HPC resource partitioning
- Desktop app sandboxing
Useful inspection tools:
systemd-cgls
systemd-cgtop
cat /proc/<pid>/cgroup
Or:
cat /sys/fs/cgroup/<group>/memory.current
Common pitfalls:
- Mixing v1 and v2: they are not compatible.
- Hard limits (memory.max) can cause abrupt OOM kills.
- cpu.max throttles CPU time; it does not cap CPU frequency.
- memory.swap.max must be configured separately, or a memory-limited group can still push into swap.
Performance characteristics:
- cgroups add minimal overhead
- CPU controller uses CFS bandwidth control
- Memory controller adds accounting overhead
- I/O controller depends on I/O scheduler
For HPC or ultra-low-latency systems:
- Fine-tuning may be necessary
The current trajectory:
- v1 is legacy (effectively deprecated)
- v2 is standard
- Deep integration with container runtimes
- PSI becoming more important for autoscaling
- eBPF integration improving observability
cgroups are:
- A Linux kernel feature
- Used for resource control
- Essential for containers
- Hierarchical
- File-based interface
- Deeply integrated into kernel subsystems
They are one of the most important kernel features in modern cloud computing.
