The following guide is for setting up Docker with Docker Compose v2 on Amazon Linux 2023. The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.
Get the current release:
rpm -q system-release --qf "%{VERSION}\n"

Find out the latest release:
sudo dnf check-release-update --latest-only --version-only
# Use the following for more verbose output
#sudo dnf check-release-update

To upgrade the host for the current release:
sudo dnf check-update --refresh
sudo dnf upgrade --refresh

To upgrade the host to the latest release:
#sudo touch /etc/dnf/vars/releasever && echo 'latest' | sudo tee /etc/dnf/vars/releasever
sudo dnf check-update --refresh --releasever=latest
sudo dnf upgrade --refresh --releasever=latest

Install the following packages, which are good to have installed:
sudo dnf install --allowerasing -y \
kernel-modules-extra \
dnf-plugins-core \
dnf-utils \
dnf-plugin-support-info \
git-core \
git-lfs \
grubby \
kexec-tools \
chrony \
audit \
dbus \
dbus-daemon \
polkit \
systemd-pam \
systemd-container \
udisks2 \
nss-util \
nss-tools \
dmidecode \
nvme-cli \
lvm2 \
dosfstools \
e2fsprogs \
xfsprogs \
xfsprogs-xfs_scrub \
attr \
acl \
shadow-utils \
shadow-utils-subid \
fuse3 \
squashfs-tools \
star \
gzip \
pigz \
bzip2 \
zstd \
xz \
unzip \
p7zip \
numactl \
iproute \
iproute-tc \
iptables-nft \
nftables \
conntrack-tools \
ipset \
ethtool \
net-tools \
iputils \
traceroute \
mtr \
telnet \
whois \
socat \
bind-utils \
tcpdump \
cifs-utils \
nfsv4-client-utils \
nfs4-acl-tools \
libseccomp \
psutils \
python3 \
python3-pip \
python3-policycoreutils \
policycoreutils-python-utils \
bash-completion \
vim-minimal \
wget \
jq \
awscli-2 \
ec2rl \
ec2-utils \
htop \
sysstat \
fio \
inotify-tools \
rsync

sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux

sudo dnf install --allowerasing -y amazon-efs-utils

Amazon Linux now ships with the smart-restart package. The smart-restart utility restarts systemd services on system updates whenever a package is installed or deleted using the system's package manager. This occurs whenever dnf <update|upgrade|downgrade> is executed.
smart-restart uses the needs-restarting utility from the dnf-utils package and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).
sudo dnf install --allowerasing -y smart-restart python3-dnf-plugin-post-transaction-actions

After the installation, subsequent transactions will trigger the smart-restart logic.
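As a rough illustration of how the reboot hint could be consumed, the sketch below checks for the marker file named above. The function name `reboot_advised` and its optional path argument are hypothetical and exist only for this example; only the marker path comes from the text.

```shell
#!/usr/bin/env bash
# Sketch: report whether smart-restart left a reboot hint after the last
# dnf transaction. The default path is the marker file described above;
# the function name and the optional argument are illustrative only.

reboot_advised() {
  local marker="${1:-/run/smart-restart/reboot-hint-marker}"
  [ -e "$marker" ]
}

if reboot_advised; then
  echo "smart-restart advises a reboot"
else
  echo "no reboot hint found"
fi
```

A check like this could run at the end of a patching job to decide whether to schedule a reboot, rather than rebooting unconditionally.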
Run the following command to install and enable the kernel live patching feature:
sudo dnf install --allowerasing -y kpatch-dnf kpatch-runtime
sudo dnf kernel-livepatch -y auto
sudo systemctl enable --now kpatch.service

Run the following command to remove the EC2 Hibernation Agent:
sudo dnf remove -y ec2-hibinit-agent

Install the Amazon SSM Agent:
sudo dnf install --allowerasing -y amazon-ssm-agent

The following tweak should resolve these reported issues:
- https://repost.aws/questions/QU_tj7NQl6ReKoG53zzEqYOw/amazon-linux-2023-issue-with-installing-packages-with-cloud-init
- amazonlinux/amazon-linux-2023#397
Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the Amazon SSM Agent is started.
sudo mkdir -p /etc/systemd/system/amazon-ssm-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-ssm-agent.service.d/00-override.conf
[Unit]
# To have a service start after cloud-init.target, it requires the
# addition of DefaultDependencies=no. The default is
# DefaultDependencies=yes, which results in the default target (e.g.
# multi-user.target) depending on the service.
#
# See the following for more details: https://serverfault.com/a/973985
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/usr/bin/amazon-ssm-agent
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now amazon-ssm-agent.service
sudo systemctl try-reload-or-restart amazon-ssm-agent.service
sudo systemctl status amazon-ssm-agent.service

Install the Unified CloudWatch Agent:
sudo dnf install --allowerasing -y amazon-cloudwatch-agent collectd

Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the Unified CloudWatch Agent is started.
sudo mkdir -p /etc/systemd/system/amazon-cloudwatch-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-cloudwatch-agent.service.d/00-override.conf
[Unit]
# To have a service start after cloud-init.target, it requires the
# addition of DefaultDependencies=no. The default is
# DefaultDependencies=yes, which results in the default target (e.g.
# multi-user.target) depending on the service.
#
# See the following for more details: https://serverfault.com/a/973985
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now amazon-cloudwatch-agent.service
sudo systemctl try-reload-or-restart amazon-cloudwatch-agent.service
sudo systemctl status amazon-cloudwatch-agent.service

The current version of the CloudWatchAgentServerPolicy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudwatch:PutMetricData",
"ec2:DescribeVolumes",
"ec2:DescribeTags",
"logs:PutLogEvents",
"logs:DescribeLogStreams",
"logs:DescribeLogGroups",
"logs:CreateLogStream",
"logs:CreateLogGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ssm:GetParameter"
],
"Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
}
]
}

Run the following to install Ansible on the host:
sudo dnf install -y \
python3-psutil \
ansible \
ansible-core \
sshpass

Locale:
sudo localectl set-locale LANG=en_US.UTF-8
localectl

Hostname:
sudo hostnamectl set-hostname <hostname>
sudo hostnamectl set-chassis vm
hostnamectl

Set the system timezone to UTC and ensure chronyd is enabled and started:
sudo timedatectl set-timezone Etc/UTC
sudo systemctl enable --now chronyd
sudo timedatectl set-ntp true
timedatectl
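To confirm the settings took effect without reading the output by eye, a small check along these lines may help. It is a sketch that assumes `timedatectl show` prints KEY=VALUE lines including `Timezone` and `NTPSynchronized`; the helper name `time_settings_ok` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read `timedatectl show`-style KEY=VALUE lines on stdin and succeed
# only if the timezone is UTC and the clock is reported as synchronized.
# `time_settings_ok` is an illustrative name, not part of any tool.

time_settings_ok() {
  local tz="" synced="" key value
  while IFS='=' read -r key value; do
    case "$key" in
      Timezone) tz="$value" ;;
      NTPSynchronized) synced="$value" ;;
    esac
  done
  case "$tz" in
    UTC|Etc/UTC) ;;
    *) return 1 ;;
  esac
  [ "$synced" = "yes" ]
}

# On the host (not executed here):
#   timedatectl show | time_settings_ok && echo "time settings OK"
```

Note that synchronization can take a short while after chronyd starts, so a failing check immediately after boot does not necessarily indicate a misconfiguration.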
Logging:
sudo mkdir -p /etc/systemd/journald.conf.d
cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
[Journal]
SystemMaxUse=100M
RuntimeMaxUse=100M
RuntimeMaxFileSize=10M
RateLimitIntervalSec=1s
RateLimitBurst=10000
EOF
sudo systemctl daemon-reload
sudo systemctl restart systemd-journald.service

touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}

mkdir -pv "${HOME}/bin"
mkdir -pv "${HOME}/.config/environment.d"
mkdir -pv "${HOME}/.config/systemd/user"
mkdir -pv "${HOME}/.config/systemd/user/sockets.target.wants"
mkdir -pv "${HOME}/.local/share/systemd/user"
mkdir -pv "${HOME}/.local/bin"

#cat <<'EOF' | tee ~/.config/environment.d/environment_vars.conf
#PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
#
#EOF

loginctl enable-linger "$(whoami)"
systemctl --user daemon-reload

If you need to switch to the root user, use the following instead of sudo su - <user>:
# sudo machinectl shell <username>@
sudo machinectl shell root@

Run the following command to install Moby, aka Docker:
sudo dnf install --allowerasing -y \
docker \
containerd \
runc \
container-selinux \
cni-plugins \
oci-add-hooks \
amazon-ecr-credential-helper \
udica

Configure the following docker daemon settings:
sudo mkdir -p /etc/docker
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
"debug": false,
"experimental": false,
"exec-opts": ["native.cgroupdriver=systemd"],
"userland-proxy": false,
"live-restore": true,
"log-level": "warn",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m",
"max-file": "3"
}
}
EOF

- https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
- https://docs.docker.com/config/containers/logging/awslogs/
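The log-opts values above bound how much disk the json-file logs can consume: each container keeps at most max-file files of roughly max-size each. The sketch below works through that arithmetic; the function name `max_log_mb` and the container count of 10 are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: worst-case disk usage of json-file container logs.
# $1 = max-size in MB, $2 = max-file, $3 = number of containers.
# `max_log_mb` is an illustrative helper, not a Docker command.

max_log_mb() {
  echo $(( $1 * $2 * $3 ))
}

# With max-size=100m and max-file=3, ten containers can hold up to:
max_log_mb 100 3 10   # prints 3000 (MB)
```

If that ceiling is too high for the root volume, lowering max-size or max-file in daemon.json reduces it proportionally.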
Add the current user (e.g. ec2-user) to the docker group:
sudo usermod -aG docker "$USER"

Enable and start the docker service:
sudo systemctl enable --now docker
sudo systemctl status docker

Install the Docker Compose plugin with the following commands:
# Install the docker compose plugin for all users
sudo mkdir -p /usr/local/lib/docker/cli-plugins
# -f makes curl fail on HTTP errors instead of saving an error page as the binary
sudo curl -sSfL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
-o /usr/local/lib/docker/cli-plugins/docker-compose
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
&& sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-compose
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
&& sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

(Optional) To install for the local user, run the following commands:
mkdir -p "${HOME}/.docker/cli-plugins" \
&& touch "${HOME}/.docker/config.json"
cp /usr/local/lib/docker/cli-plugins/docker-compose "${HOME}/.docker/cli-plugins/docker-compose"
cat <<'EOF' | tee -a "${HOME}/.bashrc"
XDG_CONFIG_HOME="${HOME}/.config"
XDG_DATA_HOME="${HOME}/.local/share"
XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
DBUS_SESSION_BUS_ADDRESS="unix:path=${XDG_RUNTIME_DIR}/bus"
export XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR DBUS_SESSION_BUS_ADDRESS
#DOCKER_CONFIG=/usr/local/lib/docker
DOCKER_CONFIG="${DOCKER_CONFIG:-$HOME/.docker}"
DOCKER_TLS_VERIFY=1
export DOCKER_CONFIG DOCKER_TLS_VERIFY
#DOCKER_HOST="unix:///run/user/$(id -u)/docker.sock"
#export DOCKER_HOST
EOF

Verify the plugin is installed correctly with the following command(s):
docker compose version

(Optional) Install docker scout with the following commands:
<commands goes here>

Note: You can safely skip the next step. It should not be necessary, because the version of Moby shipped in AL2023 bundles the buildx plugin by default.
(Optional) Install the docker buildx plugin with the following commands:
sudo curl -sSfL 'https://github.com/docker/buildx/releases/download/v0.14.0/buildx-v0.14.0.linux-amd64' \
-o /usr/local/lib/docker/cli-plugins/docker-buildx
#sudo curl -sL https://github.com/docker/compose/releases/latest/download/docker-buildx-linux-"$(uname -m)" \
# -o /usr/local/lib/docker/cli-plugins/docker-buildx
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
&& sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-buildx
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
&& sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx
cp /usr/local/lib/docker/cli-plugins/docker-buildx "${HOME}/.docker/cli-plugins/docker-buildx"
docker buildx install

The following step, installing the AWS Nitro Enclaves CLI, is optional; skip it if you do not need it.
sudo dnf install --allowerasing -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel
sudo usermod -aG ne $USER
sudo systemctl enable --now nitro-enclaves-allocator.service

- https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave-cli-install.html
- https://github.com/aws/aws-nitro-enclaves-cli
To install the Nvidia drivers:
sudo dnf install -y wget kernel-modules-extra kernel-devel gcc

Download the driver installer, run it and verify:
curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
sudo sh NVIDIA-Linux-x86_64-535.161.08.run -a -s --ui=none -m=kernel-open
nvidia-smi

For the Nvidia container runtime:
curl -sL 'https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo' | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf check-update
sudo dnf install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

To create an Ubuntu based container with access to the host GPUs:
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

# configure region
# request an IMDSv2 session token, then read the region from the instance identity document
IMDS_TOKEN="$(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")"
aws configure set default.region "$(curl --noproxy '*' -s -H "X-aws-ec2-metadata-token: ${IMDS_TOKEN}" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)"
# use regional endpoints
aws configure set default.sts_regional_endpoints regional
# get credentials from imds
aws configure set default.credential_source Ec2InstanceMetadata
# credentials last for 1 hour
aws configure set default.duration_seconds 3600
# set default pager
aws configure set default.cli_pager ""
# set output to json
aws configure set default.output json

Verify:
aws configure list
aws sts get-caller-identity

Log in to Amazon ECR Public:
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

To create an AL2023 based container:
docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash

- https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
- https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades-usage.html
- Manage package and operating system updates in AL2023
- https://mobyproject.org/
- https://github.com/docker/docker-install
- https://github.com/docker/docker-ce-packaging
- https://download.docker.com/linux/static/stable/
- https://docs.docker.com/compose/install/linux/
- https://github.com/docker/compose/
- https://github.com/docker/docker-credential-helpers
- https://github.com/docker/buildx