Gentoo on AWS

Author: Artem Butusov

Introduction

This article was recently updated (as of 2018-08-19) to better fit current realities ;-).

There are a lot of free Gentoo AMI images on Amazon Marketplace, but all of them are either outdated or come from untrusted sources. The lack of official guidelines on how to install Gentoo on an EC2 instance, and of AMI images supported by the Gentoo community, creates an additional barrier for people who would like to try Gentoo on an EC2 server.

The goal of this article is to explain, step by step, how to create a minimal bootable Gentoo AMI image with all the kernel modules needed for spawning new instances.

AWS has two virtualization types available for the Linux platform: PV (paravirtual) and HVM. These days PV is used on old instance types only and is not available for new instance types, so this article will cover only HVM-based installation.

This article also uses the default Gentoo RC system, OpenRC. This manual won’t work for systemd without changes: kernel configuration fixes, genkernel-next instead of genkernel, all services referenced as systemd units, and the ec2-init script installed differently.

This article is targeting people who are already familiar with Gentoo Linux. Please follow the official Gentoo Handbook in case of any questions.

NOTE: There is also a script available that performs all steps from this article in an automated way: https://github.com/sormy/gentoo-ami-builder

Installation plan

There are a few different ways to get bootable Gentoo AMI.

The Way #1:

  • run any instance with linux os on first drive
  • install gentoo on second drive
  • reboot to gentoo on second drive
  • clone gentoo from second drive to first drive
  • reboot to gentoo on first drive
  • create image from first drive
  • terminate instance
  • NOTE: There is no way as far as I know to create an AMI image from non-root block device volume, so we have to move system from disk to disk to get AMI for Gentoo.

The Way #2:

  • prepare gentoo image locally
  • convert local image into format acceptable by AWS
  • import AMI image: https://aws.amazon.com/ec2/vm-import/

This article will cover the first way. This plan requires no local computing resources and everything could be done in the cloud.

Prepare instance

Spawn instance

Logon to AWS console and create instance:

STEP 1: CHOOSE AMI

Choose “Amazon Linux 2 AMI (HVM), SSD Volume Type”

NOTE: Amazon Linux ships a copy of its kernel configuration that can be used to build a Gentoo kernel identical to the Amazon Linux kernel. A kernel supported by Amazon obviously supports all AWS instance types and their devices.

STEP 2: CHOOSE INSTANCE TYPE

It is highly recommended to use compute optimized instance with 8 cores. Good options are c5.2xlarge and t2.2xlarge (but with unlimited cpu credits otherwise t2 instance could be throttled and everything will be slow in that case). The process should not take more than 1 hour so it is not expensive at all to take a powerful instance for these needs.

STEP 3: CONFIGURE INSTANCE

This screen could be skipped with one exception. Sometimes it is useful to explicitly set the availability zone so that an already existing volume in the same zone can be attached to the new instance. If a mistake is made during the procedure, it can be easier to reattach the volume to a new instance than to restart everything from scratch.

STEP 4: ADD STORAGE

Use 20GiB for Root (target) and 20GiB for /dev/sdb (temporary).

STEP 5: ADD TAGS

Could be skipped.

STEP 6: CONFIGURE SECURITY GROUP

Let Amazon create the default security group, which allows SSH connections.

STEP 7: REVIEW

Create or import SSH public key and confirm. This key should be used to connect over SSH from local computer to EC2 instance.

Connect to instance

Grab IP address for running instance from AWS console, 1.2.3.4, for example.

Amazon Linux has “ec2-user” default user name.

From Linux/macOS terminal: ssh ec2-user@1.2.3.4

It could be required to remove the old server SSH host key from ~/.ssh/known_hosts if the same address was used previously and the current host key is not the same as the one cached on the local computer.

On Windows: download PuTTY, install and use IP address and “ec2-user” username to connect to SSH server.

Run screen after logging into instance to save session with running commands even if for some reason connection will be lost. Run screen -r to restore the session.

On Amazon Linux we need to run sudo bash to get superuser privileges.

Prepare root on disk 2

Set correct time

Having wrong time will cause all kinds of troubles.

Install ntpd and sync time from server:

yum -y -q install ntp
ntpd -gq

Prepare disk

All operations will be performed on second disk on initial phase because Amazon Linux is running from first disk and first disk is busy.

Design partition scheme

Please note, C5-like instances have NVMe and their device naming convention is /dev/nvmeXnY instead of /dev/xvdX
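
The naming difference also affects partition names: partitions on NVMe disks get a "p" before the partition number (/dev/nvme1n1p1), while Xen disks do not (/dev/xvdb1). Below is a minimal sketch of a helper that hides this difference. The function name part_dev is made up for this example and is not part of any tool used in this article.

```shell
# Hypothetical helper: build a partition device path from a disk path and a
# partition number. Disk names ending in a digit (NVMe: /dev/nvme1n1) take a
# "p" separator before the partition number, others (Xen: /dev/xvdb) do not.
part_dev() {
    disk="$1"
    num="$2"
    case "$disk" in
        *[0-9]) printf '%s\n' "${disk}p${num}" ;;
        *)      printf '%s\n' "${disk}${num}" ;;
    esac
}

part_dev /dev/xvdb 1      # prints /dev/xvdb1
part_dev /dev/nvme1n1 2   # prints /dev/nvme1n1p2
```

With such a helper the commands below could be written once for both device families, for example mkfs.ext4 "$(part_dev /dev/nvme1n1 1)".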

List all available disks:

lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  20G  0 disk
└─xvda1 202:1    0  20G  0 part /
xvdb    202:16   0  20G  0 disk

20GB is usually enough for basic Gentoo installation. 18GB for root partition and 2GB for swap (in case if instance type has less than 2GB of RAM).

Create partitions

cfdisk /dev/xvdb

This console tool has a nice and simple interface:

  • create new partition for root: primary, bootable, 18G
  • create new partition for swap: primary, 2G
  • write
  • quit
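
If an unattended run is preferred over the interactive tool, the same layout could probably be described as an sfdisk script. This is only a sketch: the function name partition_script is made up, and the layout assumes the 20GiB second disk used in this article (18G for root, the rest for swap).

```shell
# Print an sfdisk script for a DOS (MBR) partition table:
#   partition 1: 18 GiB, type Linux, bootable
#   partition 2: rest of the disk, type swap
partition_script() {
    printf '%s\n' \
        'label: dos' \
        ',18G,L,*' \
        ',,S'
}

# Review the script first, nothing is written to disk by this call.
partition_script

# To apply it to the second disk (destructive, run as root):
#   partition_script | sfdisk /dev/xvdb
```

Printing the script separately from applying it makes it easy to double check the target device before anything is overwritten.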

Create file system

mkfs.ext4 /dev/xvdb1
e2label /dev/xvdb1 temp-rootfs
mkswap /dev/xvdb2
swaplabel /dev/xvdb2 -L swap

Mount partitions

mkdir -p /mnt/gentoo
mount /dev/xvdb1 /mnt/gentoo
swapon /dev/xvdb2

Stage3 and Portage snapshot

Look at http://distfiles.gentoo.org/releases/ for a fresh stage3 file and paste its URL into the terminal after “wget” to download the latest version.

Download files

cd /mnt/gentoo
wget http://distfiles.gentoo.org/releases/amd64/autobuilds/current-stage3-amd64/stage3-amd64-YYYYMMDDTHHMMSSZ.tar.xz
wget http://distfiles.gentoo.org/releases/snapshots/current/portage-latest.tar.xz
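
The stage3 file name changes with every build, so the YYYYMMDDTHHMMSSZ part has to be looked up each time. A possible way to automate the lookup is sketched below. It assumes that the autobuilds directory publishes an index file (latest-stage3-amd64.txt is an assumption here, as is its layout of one "path size" line per tarball), and the helper name stage3_path_from_index is made up.

```shell
# Hypothetical helper: read an autobuilds index on stdin and print the
# relative path of the first stage3 tarball it lists. Comment lines and any
# other lines that do not look like "<dir>/stage3-....tar.xz" are ignored.
stage3_path_from_index() {
    grep -E '^[^#[:space:]]+/stage3-[^[:space:]]+\.tar\.(xz|bz2)' \
        | head -n 1 \
        | cut -d ' ' -f 1
}

# Demo with made-up sample data (placeholder timestamp, not a real build):
printf '%s\n' \
    '# sample index' \
    '20180101T000000Z/stage3-amd64-20180101T000000Z.tar.xz 123456789' \
    | stage3_path_from_index
# prints 20180101T000000Z/stage3-amd64-20180101T000000Z.tar.xz

# Possible use on the instance (index file name is an assumption):
#   base=http://distfiles.gentoo.org/releases/amd64/autobuilds
#   path=$(wget -q -O - "$base/latest-stage3-amd64.txt" | stage3_path_from_index)
#   wget "$base/$path"
```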

Unpack files

tar xvpf stage3-*.tar.xz --xattrs-include='*.*' --numeric-owner
tar xvf portage-latest.tar.xz -C /mnt/gentoo/usr --xattrs-include='*.*' --numeric-owner

Source tarballs could be removed after unpacking:

rm stage3-*.tar.xz portage-latest.tar.xz

Copy Amazon’s kernel configs

This kernel config will be used as a default for Gentoo’s kernel to save time on manual kernel configuration.

mkdir -p /mnt/gentoo/etc/kernels
cp -fv /boot/config-* /mnt/gentoo/etc/kernels

Build root on disk 2

Change root

Copy DNS settings from active environment to new one (otherwise nothing could be downloaded after chroot due to unknown dns server):

cp /etc/resolv.conf /mnt/gentoo/etc/

Mount proc/dev:

mount -t proc none /mnt/gentoo/proc
mount -o bind /sys /mnt/gentoo/sys
mount -o bind /dev /mnt/gentoo/dev
mount -o bind /dev/pts /mnt/gentoo/dev/pts

Chroot:

chroot /mnt/gentoo /bin/bash
env-update
source /etc/profile

Configure

Compiler

Options:

  • Use -mtune=generic to get a system which will work on any AWS equipment and any instance type.
  • Use -jN, where N is the number of CPUs plus 1, to compile everything with more threads.
  • Use a custom USE value to describe which features you will use.
  • Use USE="-bindist" to recompile from source code some packages that are provided as binaries.
  • Use GRUB_PLATFORMS="pc" to disable the default EFI platform and speed up compilation.

nano /etc/portage/make.conf

Example:

# leave values the same if they are not referenced below
CFLAGS="-O2 -pipe -mtune=generic"
CXXFLAGS="${CFLAGS}"
USE="-bindist"
MAKEOPTS="-j2"
GRUB_PLATFORMS="pc"
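
The -j value in the example is just a placeholder. The "number of CPUs plus 1" rule from the options list could be computed instead of typed by hand. A small sketch, with the function name makeopts_for made up for this example:

```shell
# Compute a MAKEOPTS value as "number of CPUs plus one".
makeopts_for() {
    cpus="$1"
    printf '%s\n' "-j$((cpus + 1))"
}

makeopts_for 8    # prints -j9 (8 vCPUs, as on c5.2xlarge)

# On the build host the CPU count could come from nproc:
#   makeopts_for "$(nproc)"
```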

OpenRC

Edit rc.conf:

nano /etc/rc.conf

Change values:

rc_logger="YES"
unicode="YES"
rc_sys=""

Locale

Edit locale.gen:

nano /etc/locale.gen

Uncomment this one:

en_US.UTF-8 UTF-8

Generate locales:

locale-gen

Use eselect locale list and eselect locale set to set default locale to UTF-8:

# eselect locale list
Available targets for the LANG variable:
  [1]   C
  [2]   en_US.utf8
  [3]   POSIX
  [ ]   (free form)
# eselect locale set 2
Setting LANG to en_US.utf8 ...
Run ". /etc/profile" to update the variable in your shell.

Keymaps

There is no real keyboard, so this service is not needed:

rc-update delete keymaps boot

Timezone

Here “US/Eastern” is used as an example.

ln -sf /usr/share/zoneinfo/US/Eastern /etc/localtime
echo "US/Eastern" > /etc/timezone

Hostname

Only if it is planned to use this instance. If new instance will be spawned from the image then hostname will be replaced with default one.

Edit hostname:

nano /etc/conf.d/hostname

Set instance hostname:

hostname="artembutusov.com"

Please keep in mind that the EC2 init script described later in this article will replace the hostname with the default value during bootstrap.

Network

Create new interface and add it to auto start:

ln -s /etc/init.d/net.lo /etc/init.d/net.eth0
rc-update add net.eth0 default

DHCP will be used by default.

Edit hosts:

nano /etc/hosts

Only if it is planned to use this instance. If new instance will be spawned from the image then hostname will be replaced with default one.

Add hostname aliases and FQDN:

127.0.0.1       artembutusov.com artembutusov localhost

fstab

Edit fstab:

nano /etc/fstab

Comment all active lines and add new one:

LABEL="temp-rootfs"     /               ext4            noatime         0 1
LABEL="swap"            none            swap            sw              0 0

Install kernel

Install kernel sources

Install kernel sources:

emerge sys-kernel/gentoo-sources -av

Install genkernel

Install genkernel:

# usually util-linux need to be recompiled with static libs to resolve circular dependencies
echo "sys-apps/util-linux static-libs" > /etc/portage/package.use/genkernel

emerge sys-kernel/genkernel -av

Edit genkernel:

nano /etc/genkernel.conf

It is recommended to tune genkernel options, first related to compile threads and second related to genkernel verbosity level:

MAKEOPTS="-j2"
LOGLEVEL=2

By the way, these options could be also explicitly passed to genkernel as command line arguments: --loglevel=2 --makeopts=-jX

Build kernel

It is recommended to use Amazon’s kernel configuration as starting point to save time on configuration. Kernel configuration could be updated any time later to include or exclude additional options.

List available Amazon’s kernel config:

ls /etc/kernels/*amzn*

Some modules need to be compiled into kernel to be properly loaded by Gentoo, so some minor fixes needed for default kernel configuration provided by Amazon.

List of fixes:

  • Compile XEN BLKDEV into kernel otherwise Gentoo won’t be able to find root block device after boot.
  • Compile NVMe support into kernel otherwise Gentoo won’t be able to find root block device after boot on modern C5-like instances.
  • Enable enhanced networking for C4-like instances (IXGBEVF network module).
  • There is also the ENA network module needed for C5-like instances, but it is not included with kernel sources and needs to be compiled separately.

sed -i \
    -e '/CONFIG_XEN_BLKDEV_FRONTEND/c\CONFIG_XEN_BLKDEV_FRONTEND=y' \
    -e '/CONFIG_NVME_CORE/c\CONFIG_NVME_CORE=y' \
    -e '/CONFIG_BLK_DEV_NVME/c\CONFIG_BLK_DEV_NVME=y' \
    -e '/CONFIG_IXGBEVF/c\CONFIG_IXGBEVF=y' \
    "/etc/kernels/config-4.1.10-17.31.amzn1.x86_64"

Build kernel and ramdisk in unattended way:

genkernel all --loglevel=2 --makeopts=-jX --kernel-config=/etc/kernels/config-4.1.10-17.31.amzn1.x86_64
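
One thing to keep in mind about the sed fixes above: the c\ command only rewrites lines that already mention the option, so an option that is completely missing from the config file is silently left out. A more defensive variant is sketched below. The helper name set_kconfig is made up, and it matches the option name exactly (so CONFIG_IXGBEVF does not touch a longer name that merely starts with it), which is a small behaviour change compared to the sed patterns above.

```shell
# Hypothetical helper: set NAME=VALUE in a kernel config file.
# Replaces an existing "NAME=..." or "# NAME is not set" line,
# or appends the setting when the option is not mentioned at all.
set_kconfig() {
    file="$1"
    name="$2"
    value="$3"
    if grep -Eq "^(# )?${name}[= ]" "$file"; then
        sed -E "s/^(# )?${name}[= ].*/${name}=${value}/" "$file" > "$file.tmp" \
            && cat "$file.tmp" > "$file" \
            && rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}

# Possible use with the config file from this article:
#   cfg=/etc/kernels/config-4.1.10-17.31.amzn1.x86_64
#   set_kconfig "$cfg" CONFIG_XEN_BLKDEV_FRONTEND y
#   set_kconfig "$cfg" CONFIG_BLK_DEV_NVME y
```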

Install ENA kernel module

C5-like instances use the ENA driver for networking. ENA support needs to be added to the system if there are plans to launch C5 instances from the target AMI image.

The ENA kernel module is not available in the official portage snapshot, so a local overlay needs to be created:

mkdir -p "/etc/portage/repos.conf"

cat > "/etc/portage/repos.conf/local.conf" << END
[local]
location = /usr/local/portage
masters = gentoo
auto-sync = no
END

mkdir -p "/usr/local/portage"

mkdir -p "/usr/local/portage/metadata"

cat > "/usr/local/portage/metadata/layout.conf" << END
repo-name = local
masters = gentoo
thin-manifests = true
END

Add new ebuild into local overlay:

mkdir -p "/usr/local/portage/net-misc/ena"

# download ENA ebuild and manifest to local repo
curl \
        -o "/usr/local/portage/net-misc/ena/ena-1.5.3.ebuild" \
        "https://raw.githubusercontent.com/sormy/gentoo/master/net-misc/ena/ena-1.5.3.ebuild" \
        -o "/usr/local/portage/net-misc/ena/Manifest" \
        "https://raw.githubusercontent.com/sormy/gentoo/master/net-misc/ena/Manifest"

Install ENA kernel module:

emerge net-misc/ena -va

Enable auto load for ENA kernel module during boot

cat >> /etc/conf.d/modules << END
modules="ena"
END

Install bootloader

Install grub:

emerge sys-boot/grub -va

Configure grub defaults:

cat >> /etc/default/grub << END
GRUB_DEFAULT=0
GRUB_TIMEOUT=0
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 console=tty0 console=ttyS0,115200n8"
END

Notes:

  • net.ifnames=0 disables predictable network interface names, so the interface keeps the simple name eth0 used in this article
  • biosdevname=0 does the same for the biosdevname naming scheme, if the biosdevname utility is installed
  • GRUB_DEFAULT will boot first available kernel by default
  • GRUB_TIMEOUT will skip timeout (no keyboard anyway to make a choice)
  • all other options are needed to properly enable serial console that could be used to grab console output from AWS console (for troubleshooting or debugging issues).

Install grub on second disk:

grub-install /dev/xvdb

Install grub config on second disk:

grub-mkconfig -o /boot/grub/grub.cfg

Enable serial console support after boot in inittab:

sed -i -e 's/^#\(.* ttyS0 .*$\)/\1/' /etc/inittab

Configure network

Create the device link, unless it was already created in the Network step above:

ln -s /etc/init.d/net.lo /etc/init.d/net.eth0

Enable network service:

rc-update add net.eth0 default

Configure SSH

Remove password for root and lock password (SSH public key authentication will be used):

passwd -d -l root

Enable SSH service:

rc-update add sshd default

Install cloud init

AMI image should have a service that could be used to bootstrap basic configuration when new instance is spawned. Usually it is just hostname and ssh key to remotely log in.

There is a cloud-init package that is designed to initialize the instance during boot but this package is too big and is pulling a lot of other dependencies. SSH keys and hostname could be easier bootstrapped with custom made init script below:

touch /etc/init.d/amazon-ec2-init
chmod +x /etc/init.d/amazon-ec2-init
rc-update add amazon-ec2-init boot
nano /etc/init.d/amazon-ec2-init

Content:

#!/sbin/openrc-run

depend() {
    before hostname
    need net.eth0
}

start() {
    local lock="/var/lib/amazon-ec2-init.lock"
    local instance_id="$(wget -t 2 -T 5 -q -O - http://169.254.169.254/latest/meta-data/instance-id)"

    # already provisioned for this instance id, nothing to do
    [ -f "$lock" ] && [ "$(cat "$lock")" = "$instance_id" ] && return 0

    einfo "Provisioning instance..."

    eindent
    provision_hostname
    provision_ssh_authorized_keys
    eoutdent

    echo "$instance_id" > "$lock"
}

provision_hostname() {
    ebegin "Setting hostname"
    local hostname="$(wget -t 2 -T 5 -q -O - http://169.254.169.254/latest/meta-data/local-hostname)"
    echo "hostname=${hostname}" > /etc/conf.d/hostname
    eend $?
}

provision_ssh_authorized_keys() {
    ebegin "Importing SSH authorized keys"

    [ -e /root/.ssh ] && rm -rf /root/.ssh
    mkdir -p /root/.ssh
    chown root:root /root/.ssh
    chmod 750 /root/.ssh

    local keys=$(wget -t 2 -T 5 -q -O - http://169.254.169.254/latest/meta-data/public-keys/ \
        | cut -d = -f 1 \
        | xargs printf "http://169.254.169.254/latest/meta-data/public-keys/%s/openssh-key\n")

    if [ -n "${keys}" ]; then
        wget -t 2 -T 5 -q -O - ${keys} > /root/.ssh/authorized_keys
        chown root:root /root/.ssh/authorized_keys
        chmod 640 /root/.ssh/authorized_keys
    fi

    eend $?
}

This init script will fetch EC2 instance metadata and, if the instance id is different from the previous launch, will set the hostname and inject public SSH keys.
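
The two less obvious parts of the script are the lock check and the way the key URLs are built from the public-keys listing (one "index=name" line per key). Both are sketched below as standalone functions that can be tried without access to the metadata service. The function names are made up, and key_urls uses a read loop instead of the xargs call from the script, so an empty listing produces no URL at all.

```shell
# Return success (0) when provisioning should run: there is no lock file yet,
# or the lock holds a different instance id (the image was launched as a new
# instance).
needs_provisioning() {
    lock="$1"
    instance_id="$2"
    [ ! -f "$lock" ] || [ "$(cat "$lock")" != "$instance_id" ]
}

# Read the public-keys listing ("0=key-name" per line) on stdin and print one
# openssh-key URL per listed key.
key_urls() {
    base="http://169.254.169.254/latest/meta-data/public-keys"
    cut -d = -f 1 | while read -r index; do
        if [ -n "$index" ]; then
            printf '%s/%s/openssh-key\n' "$base" "$index"
        fi
    done
}

# Demo with a made-up listing entry:
printf '0=my-key\n' | key_urls
# prints http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key
```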

Switch root

Now it is time to fix the bootloader configuration to boot Gentoo Linux from the second drive instead of Amazon Linux from the first drive. This will also make it possible to migrate Gentoo Linux from the second drive to the first drive, because the first drive will no longer be in use.

Exit from chroot

Exit from chroot env:

exit

Copy kernel

Copy Gentoo kernel/ramdisk to first volume:

cp -v /mnt/gentoo/boot/*genkernel* /boot/
‘/mnt/gentoo/boot/initramfs-genkernel-x86_64-3.17.7-hardened-r1’ -> ‘/boot/initramfs-genkernel-x86_64-3.17.7-hardened-r1’
‘/mnt/gentoo/boot/kernel-genkernel-x86_64-3.17.7-hardened-r1’ -> ‘/boot/kernel-genkernel-x86_64-3.17.7-hardened-r1’
‘/mnt/gentoo/boot/System.map-genkernel-x86_64-3.17.7-hardened-r1’ -> ‘/boot/System.map-genkernel-x86_64-3.17.7-hardened-r1’

Configure bootloader

Now the system is ready to boot into the freshly built Gentoo located on the second volume. To get that we will modify the Amazon Linux bootloader config.

Edit Amazon Linux grub config:

nano /boot/grub/grub.cfg

Edit first Amazon Linux entry to use Gentoo kernel, Gentoo ramdisk and Gentoo kernel options and Gentoo root device:

menuentry ... {
    ...
    linux /boot/kernel-genkernel-x86_64-3.17.7-hardened-r1 root=LABEL=temp-rootfs net.ifnames=0 biosdevname=0 console=tty0 console=ttyS0,115200n8
    initrd /boot/initramfs-genkernel-x86_64-3.17.7-hardened-r1
}

Reboot

reboot

Gentoo should boot from second volume using bootloader on first volume.

Log into the host again, this time as user “root”. Please note, the SSH host key won’t match the one previously used by Amazon Linux, so the stale record needs to be cleared from the local ~/.ssh/known_hosts.

There are 3 things that could go wrong here most likely:

  • Unable to connect to instance over SSH because network is down, because driver is not loaded or not compiled into kernel.
  • Unable to connect to instance because Gentoo can’t mount root block device due to wrong block device name provided in bootloader configuration or due to missing driver for root block device (not loaded or not compiled into kernel)
  • Mistake in bootloader configuration, grub can’t find ramdisk or kernel.

If something went wrong here, the instance could be terminated and recreated, and the second drive with Gentoo could be reattached to the new instance to continue installation instead of restarting everything from scratch:

sudo bash
mkdir /mnt/gentoo
mount /dev/xvdb1 /mnt/gentoo

Migrate root from disk 2 to disk 1

Prepare disk

mkfs.ext4 /dev/xvda1
e2label /dev/xvda1 cloudimg-rootfs
mkdir -p /mnt/gentoo
mount /dev/xvda1 /mnt/gentoo

Migrate files

Migrate files from second drive (now mounted on /) to first drive (now mounted on /mnt/gentoo):

cd /mnt/gentoo

# create auto generated directories
for i in home root media mnt opt proc sys dev tmp run; do
    mkdir $i
    touch $i/.keep
done

# fix permissions
chmod 700 root
chmod 1777 tmp

# copy everything with exception to autogenerated directories
rsync --archive --xattrs --quiet \
    --exclude='/home' --exclude='/root' --exclude='/media' --exclude='/mnt' \
    --exclude='/opt' --exclude='/proc' --exclude='/sys' --exclude='/dev' \
    --exclude='/tmp' --exclude='/run' --exclude='/lost+found' \
    / /mnt/gentoo/

# clear ec2 init state if available
rm -f var/lib/amazon-ec2-init.lock
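
The directory list appears twice above, once for mkdir and once as rsync excludes, and the two have to stay in sync. A possible way to keep a single list is sketched below. The variable and function names are made up for this example.

```shell
# Directories that are recreated empty on the target instead of being copied.
SKIP_DIRS="home root media mnt opt proc sys dev tmp run"

# Print one rsync --exclude option per skipped directory, plus lost+found.
rsync_excludes() {
    for d in $SKIP_DIRS lost+found; do
        printf '%s\n' "--exclude=/$d"
    done
}

# Show the generated options, nothing is copied by this call.
rsync_excludes

# Possible use (the unquoted substitution is split into separate options):
#   rsync --archive --xattrs --quiet $(rsync_excludes) / /mnt/gentoo/
```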

Configure bootloader

Mount proc/sys/dev/pts:

mount -t proc none /mnt/gentoo/proc
mount -o bind /sys /mnt/gentoo/sys
mount -o bind /dev /mnt/gentoo/dev
mount -o bind /dev/pts /mnt/gentoo/dev/pts

Another chroot is needed after copying the OS to reinstall bootloader:

chroot /mnt/gentoo /bin/bash
env-update
source /etc/profile

Install bootloader:

grub-install /dev/xvda

Regenerate grub config file:

grub-mkconfig -o /boot/grub/grub.cfg

Fix fstab: change LABEL="temp-rootfs" (second volume) to LABEL="cloudimg-rootfs" (first volume):

nano /etc/fstab
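
The same label change could also be scripted instead of edited by hand. A minimal sketch, assuming the fstab line starts with the LABEL field as in the example earlier in this article; the function name relabel_fstab is made up.

```shell
# Hypothetical helper: replace the LABEL of an fstab entry.
# Only lines that start with LABEL=<old> (quoted or not) are changed.
relabel_fstab() {
    file="$1"
    old="$2"
    new="$3"
    sed -E "s/^LABEL=\"?${old}\"?([[:space:]])/LABEL=\"${new}\"\\1/" "$file" > "$file.tmp" \
        && cat "$file.tmp" > "$file" \
        && rm -f "$file.tmp"
}

# Possible use inside the chroot:
#   relabel_fstab /etc/fstab temp-rootfs cloudimg-rootfs
```

It is worth checking the result with cat /etc/fstab before rebooting, because a wrong root label here means the instance will not come back.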

Reboot

This time instance should boot the same Gentoo but from first drive using bootloader from first drive.

reboot

Finalize installation

Fix hostname

Only if it is planned to use this instance. If new instance will be spawned from the image then hostname will be replaced with default one.

nano /etc/conf.d/hostname
hostname="gentoo"

Fix hosts

Only if it is planned to use this instance. If new instance will be spawned from the image then hostname will be replaced with default one.

nano /etc/hosts
127.0.0.1       gentoo.local gentoo localhost

Cleanup

Remove stage3 and snapshot files from / if they are still here:

rm -v /stage3-* /portage-*

Detach and remove the second volume in the AWS console, because it is not needed anymore.

Rebuild world

It is highly recommended to make sure that the whole system state is synchronized and all packages are built with the right USE flags:

emerge --update --newuse --deep world --with-bdeps=y
revdep-rebuild

Install 3rd party tools

Run your own commands to fill the default image with some preinstalled utilities.

emerge eix && eix-update
emerge gentoolkit
emerge app-misc/mc
emerge syslog-ng
emerge logrotate
...

Force filesystem check and fix on reboot

Optional but could be useful:

touch /forcefsck

Create AMI image

Now go to AWS console, Instances view, choose instance, open menu and choose “Create Image”. Change settings if needed (remove swap, for example or change root block device default size) and initiate image creation.

The instance will go down. It could take from 9 to 15 mins to get AMI image creation process completed.

Once the image is created, the instance could be terminated. Technically, the instance could even be terminated before image creation is completed.

Conclusion

As a result, there should be an AMI image available in the “AMIs” section and also a snapshot associated with the root block device used in the AMI in the “Snapshots” section.

Try to spawn a new instance from the created AMI image to test how this image works.

9 Responses to Gentoo on AWS

  1. Hi Artem,

    Great blog about Gentoo on AWS.

    I made a mistake in my current Gentoo EC2 instance.
    I used -march=native in CFLAGS and now i can not start another instance from latest snapshot.

    Do i need to recompile whole system with CFLAGS="-O2 -pipe" or only some important packages?
    How can i find necessary packages for the recompilation.

    Thank you.

    • You will need to recompile world with new compile options: emerge world --emptytree
      Update: You are right, you could recompile only packages with wrong GCC options if you know what are the packages. You could try to identify that list in /var/log/portage/elog/summary.log. Otherwise the only 100% way is to recompile world with --emptytree

  2. One gotcha for me was:

    mount -o bind /dev /mnt/gentoo/dev

    Should be:

    mount --rbind /dev /mnt/gentoo/dev

    After that, everything went swimmingly well.

    Thanks for making this guide!

  3. Worked like a charm! was able to create an AMI and launch a new instance no problem. Only two things I had to do differently was that the /etc/init.d/amazon-ec2 file didn’t exist but since you had the entire file was able to recreate and i had to do grub-install instead of grub2-install. Thanks!!!

  4. Great blog. One thing I noticed is that in my experience it requires more than 13G to genkernel. Probably you want to allocate more disk space for /dev/sdb to avoid the interrupt.

    BTW, c4.large is way more faster than t2.micro to get the kernel generated like 2 hours vs 1 day.

    • t* instances has burstable performance, if you run out of cpu credits then compilation could take 1 day 🙂 Based on my experience t2.micro could compile kernel in 1 hour if you have enough cpu credits.

  5. I can’t get past “ClientError: Unsupported kernel version 4.xx.yy”. Amazon Linux uses 4.9.38 but since that’s not available from either vanilla-sources or gentoo-sources I just pulled down the tarball from kernel.org and configured it per the instructions here. I don’t know if maybe this is the error that’s thrown if there’s something about the kernel config that’s unworkable or if it’s maybe comparing checksums against stock kernels that come with specific distro versions. Any ideas?
