
title: Setting Up Arch Linux
date: 2021-02-12T00:00:00

This note collects all the commands I ran while setting up Arch Linux on my new server.

PSA: I published archpkgs, a toolchain for building AUR packages in a clean-room Docker container: https://github.com/uetchy/archpkgs

Setup

Wipe a device

wipefs -a /dev/sda # erase file-system signatures (fast)

# or

shred -v -n 1 --random-source=/dev/urandom /dev/sda # write pseudo-random data to the device (takes longer)

Create partitions

parted
select /dev/sda
mktable gpt
mkpart EFI fat32 0 512MB # EFI (Extensible Firmware Interface)
mkpart Arch ext4 512MB 100% # Arch
set 1 esp on # flag partition 1 as ESP (EFI System Partition)
quit

NOTE: Since the server has 128GB of physical memory, I would rather let the OOM killer do its job than create a swap partition. Should the need for swap come up later, create a swap file instead (no practical performance difference):

# create 8GB swap file
dd if=/dev/zero of=/swapfile bs=1M count=8000 status=progress
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile none swap defaults 0 0" >> /etc/fstab

Create file systems for the UEFI/GPT setup

mkfs.vfat -F 32 /dev/sda1 # EFI
mkfs.ext4 /dev/sda2 # Arch

Mount disks

mount /dev/sda2 /mnt
mkdir -p /mnt/boot
mount /dev/sda1 /mnt/boot

Install the base system

# choose between 'linux' and 'linux-lts'
pacstrap /mnt base linux-lts linux-firmware
arch-chroot /mnt

Configure pacman

# optimize mirrorlist
pacman -S reflector
reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose # optimize mirror list

# enable color output
sed '/Color/s/^#//' -i /etc/pacman.conf

Install essentials

pacman -S vim man-db man-pages git base-devel

Configure fstab

pacman -S xfsprogs # for XFS

genfstab -U /mnt >> /mnt/etc/fstab # generate fstab from everything mounted under /mnt (run from the live ISO, outside the chroot)

# extra entries appended to /etc/fstab:

# backup
UUID=<UUID> /mnt/backup ext4 defaults 0 2

# archive (do not prevent boot even if fsck fails)
UUID=<UUID> /mnt/archive ext4 defaults,nofail,x-systemd.device-timeout=4 0 2

# xfs drive
UUID=<UUID> /mnt/archive2 xfs defaults 0 0

You can find each <UUID> with lsblk -f or blkid.

findmnt --verify --verbose # verify fstab
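
To append an entry without copying UUIDs by hand, something like this works (a sketch; /dev/sdb1 and the mount point are placeholders):

UUID=$(blkid -s UUID -o value /dev/sdb1)
echo "UUID=${UUID} /mnt/backup ext4 defaults 0 2" >> /etc/fstab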

Install bootloader (GRUB 2)

pacman -S \
  grub \
  efibootmgr \
  amd-ucode # AMD microcode
grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB

vim /etc/default/grub
sed -E \
  -e 's/(GRUB_TIMEOUT=).+/\13/' \
  -e 's/(GRUB_DISABLE_SUBMENU=).+/\1y/' \
  -i /etc/default/grub
grub-mkconfig -o /boot/grub/grub.cfg
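
Confirm that the EFI boot entry was registered (the entry name comes from --bootloader-id):

efibootmgr -v | grep -i grub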

Configure locales

ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
hwclock --systohc
vim /etc/locale.gen # uncomment 'en_US.UTF-8 UTF-8'
locale-gen
echo "LANG=en_US.UTF-8" > /etc/locale.conf

Setup network

# hostnamectl needs D-Bus, so run these after the first boot (or write /etc/hostname directly while still in the chroot)
hostnamectl set-hostname takos
hostnamectl set-chassis server

# /etc/hosts
127.0.0.1 localhost
::1       localhost
127.0.0.1 takos
# /etc/systemd/network/20-wired.network (file name is an example)
[Match]
Name=enp5s0

[Network]
#DHCP=yes
Address=10.0.1.2/24
Gateway=10.0.1.1
# 10.0.1.100 is the self-hosted recursive resolver; 1.1.1.1 (Cloudflare) is the fallback
DNS=10.0.1.100
DNS=1.1.1.1
# route local DNS lookups to 10.0.1.100, which sits on a Docker MACVLAN network
MACVLAN=dns-shim
# /etc/systemd/network/dns-shim.netdev (file name is an example)
# shim interface so the host can reach the MACVLAN-backed resolver at 10.0.1.100
[NetDev]
Name=dns-shim
Kind=macvlan

[MACVLAN]
Mode=bridge
# /etc/systemd/network/dns-shim.network (file name is an example)
[Match]
Name=dns-shim

[Network]
IPForward=yes

[Address]
Address=10.0.1.103/32
Scope=link

[Route]
Destination=10.0.1.100/30

ip equivalent to the above config:

ip link add dns-shim link enp5s0 type macvlan mode bridge # add macvlan shim interface
ip a add 10.0.1.103/32 dev dns-shim # assign the interface an ip address
ip link set dns-shim up # enable the interface
ip route add 10.0.1.100/30 dev dns-shim # route macvlan subnet (.100 - .103) to the interface
systemctl enable --now systemd-networkd
networkctl status

# for self-hosted DNS resolver
sed -E 's/^DNS=.*/DNS=10.0.1.100 1.1.1.1/' -i /etc/systemd/resolved.conf
systemctl enable --now systemd-resolved
ln -rsf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
resolvectl status
resolvectl query ddg.gg
drill ddg.gg

If networkctl keeps showing enp5s0 as degraded, run ip addr add 10.0.1.2/24 dev enp5s0 to assign the static IP address manually as a workaround.


Leave chroot environment

exit # leave chroot
umount -R /mnt
reboot

sysctl

# reboot after 60s of kernel panic
echo "kernel.panic = 60" > /etc/sysctl.d/98-kernel-panic.conf

# set swappiness
echo "vm.swappiness = 10" > /etc/sysctl.d/99-swappiness.conf

NTP

timedatectl set-ntp true
timedatectl status

AUR

# run as a regular (non-root) user; makepkg refuses to run as root
git clone https://aur.archlinux.org/yay.git
cd yay
makepkg -si

Shell

pacman -S zsh
chsh -s /bin/zsh

# Install useful utils (totally optional)
yay -S pyenv exa antibody direnv fd ripgrep fzy peco ghq-bin hub neofetch tmux git-delta lazygit jq lostfiles ncdu htop rsync youtube-dl prettier age

Setup operator user (i.e., a user without root privileges)

passwd # change root password

useradd -m -s /bin/zsh <user> # add operator user
passwd <user> # change operator user password

userdbctl # show users
userdbctl group # show groups

pacman -S sudo # install sudo

# add operator user to sudo group
groupadd sudo
usermod -aG sudo <user>
echo "%sudo ALL=(ALL) NOPASSWD:/usr/bin/pacman" > /etc/sudoers.d/pacman # allow users in sudo group to run pacman without password (optional)
visudo -c # validate sudoers

SSH

pacman -S openssh
vim /etc/ssh/sshd_config
systemctl enable --now sshd
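
The exact sshd_config changes aren't recorded here; a typical hardening sketch (an assumption; apply it only after ssh-copy-id below has succeeded, or you will lock yourself out):

sed -E -i \
  -e 's/^#?PermitRootLogin .*/PermitRootLogin no/' \
  -e 's/^#?PasswordAuthentication .*/PasswordAuthentication no/' \
  /etc/ssh/sshd_config
sshd -t # validate the config
systemctl restart sshd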

on the client machine:

# brew install ssh-copy-id
ssh-copy-id <user>@<serverIp>

Make SSH forwarding work with tmux and sudo

# ~/.ssh/rc (or your shell's login rc): keep a stable symlink to the forwarded agent socket
if [ ! -S ~/.ssh/ssh_auth_sock ] && [ -S "$SSH_AUTH_SOCK" ]; then
  ln -sf $SSH_AUTH_SOCK ~/.ssh/ssh_auth_sock
fi

# ~/.tmux.conf
set -g update-environment -r
setenv -g SSH_AUTH_SOCK $HOME/.ssh/ssh_auth_sock

# sudoers (edit with visudo)
Defaults env_keep += SSH_AUTH_SOCK

See also: Happy ssh agent forwarding for tmux/screen · Reboot and Shine

MTA

mail

Needed for smartd.

pacman -S s-nail

sendmail

Needed for cfddns.

yay -S sendmail

S.M.A.R.T.

pacman -S smartmontools

Setup automated checkups

# /etc/smartd.conf
# scan all but removable devices and notify any test failures by mail
DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02|L/../../6/03) -m me@example.com

systemctl enable --now smartd

Manual testing

smartctl -t short /dev/sda
smartctl -l selftest /dev/sda

See: S.M.A.R.T. - ArchWiki

lm_sensors

pacman -S lm_sensors

sensors-detect

systemctl enable --now lm_sensors

See: lm_sensors - ArchWiki

NVIDIA drivers

pacman -S nvidia-lts # 'nvidia' for 'linux'
reboot
nvidia-smi # validate runtime

Docker

pacman -S docker docker-compose
yay -S nvidia-container-runtime
/etc/docker/daemon.json (strip the // comments; JSON itself does not allow them):

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m", // default: -1 (unlimited)
    "max-file": "3" // default: 1
  },
  "runtimes": {
    // for docker-compose
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
systemctl enable --now docker

usermod -aG docker <user>

# to create mandatory device files on /dev
docker run --gpus all nvidia/cuda:10.2-cudnn7-runtime nvidia-smi

GPU_OPTS=(--gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools)
docker run --rm -it ${GPU_OPTS} nvidia/cuda:10.2-cudnn7-runtime nvidia-smi
docker run --rm -it ${GPU_OPTS} tensorflow/tensorflow:1.14.0-gpu-py3 bash

Use journald log driver in Docker Compose

services:
  web:
    logging:
      driver: "journald"
      options:
        tag: "{{.ImageName}}/{{.Name}}/{{.ID}}" # default: "{{.ID}}"

Backup (restic)

yay -S restic
# /etc/restic/restic.service
[Unit]
Description=Restic Backup Service

[Service]
Nice=19
IOSchedulingClass=idle
KillSignal=SIGINT
ExecStart=/etc/restic/run.sh
# /etc/restic/restic.timer
[Unit]
Description=Restic Backup Timer

[Timer]
OnCalendar=*-*-* 0,6,12,18:0:0
RandomizedDelaySec=15min
Persistent=true

[Install]
WantedBy=timers.target
# /etc/restic/config.sh
export RESTIC_REPOSITORY=/path/to/backup
export RESTIC_PASSWORD=<passphrase>
export RESTIC_PROGRESS_FPS=1
#!/bin/bash -ue
# /etc/restic/run.sh
# https://restic.readthedocs.io/en/latest/040_backup.html#

source /etc/restic/config.sh

date

# system
restic backup --tag system -v \
  --one-file-system \
  --exclude .cache \
  --exclude .vscode-server \
  --exclude .vscode-server-insiders \
  --exclude TabNine \
  --exclude /var/lib/docker/overlay2 \
  / /boot

# data (the appdata_* excludes are Nextcloud caches)
restic backup --tag data -v \
  --exclude 'appdata_*/preview' \
  --exclude 'appdata_*/dav-photocache' \
  /mnt/data

# prune
restic forget --prune --group-by tags \
  --keep-last 4 \
  --keep-within-daily 7d \
  --keep-within-weekly 1m \
  --keep-within-monthly 3m

# verify
restic check
#!/bin/bash
# /etc/restic/show.sh
# usage: show.sh <file|directory>
# https://restic.readthedocs.io/en/latest/050_restore.html

source /etc/restic/config.sh

TARGET=${1:-$(pwd)}
MODE="ls -l"
if [[ -f $TARGET ]]; then
  TARGET=$(realpath ${TARGET})
  MODE=dump
fi

TAG=$(restic snapshots --json | jq -r '[.[].tags[0]] | unique | .[]' | fzy)
ID=$(restic snapshots --tag $TAG --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')

>&2 echo "Command: restic ${MODE} ${ID} ${TARGET}"

restic $MODE $ID ${TARGET}
#!/bin/bash
# restore script (e.g. /etc/restic/restore.sh)
# https://restic.readthedocs.io/en/latest/050_restore.html

source /etc/restic/config.sh

TARGET=${1:?Specify TARGET}
TARGET=$(realpath ${TARGET})

TAG=$(restic snapshots --json | jq -r '[.[].tags[0]] | unique | .[]' | fzy)
ID=$(restic snapshots --tag $TAG --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')

>&2 echo "Command: restic restore ${ID} -i ${TARGET} -t /"

read -p "Press enter to continue"

restic restore $ID -i ${TARGET} -t /
chmod 700 /etc/restic/config.sh
ln -sf /etc/restic/restic.{service,timer} /etc/systemd/system/
systemctl enable --now restic.timer
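
Check that the timer is scheduled and inspect the most recent run:

systemctl list-timers 'restic*'
journalctl -u restic.service -e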

VPN (WireGuard)

pacman -S wireguard-tools

# gen private key
(umask 0077; wg genkey > server.key)

# gen public key
wg pubkey < server.key > server.pub

# gen preshared key for each client
(umask 0077; wg genpsk > secret1.psk)
(umask 0077; wg genpsk > secret2.psk)
...
# /etc/wireguard/wg0.conf
[Interface]
Address = 10.0.10.1/24
ListenPort = 12345
PrivateKey = <content of server.key>

PostUp   = iptables -A FORWARD -i %i -j ACCEPT; iptables -t nat -A POSTROUTING -o dns-shim -d 10.0.1.100/32 -j MASQUERADE; iptables -t nat -A POSTROUTING -o enp5s0 ! -d 10.0.1.100/32 -j MASQUERADE
PostDown = iptables -D FORWARD -i %i -j ACCEPT; iptables -t nat -D POSTROUTING -o dns-shim -d 10.0.1.100/32 -j MASQUERADE; iptables -t nat -D POSTROUTING -o enp5s0 ! -d 10.0.1.100/32 -j MASQUERADE

[Peer]
PublicKey = <public key>
PresharedKey = <content of secret1.psk>
AllowedIPs = 10.0.10.2/32

[Peer]
PublicKey = <public key>
PresharedKey = <content of secret2.psk>
AllowedIPs = 10.0.10.3/32

sysctl -w net.ipv4.ip_forward=1 # persist this via /etc/sysctl.d/ so it survives reboots
ufw allow 12345/udp # if ufw is running
systemctl enable --now wg-quick@wg0
wg show # show active settings
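
For reference, a matching client-side config might look like this (a sketch; keys, the endpoint, and AllowedIPs are placeholders to adapt):

cat > /etc/wireguard/wg0.conf <<'EOF'
[Interface]
Address = 10.0.10.2/32
PrivateKey = <client private key>
DNS = 10.0.1.100

[Peer]
PublicKey = <content of server.pub>
PresharedKey = <content of secret1.psk>
Endpoint = <server address>:12345
AllowedIPs = 10.0.1.0/24, 10.0.10.0/24
EOF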

Fail2ban

pacman -S fail2ban

# /etc/fail2ban/filter.d/bad-auth.conf
[Definition]
failregex = .* client login failed: .+ client:\ <HOST>
ignoreregex =
journalmatch = CONTAINER_NAME=mail-front-1
# /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 10.0.1.0/24

[sshd]
enabled = true
port = 22,10122
bantime = 1h
mode = aggressive

# https://mailu.io/1.9/faq.html?highlight=fail2ban#do-you-support-fail2ban
[mailu]
enabled = true
backend = systemd
filter = bad-auth
findtime = 15m
maxretry = 10
bantime = 1w
action = docker-action
# /etc/fail2ban/action.d/docker-action.conf
[Definition]

actionstart = iptables -N f2b-bad-auth
              iptables -A f2b-bad-auth -j RETURN
              iptables -I DOCKER-USER -p tcp -m multiport --dports 1:1024 -j f2b-bad-auth

actionstop = iptables -D DOCKER-USER -p tcp -m multiport --dports 1:1024 -j f2b-bad-auth
             iptables -F f2b-bad-auth
             iptables -X f2b-bad-auth

actioncheck = iptables -n -L DOCKER-USER | grep -q 'f2b-bad-auth[ \t]'

actionban = iptables -I f2b-bad-auth 1 -s <ip> -j DROP

actionunban = iptables -D f2b-bad-auth -s <ip> -j DROP
systemctl enable --now fail2ban
fail2ban-client status
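
Per-jail details and manual unbanning:

fail2ban-client status sshd           # show currently banned IPs for a jail
fail2ban-client set sshd unbanip <ip> # lift a ban manually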

cfddns

Dynamic DNS for Cloudflare.

Star the GitHub repository if you like it :)

yay -S cfddns
# cfddns config (token and mail notification)
token: <token>
notification:
  enabled: true
  from: cfddns@localhost
  to: me@example.com

# domain list (one domain per line)
example.com
dev.example.com
example.org
systemctl enable --now cfddns

Reverse proxy (nginx-proxy)

git clone --recurse-submodules https://github.com/evertramos/nginx-proxy-automation.git /srv/proxy
cd /srv/proxy
./fresh-start.sh --yes -e your_email@domain --skip-docker-image-check

Nextcloud

git clone https://github.com/uetchy/docker-nextcloud.git /srv/cloud
cd /srv/cloud
cp .env.sample .env
vim .env # fill the blank variables
make # pull, build, start
make applypatches # run only once

Monitor (Telegraf + InfluxDB + Grafana)

Grafana + InfluxDB (Docker)

git clone https://github.com/uetchy/docker-monitor.git /srv/monitor
cd /srv/monitor

Telegraf (Host)

yay -S telegraf
# /etc/telegraf/telegraf.conf
# Global tags can be specified here in key="value" format.
[global_tags]

# Configuration for telegraf agent
[agent]
  interval = "15s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

# Read InfluxDB-formatted JSON metrics from one or more HTTP endpoints
[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "<db>"
  username = "<user>"
  password = "<password>"

# Read metrics about cpu usage
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

# Read metrics about disk usage by mount point
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]

# Get kernel statistics from /proc/stat
[[inputs.kernel]]

# Read metrics about memory usage
[[inputs.mem]]

# Get the number of processes and group them by status
[[inputs.processes]]

# Read metrics about system load & uptime
[[inputs.system]]

# Read metrics about network interface usage
[[inputs.net]]
  interfaces = ["enp5s0"]

# Read metrics about docker containers
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  perdevice = false
  total = true

[[inputs.fail2ban]]
  interval = "15m"
  use_sudo = true

# Pulls statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
  timeout = "30s"

[[inputs.http_response]]
  interval = "5m"
  urls = [
    "https://example.com"
  ]

# Monitor sensors, requires lm-sensors package
[[inputs.sensors]]
  interval = "60s"
  remove_numbers = false
# /etc/sudoers.d/telegraf -- let the telegraf user query fail2ban status
Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
telegraf  ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session
chmod 440 /etc/sudoers.d/telegraf
usermod -aG docker telegraf
telegraf -config /etc/telegraf/telegraf.conf -test
systemctl enable --now telegraf
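
To double-check that metrics actually arrive, query the InfluxDB 1.x HTTP API (a sketch; the credentials and database match the outputs.influxdb section above):

curl -G -u "<user>:<password>" http://127.0.0.1:8086/query \
  --data-urlencode "db=<db>" \
  --data-urlencode "q=SHOW MEASUREMENTS"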

Kubernetes

pacman -S minikube kubectl
minikube start --cpus=max
kubectl taint nodes --all node-role.kubernetes.io/master- # allow pods to be scheduled on the control-plane (master) node

minikube ip
kubectl cluster-info
kubectl get cm -n kube-system kubeadm-config -o yaml

Firewall (ufw)

pacman -S ufw
systemctl enable --now ufw
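
The actual rule set isn't recorded in this note; a minimal baseline sketch (adjust ports to your services):

ufw default deny incoming
ufw default allow outgoing
ufw allow ssh
ufw enable
ufw status verbose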


Audio

pacman -S alsa-utils # may require rebooting system
usermod -aG audio <user>

# list devices as root
aplay -l
arecord -L
cat /proc/asound/cards

# test speaker
speaker-test -c2

# test mic
arecord -vv -Dhw:2,0 -fS32_LE mic.wav
aplay mic.wav

# gui mixer
alsamixer

# for Mycroft.ai
pacman -S pulseaudio pulsemixer
pulseaudio --start
pacmd list-cards

# appended to /etc/pulse/default.pa (or ~/.config/pulse/default.pa)
# INPUT/RECORD
load-module module-alsa-source device="default" tsched=1
# OUTPUT/PLAYBACK
load-module module-alsa-sink device="default" tsched=1
# Accept clients -- very important
load-module module-native-protocol-unix
load-module module-native-protocol-tcp

# ~/.asoundrc (or /etc/asound.conf)
pcm.mic {
  type hw
  card M96k
  rate 44100
  format S32_LE
}

pcm.speaker {
  type plug
  slave {
    pcm "hw:1,0"
  }
}

pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}

#defaults.pcm.card 1
#defaults.ctl.card 1

Telegram notifier

#!/bin/bash

BOT_TOKEN=<your bot token>
CHAT_ID=<your chat id>
PAYLOAD=$(ruby -r json -e "print ({text: ARGF.to_a.join, chat_id: $CHAT_ID}).to_json" </dev/stdin)

OK=$(curl -s -X "POST" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d "$PAYLOAD" \
  https://api.telegram.org/bot${BOT_TOKEN}/sendMessage | jq .ok)

if [[ $OK == true ]]; then
  exit 0
else
  exit 1
fi
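
Usage sketch (the script path and name are arbitrary):

echo "backup finished on $(hostname)" | /usr/local/bin/telegram-notify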

LUKS

Encrypt existing un-encrypted root partition

Boot Arch Linux from archiso USB device, then:

pacman -Syu
resize2fs -p -M /dev/sdaX
# resize2fs -p /dev/sdaX 1838M
cryptsetup reencrypt --encrypt --reduce-device-size 32M /dev/sdaX
cryptsetup --allow-discards --perf-no_read_workqueue --perf-no_write_workqueue --persistent open /dev/sdaX root
resize2fs /dev/mapper/root

mkdir /mnt/{root,boot}
mount /dev/sda1 /mnt/boot
mount /dev/mapper/root /mnt/root
systemd-nspawn --boot --bind=/mnt/boot:/boot --directory=/mnt/root

# requires the mkinitcpio-systemd-tool package with its hooks enabled in /etc/mkinitcpio.conf
# add the sysroot entries to crypttab and fstab below
vim /etc/mkinitcpio-systemd-tool/config/crypttab
vim /etc/mkinitcpio-systemd-tool/config/fstab
mkinitcpio -P
reboot
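
After rebooting, confirm the encrypted root is mapped and mounted:

cryptsetup status root
lsblk -f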

Maintenance

Quick checkups

htop # show task overview
systemctl --failed # show failed units
free -h # show memory usage
lsblk -f # show disk usage
networkctl status # show network status
userdbctl # show users
nvidia-smi # verify nvidia cards
ps aux | grep "defunct" # find zombie processes

Delve into system logs

journalctl -p err -b-1 -r # show error logs from previous boot in reverse order
journalctl -u sshd -f # tail logs from sshd unit
journalctl --no-pager -n 25 -k # show latest 25 logs from the kernel without pager
journalctl --since=yesterday --until "2020-07-10 15:10:00" # show logs within specific time range
journalctl CONTAINER_NAME=service_web_1 # show error from the docker container named 'service_web_1'
journalctl _PID=2434 -e # filter logs based on PID and jump to the end of the logs
journalctl -g 'timed out' # filter logs with a regular expression; an all-lowercase pattern matches case-insensitively

Pager shortcuts (journalctl pipes into less):
  • g - go to the first line
  • G - go to the last line
  • / - search for a string

Force overriding installation

pacman -S <pkg> --overwrite '*'

Check memory modules

pacman -S lshw dmidecode

lshw -short -C memory # list installed memory modules
dmidecode --type memory # show configured clock speeds and other details

Check disks

smartctl -H /dev/sdd # overall health self-assessment

# unmount the filesystem before running these checks
# [!] never run fsck directly on a raw LUKS-encrypted partition; open it with cryptsetup first and check the mapper device instead
e2fsck -C 0 -p /dev/sdd1 # preen
e2fsck -C 0 -cc /dev/sdd1 # badblocks (non-destructive read-write test)

Troubleshooting

Slow SSH logins (D-Bus glitch)

systemctl restart systemd-logind
systemctl restart polkit

Annoying "systemd-homed is not available" messages flooding the journal

Move pam_unix ahead of pam_systemd_home:

# /etc/pam.d/system-auth
#%PAM-1.0

auth       required                    pam_faillock.so      preauth
# Optionally use requisite above if you do not want to prompt for the password
# on locked accounts.
auth       [success=2 default=ignore]  pam_unix.so          try_first_pass nullok
-auth      [success=1 default=ignore]  pam_systemd_home.so
auth       [default=die]               pam_faillock.so      authfail
auth       optional                    pam_permit.so
auth       required                    pam_env.so
auth       required                    pam_faillock.so      authsucc
# If you drop the above call to pam_faillock.so the lock will be done also
# on non-consecutive authentication failures.

account    [success=1 default=ignore]  pam_unix.so
-account   required                    pam_systemd_home.so
account    optional                    pam_permit.so
account    required                    pam_time.so

password   [success=1 default=ignore]  pam_unix.so          try_first_pass nullok shadow
-password  required                    pam_systemd_home.so
password   optional                    pam_permit.so

session    required                    pam_limits.so
session    required                    pam_unix.so
session    optional                    pam_permit.so

Annoying systemd-journald-audit logs

# /etc/systemd/journald.conf
[Journal]
Audit=no

Missing /dev/nvidia-{uvm*,modeset}

This usually happens right after updating the Linux kernel.

  • Run docker run --rm --gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi once.

[sudo] "incorrect password" even though the password is correct

faillock --reset
