---
title: Setting Up Arch Linux
date: 2021-02-12T00:00:00
---

This note includes all commands I typed when I set up Arch Linux on my new server.

> PSA: I published a toolchain for building AUR packages in a clean-room docker container

# Setup

## Wipe whole disk

```bash
wipefs -a /dev/sda
```

## Create partition

```bash
parted
```

```bash
select /dev/sda
mktable gpt
mkpart EFI fat32 0 512MB # EFI
mkpart Arch ext4 512MB 100% # Arch
set 1 esp on # flag partition 1 as ESP
quit
```

## Install file-system

```bash
mkfs.vfat -F 32 /dev/sda1 # EFI
mkfs.ext4 /dev/sda2 # Arch

e2fsck -cc -C 0 /dev/sda2 # fsck
```

## Mount disk

```bash
mount /dev/sda2 /mnt
mkdir -p /mnt/boot # create the mount point after mounting root, otherwise it is hidden by the root mount
mount /dev/sda1 /mnt/boot
```

## Install base & Linux kernel

```bash
# choose between 'linux' or 'linux-lts'
pacstrap /mnt base linux-lts linux-firmware
genfstab -U /mnt >> /mnt/etc/fstab
arch-chroot /mnt
```

```bash
pacman -S reflector
reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose # optimize mirror list
```

## Install essentials

```bash
pacman -S vim man-db man-pages git base-devel
```

## Locales

```bash
ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
hwclock --systohc
vim /etc/locale.gen && locale-gen
echo "LANG=en_US.UTF-8" > /etc/locale.conf
```

## Add fstab entries

```ini /etc/fstab
# backup
UUID= /mnt/backup ext4 defaults 0 2

# archive (do not prevent boot even if fsck fails)
UUID= /mnt/archive ext4 defaults,nofail,x-systemd.device-timeout=4 0 2
```

Fill in each `UUID=` with the value shown for the partition in the output of `lsblk -f`.

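As a rough sketch of how the entries above could be generated rather than typed by hand, the helper below formats one fstab line from a UUID and a mount point. The function name `fstab_entry` and the device path `/dev/sdb1` in the usage comment are assumptions for illustration, not part of the original setup.

```bash
# Hypothetical helper: print an fstab line for an ext4 data partition.
# usage: fstab_entry UUID MOUNTPOINT [OPTIONS]
fstab_entry() {
  local uuid="$1" mountpoint="$2" options="${3:-defaults}"
  # fs_passno is 2 because these are non-root file systems
  printf 'UUID=%s %s ext4 %s 0 2\n' "$uuid" "$mountpoint" "$options"
}

# Example (the device path /dev/sdb1 is an assumption):
# fstab_entry "$(lsblk -no UUID /dev/sdb1)" /mnt/backup >> /etc/fstab
```

Printing the line first and appending it only after a visual check keeps a typo from ending up in `/etc/fstab`.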
```bash
findmnt --verify --verbose # verify fstab
```

## Install bootloader

```bash
pacman -S \
  grub \
  efibootmgr \
  amd-ucode # AMD microcode
grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB
vim /etc/default/grub
# GRUB_TIMEOUT=3
# GRUB_DISABLE_SUBMENU=y
grub-mkconfig -o /boot/grub/grub.cfg
```

- [GRUB/Tips and tricks - ArchWiki](https://wiki.archlinux.org/title/GRUB/Tips_and_tricks)

## Setup network

```bash
hostnamectl set-hostname takos
hostnamectl set-chassis server
```

```ini /etc/hosts
127.0.0.1 localhost
::1 localhost
127.0.0.1 takos
```

See https://systemd.network/systemd.network.html and https://wiki.archlinux.org/title/Systemd-networkd.

```ini /etc/systemd/network/wired.network
[Match]
Name=enp5s0

[Network]
#DHCP=yes
Address=10.0.1.2/24
Gateway=10.0.1.1
# self-hosted DNS resolver
DNS=10.0.1.100
# Cloudflare for the fallback DNS server
DNS=1.1.1.1
# to handle local dns lookup to 10.0.1.100 which is managed by Docker macvlan driver
MACVLAN=dns-shim
```

```ini /etc/systemd/network/dns-shim.netdev
# to handle local dns lookup to 10.0.1.100
[NetDev]
Name=dns-shim
Kind=macvlan

[MACVLAN]
Mode=bridge
```

```ini /etc/systemd/network/dns-shim.network
# to handle local dns lookup to 10.0.1.100
[Match]
Name=dns-shim

[Network]
IPForward=yes

[Address]
Address=10.0.1.103/32
Scope=link

[Route]
Destination=10.0.1.100/30
```

`ip` equivalent to the above config:

```bash
ip link add dns-shim link enp5s0 type macvlan mode bridge # add macvlan shim interface
ip a add 10.0.1.103/32 dev dns-shim # assign the interface an ip address
ip link set dns-shim up # enable the interface
ip route add 10.0.1.100/30 dev dns-shim # route macvlan subnet (.100 - .103) to the interface
```

```bash
systemctl enable --now systemd-networkd
networkctl status

ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

# for self-hosted dns resolver
sed -r -i -e 's/#?DNSStubListener=yes/DNSStubListener=no/g' -e 's/#DNS=/DNS=10.0.1.100/g' /etc/systemd/resolved.conf

systemctl enable --now systemd-resolved
resolvectl status
resolvectl query ddg.gg
drill ddg.gg
```

If `networkctl` keeps showing `enp5s0` as `degraded`, run `ip addr add 10.0.1.2/24 dev enp5s0` to assign the static IP address manually as a workaround.

## Finalize

```bash
exit # leave chroot
umount -R /mnt
reboot
```

## NTP

```bash
timedatectl set-ntp true
timedatectl status
```

## Shell

```bash
pacman -S zsh
chsh -s /bin/zsh
git clone https://github.com/uetchy/dotfiles ~/.dotfiles
# yay comes from the AUR section below; install it first
yay -S ruby pyenv exa antibody direnv fd ripgrep fzy peco ghq-bin hub neofetch tmux git-delta lazygit jq lostfiles ncdu htop rsync youtube-dl prettier tree age informant
usermod -aG informant
cd ~/.dotfiles
./dot link zsh -f
reload
```

## Setup operator user (i.e. user without superuser privilege)

```bash
passwd # change root passwd

useradd -m -s /bin/zsh # add local user
passwd # change local user password

userdbctl # verify users
userdbctl group # verify groups

pacman -S sudo
echo "%sudo ALL=(ALL) NOPASSWD:/usr/bin/pacman" > /etc/sudoers.d/pacman # allow users in sudo group to run pacman without password (optional)
groupadd sudo
usermod -aG sudo # add local user to sudo group
visudo -c
```

## SSH

```bash
pacman -S openssh
vim /etc/ssh/sshd_config
systemctl enable --now sshd
```

On the host machine:

```bash
ssh-copy-id @
```

```bash:$HOME/.ssh/rc
if [ ! -S ~/.ssh/ssh_auth_sock ] && [ -S "$SSH_AUTH_SOCK" ]; then
  ln -sf "$SSH_AUTH_SOCK" ~/.ssh/ssh_auth_sock
fi
```

See also: [Happy ssh agent forwarding for tmux/screen · Reboot and Shine](https://werat.dev/blog/happy-ssh-agent-forwarding/)

## AUR

```bash
git clone https://aur.archlinux.org/yay.git
cd yay
makepkg -si
```

## S.M.A.R.T.

```bash
pacman -S smartmontools
systemctl enable --now smartd

smartctl -t short /dev/sda
smartctl -l selftest /dev/sda
```

## NVIDIA drivers

```bash
pacman -S nvidia-lts # 'nvidia' for 'linux' package
reboot
nvidia-smi # test runtime
```

## Docker

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html

```bash
pacman -S docker docker-compose
yay -S nvidia-container-runtime

systemctl enable --now docker
```

```json /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

The defaults are `-1` (unlimited) for `max-size` and `1` for `max-file`. The `runtimes` entry is for docker-compose. JSON has no comment syntax, so keep these notes out of the file itself.

```bash
systemctl restart docker

usermod -aG docker

# to create mandatory device files on /dev
docker run --gpus all nvidia/cuda:10.2-cudnn7-runtime nvidia-smi

GPU_OPTS=(--gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools)
docker run --rm -it "${GPU_OPTS[@]}" nvidia/cuda:10.2-cudnn7-runtime nvidia-smi
docker run --rm -it "${GPU_OPTS[@]}" tensorflow/tensorflow:1.14.0-gpu-py3 bash

docker network create webproxy
```

### Use `journald` log driver in Docker Compose

```yaml
services:
  web:
    logging:
      driver: "journald"
      options:
        tag: "{{.ImageName}}/{{.Name}}/{{.ID}}" # default: "{{.ID}}"
```

- [Configure logging drivers | Docker Documentation](https://docs.docker.com/config/containers/logging/configure/)

# Additional setup

## Fail2ban

```
pacman -S fail2ban
```

```ini /etc/fail2ban/filter.d/bad-auth.conf
[INCLUDES]
before = common.conf

[Definition]
failregex = .* client login failed: .+ client:\ 
ignoreregex =
```

```ini /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 10.0.1.0/24

[sshd]
enabled = true
port = 22,10122
bantime = 1h
mode = aggressive

# https://github.com/Mailu/Mailu/blob/master/docs/faq.rst#do-you-support-fail2ban
[mailu]
enabled = true
backend = systemd
journalmatch = CONTAINER_NAME=mail_front_1
filter = bad-auth
findtime = 1h
maxretry = 3
bantime = 1w
chain = DOCKER-USER
banaction = iptables-allports
```

```patch /etc/systemd/system/fail2ban.service
- After=network.target iptables.service firewalld.service ip6tables.service ipset.service nftables.service
+ After=network.target iptables.service firewalld.service ip6tables.service ipset.service nftables.service docker.service
```

```bash
systemctl enable --now fail2ban
fail2ban-client status sshd
```

## Telegraf

```bash
yay -S telegraf
```

```ini /etc/telegraf/telegraf.conf
# Global tags can be specified here in key="value" format.
[global_tags]

# Configuration for telegraf agent
[agent]
interval = "15s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false

# Send metrics to InfluxDB
[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = ""
username = ""
password = ""

# Read metrics about cpu usage
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false

# Read metrics about disk usage by mount point
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]

# Get kernel statistics from /proc/stat
[[inputs.kernel]]

# Read metrics about memory usage
[[inputs.mem]]

# Get the number of processes and group them by status
[[inputs.processes]]

# Read metrics about system load & uptime
[[inputs.system]]

# Read metrics about network interface usage
[[inputs.net]]
interfaces = ["enp5s0"]

# Read metrics about docker containers
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
perdevice = false
total = true

[[inputs.fail2ban]]
interval = "15m"
use_sudo = true

# Pulls statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
timeout = "30s"

[[inputs.http_response]]
interval = "5m"
urls = [
  "https://example.com"
]

# Monitor sensors, requires lm-sensors package
[[inputs.sensors]]
interval = "60s"
remove_numbers = false

# Run executable as long-running input plugin
[[inputs.execd]]
interval = "15s"
command = ["/metrics.sh"]
name_override = "metrics"
signal = "STDIN"
restart_delay = "20s"
data_format = "logfmt"
```

```ini /etc/sudoers.d/telegraf
Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
telegraf ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session
```

```bash
chmod 440 /etc/sudoers.d/telegraf
usermod -aG docker telegraf
telegraf -config /etc/telegraf/telegraf.conf -test
systemctl enable --now telegraf
```

## cfddns

Dynamic DNS for Cloudflare.

> Star [the GitHub repository](https://github.com/uetchy/cfddns) if you like it :)

```
yay -S cfddns sendmail
```

```yml /etc/cfddns/cfddns.yml
token:
notification:
  enabled: true
  from: cfddns@localhost
  to: me@example.com
```

```ini /etc/cfddns/domains
example.com
dev.example.com
example.org
```

```
systemctl enable --now cfddns
```

## Backup

```bash
pacman -S restic
```

```ini /etc/backup/restic.service
[Unit]
Description=Daily Backup Service

[Service]
Type=simple
Nice=19
IOSchedulingClass=2
IOSchedulingPriority=7
ExecStart=/etc/backup/run.sh
```

```ini /etc/backup/restic.timer
[Unit]
Description=Daily Backup Timer

[Timer]
WakeSystem=false
OnCalendar=*-*-* 14:00
RandomizedDelaySec=5min

[Install]
WantedBy=timers.target
```

```bash /etc/backup/run.sh
#!/bin/bash -ue
# usage: run.sh

# https://restic.readthedocs.io/en/latest/040_backup.html#

export RESTIC_REPOSITORY=/path/to/backup
export RESTIC_PASSWORD=
export RESTIC_PROGRESS_FPS=1

date

# system
restic backup --tag system -v \
  --one-file-system \
  --exclude .cache \
  --exclude .venv \
  --exclude .vscode-server \
  --exclude .vscode-server-insiders \
  --exclude TabNine \
  --exclude node_modules \
  --exclude /var/lib/docker/overlay2 \
  / /boot

# data
# (both excludes are nextcloud caches; a comment after a trailing backslash would break the command)
restic backup --tag data -v \
  --exclude 'appdata_*/preview' \
  --exclude 'appdata_*/dav-photocache' \
  /mnt/data

# prune
restic forget --prune --group-by tags \
  --keep-within-daily 7d \
  --keep-within-weekly 1m \
  --keep-within-monthly 3m

# verify
restic check
```

```bash /etc/backup/show.sh
#!/bin/bash
# usage: show.sh

# https://restic.readthedocs.io/en/latest/050_restore.html

export RESTIC_REPOSITORY=/path/to/backup
export RESTIC_PASSWORD=
export RESTIC_PROGRESS_FPS=1

TARGET=${1:-$(pwd)}
MODE="ls -l"

# dump the content when the target is a regular file, list it otherwise
if [[ -f $TARGET ]]; then
  TARGET=$(realpath "${TARGET}")
  MODE=dump
fi

TAG=$(restic snapshots --json | jq -r '[.[].tags[0]] | unique | .[]' | fzy)
ID=$(restic snapshots --tag "$TAG" --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')

>&2 echo "Command: restic ${MODE} ${ID} ${TARGET}"
# $MODE stays unquoted on purpose so that "ls -l" splits into two words
restic $MODE "$ID" "${TARGET}"
```

```bash /etc/backup/restore.sh
#!/bin/bash

# https://restic.readthedocs.io/en/latest/050_restore.html

export RESTIC_REPOSITORY=/path/to/backup
export RESTIC_PASSWORD=
export RESTIC_PROGRESS_FPS=1

TARGET=${1:?Specify TARGET}
TARGET=$(realpath "${TARGET}")

TAG=$(restic snapshots --json | jq -r '[.[].tags[0]] | unique | .[]' | fzy)
ID=$(restic snapshots --tag "$TAG" --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')

>&2 echo "Command: restic restore ${ID} -i ${TARGET} -t /"
read -p "Press enter to continue"
restic restore "$ID" -i "${TARGET}" -t /
```

```bash
chmod 700 /etc/backup/{run,show,restore}.sh
ln -sf /etc/backup/restic.{service,timer} /etc/systemd/system/
systemctl enable --now restic.timer # the timer carries the [Install] section, not the service
```

## Kubernetes

```bash
pacman -S minikube kubectl
minikube start --cpus=max
kubectl taint nodes --all node-role.kubernetes.io/master- # to allow allocating pods to the master node

minikube ip
kubectl cluster-info
kubectl get cm -n kube-system kubeadm-config -o yaml
```

- [Kubernetes - ArchWiki](https://wiki.archlinux.org/index.php/Kubernetes)
- [Kubernetes Ingress Controller with NGINX Reverse Proxy and Wildcard SSL from Let's Encrypt - Shogan.tech](https://www.shogan.co.uk/kubernetes/kubernetes-ingress-controller-with-nginx-reverse-proxy-and-wildcard-ssl-from-lets-encrypt/)

## Audio

```bash
pacman -S alsa-utils # maybe requires rebooting system
usermod -aG audio

# list devices as root
aplay -l
arecord -L
cat /proc/asound/cards

# test speaker
speaker-test -c2

# test mic
arecord -vv -Dhw:2,0 -fS32_LE mic.wav
aplay mic.wav

# gui mixer
alsamixer

# for Mycroft.ai
pacman -S pulseaudio pulsemixer
pulseaudio --start
pacmd list-cards
```

```conf /etc/pulse/default.pa
# INPUT/RECORD
load-module module-alsa-source device="default" tsched=1

# OUTPUT/PLAYBACK
load-module module-alsa-sink device="default" tsched=1

# Accept clients -- very important
load-module module-native-protocol-unix
load-module module-native-protocol-tcp
```

```conf /etc/asound.conf
pcm.mic {
  type hw
  card M96k
  rate 44100
  format S32_LE
}

pcm.speaker {
  type plug
  slave {
    pcm "hw:1,0"
  }
}

pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}

#defaults.pcm.card 1
#defaults.ctl.card 1
```

- [PulseAudio as a minimal unintrusive dumb pipe to ALSA](https://wiki.archlinux.org/title/PulseAudio/Examples#PulseAudio_as_a_minimal_unintrusive_dumb_pipe_to_ALSA)
- [SoundcardTesting - AlsaProject](https://www.alsa-project.org/main/index.php/SoundcardTesting)
- [Advanced Linux Sound Architecture/Troubleshooting - ArchWiki](https://wiki.archlinux.org/index.php/Advanced_Linux_Sound_Architecture/Troubleshooting#Microphone)
- [ALSA project - the C library reference: PCM (digital audio) plugins](https://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html)
- [Asoundrc - AlsaProject](https://www.alsa-project.org/wiki/Asoundrc)

## Firewall

```bash
pacman -S firewalld
systemctl enable --now firewalld
```

See [Introduction to Netfilter – To Linux and beyond !](https://home.regit.org/netfilter-en/netfilter/).

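Once firewalld is running, the SSH ports listed in the Fail2ban jail (22 and 10122) probably need to be allowed explicitly. The following is only a sketch under that assumption: the function name `allow_tcp_ports` and the `FIREWALL_CMD` override are hypothetical, added so the calls can be previewed before touching the real firewall.

```bash
# Hypothetical helper: permanently open the given TCP ports, then reload.
# Set FIREWALL_CMD=echo to print the firewall-cmd arguments instead of running them.
allow_tcp_ports() {
  local fw="${FIREWALL_CMD:-firewall-cmd}" port
  for port in "$@"; do
    "$fw" --permanent --add-port="${port}/tcp"
  done
  "$fw" --reload # apply the permanent rules to the running firewall
}

# Example: preview first, then apply as root
# FIREWALL_CMD=echo allow_tcp_ports 22 10122
# allow_tcp_ports 22 10122
```

Afterwards `firewall-cmd --list-all` shows which ports the default zone actually exposes.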
# Maintenance

## Quick checkups

```bash
htop # show task overview
systemctl --failed # show failed units
free -h # show memory usage
lsblk -f # show disk usage
networkctl status # show network status
userdbctl # show users
nvidia-smi # verify nvidia cards
ps aux | grep "defunct" # find zombie processes
```

## Delve into system logs

```bash
journalctl -p err -b-1 -r # show error logs from previous boot in reverse order
journalctl -u sshd -f # tail logs from sshd unit
journalctl --no-pager -n 25 -k # show latest 25 logs from the kernel without pager
journalctl --since=yesterday --until "2020-07-10 15:10:00" # show logs within specific time range
journalctl CONTAINER_NAME=service_web_1 # show logs from docker container named 'service_web_1'
journalctl _PID=2434 -e # filter logs based on PID and jump to the end of the logs
journalctl -g 'timed out' # filter logs based on regular expression. if the pattern is all lowercase, matching is case insensitive
```

- `g` - go to the first line
- `G` - go to the last line
- `/` - search for the string

## Force overriding installation

```bash
pacman -S --overwrite '*'
```

## Check memory modules

```bash
pacman -S lshw dmidecode

lshw -short -C memory # lists installed mems
dmidecode # shows configured clock speed
```

## File-system related issues checklist

```bash
smartctl -H /dev/sdd

# umount the drive before this ops
e2fsck -C 0 -p /dev/sdd1 # preen
e2fsck -C 0 -cc /dev/sdd1 # badblocks
```

# Common issues

## Longer SSH login (D-bus glitch)

```bash
systemctl restart systemd-logind
systemctl restart polkit
```

- [A comprehensive guide to fixing slow SSH logins – JRS Systems: the blog](https://jrs-s.net/2017/07/01/slow-ssh-logins/)

## Annoying `systemd-homed is not available` log messages

Move `pam_unix` before `pam_systemd_home`.

```ini /etc/pam.d/system-auth
#%PAM-1.0

auth       required                    pam_faillock.so      preauth
# Optionally use requisite above if you do not want to prompt for the password
# on locked accounts.
auth       [success=2 default=ignore]  pam_unix.so          try_first_pass nullok
-auth      [success=1 default=ignore]  pam_systemd_home.so
auth       [default=die]               pam_faillock.so      authfail
auth       optional                    pam_permit.so
auth       required                    pam_env.so
auth       required                    pam_faillock.so      authsucc
# If you drop the above call to pam_faillock.so the lock will be done also
# on non-consecutive authentication failures.

account    [success=1 default=ignore]  pam_unix.so
-account   required                    pam_systemd_home.so
account    optional                    pam_permit.so
account    required                    pam_time.so

password   [success=1 default=ignore]  pam_unix.so          try_first_pass nullok shadow
-password  required                    pam_systemd_home.so
password   optional                    pam_permit.so

session    required                    pam_limits.so
session    required                    pam_unix.so
session    optional                    pam_permit.so
```

- [[solved] pam fails to find unit dbus-org.freedesktop.home1.service / Newbie Corner / Arch Linux Forums](https://bbs.archlinux.org/viewtopic.php?id=258297)

## Annoying systemd-journald-audit log

```ini /etc/systemd/journald.conf
Audit=no
```

## Missing `/dev/nvidia-{uvm*,modeset}`

This occurs after updating the Linux kernel.

- Run `docker run --rm --gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi` once.

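To tell whether the workaround above is needed at all, a small check can list which of the device files are absent. This is a minimal sketch: the function name `missing_nvidia_nodes` is hypothetical, and the node list is simply the set of `--device` paths used in the `docker run` command.

```bash
# Hypothetical helper: print every expected NVIDIA device node that does not exist.
# usage: missing_nvidia_nodes [DEV_DIR]   (DEV_DIR defaults to /dev)
missing_nvidia_nodes() {
  local dev_dir="${1:-/dev}" node
  for node in nvidia0 nvidiactl nvidia-modeset nvidia-uvm nvidia-uvm-tools; do
    [ -e "$dev_dir/$node" ] || echo "$dev_dir/$node"
  done
}

# Example: empty output means all nodes are present
# missing_nvidia_nodes
```
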
## `[sudo] Incorrect password` while password is correct

```bash
faillock --reset
```

# Useful links

- [General recommendations](https://wiki.archlinux.org/index.php/General_recommendations#Users_and_groups)
- [System maintenance](https://wiki.archlinux.org/index.php/System_maintenance)
- [Improving performance](https://wiki.archlinux.org/index.php/Improving_performance#Know_your_system)
- [General troubleshooting - ArchWiki](https://wiki.archlinux.org/title/General_troubleshooting)
- [Stress testing - ArchWiki](https://wiki.archlinux.org/title/Stress_testing#Stressing_memory)
- [udev - ArchWiki](https://wiki.archlinux.org/title/Udev#Debug_output)
- [[HOWTO] Repair Broken system, system without a kernel / Forum & Wiki discussion / Arch Linux Forums](https://bbs.archlinux.org/viewtopic.php?id=18066)
- [Archboot - ArchWiki](https://wiki.archlinux.org/title/Archboot)
- [Restic Documentation - restic 0.12.1 documentation](https://restic.readthedocs.io/en/stable/)