---
title: Installing Arch Linux
date: 2021-02-12T00:00:00
---

This note includes all commands I typed when I set up Arch Linux on my new bare-metal server.

# Why I chose Arch Linux

- Simple, as it should be
- Outstanding community effort in maintaining the package repositories
- Well-organized wiki resources

# Setup

## wipe whole disk

```bash
wipefs -a /dev/sda
```

## create partitions

```bash
parted
```

```bash
select /dev/sda
mktable gpt
mkpart EFI fat32 0 512MB    # EFI
mkpart Arch ext4 512MB 100% # Arch
set 1 esp on                # flag partition 1 as ESP
quit
```

## create file systems

```bash
mkfs.vfat -F 32 /dev/sda1 # EFI
mkfs.ext4 /dev/sda2       # Arch
e2fsck -cc -C 0 /dev/sda2 # fsck
```

## mount disks

```bash
mount /dev/sda2 /mnt
mkdir -p /mnt/boot
mount /dev/sda1 /mnt/boot
```

## install base & Linux kernel

```bash
# choose between 'linux' and 'linux-lts'
pacstrap /mnt base linux-lts linux-firmware
genfstab -U /mnt >> /mnt/etc/fstab
arch-chroot /mnt
```

```bash
pacman -S reflector
reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose # optimize mirror list
```

## install essentials

```bash
pacman -S vim man-db man-pages git base-devel
```

## locale

```bash
ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
hwclock --systohc
vim /etc/locale.gen && locale-gen
echo "LANG=en_US.UTF-8" > /etc/locale.conf
```

## add fstab entries

```ini /etc/fstab
# backup
UUID= /mnt/backup ext4 defaults 0 2

# archive (do not prevent boot even if fsck fails)
UUID= /mnt/archive ext4 defaults,nofail,x-systemd.device-timeout=4 0 2
```

Find each UUID in the output of `lsblk -f`.
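The empty `UUID=` fields above need the real UUIDs from `lsblk -f` (or `blkid`). A small sketch of generating such entries; the `mk_fstab_line` helper and the sample UUIDs are illustrative only, not part of the original setup:

```bash
# Hypothetical helper: print an fstab entry for a given UUID and mount point.
# Real UUIDs come from `lsblk -f` or `blkid -s UUID -o value /dev/sda2`.
mk_fstab_line() {
  local uuid=$1 mountpoint=$2 opts=$3
  printf 'UUID=%s %s ext4 %s 0 2\n' "$uuid" "$mountpoint" "$opts"
}

mk_fstab_line "0a1b2c3d-0000-0000-0000-000000000000" /mnt/backup defaults
mk_fstab_line "4e5f6a7b-1111-1111-1111-111111111111" /mnt/archive defaults,nofail,x-systemd.device-timeout=4
```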
```bash
findmnt --verify --verbose # verify fstab
```

## bootloader

```bash
pacman -S \
  grub \
  efibootmgr \
  amd-ucode # AMD microcode
grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB
vim /etc/default/grub
# GRUB_TIMEOUT=3
# GRUB_DISABLE_SUBMENU=y
grub-mkconfig -o /boot/grub/grub.cfg
```

- [GRUB/Tips and tricks - ArchWiki](https://wiki.archlinux.org/title/GRUB/Tips_and_tricks)

## network

```bash
hostnamectl set-hostname takos
hostnamectl set-chassis server
```

```ini /etc/hosts
127.0.0.1 localhost
::1       localhost
127.0.0.1 takos
```

See https://systemd.network/systemd.network.html and https://wiki.archlinux.org/title/Systemd-networkd.

```ini /etc/systemd/network/wired.network
[Match]
Name=enp5s0

[Network]
#DHCP=yes
Address=10.0.1.2/24
Gateway=10.0.1.1
DNS=10.0.1.100   # self-hosted DNS resolver
DNS=1.1.1.1      # Cloudflare as the fallback DNS server
MACVLAN=dns-shim # to handle local DNS lookups to 10.0.1.100, which is managed by the Docker macvlan driver
```

```ini /etc/systemd/network/dns-shim.netdev
# to handle local DNS lookups to 10.0.1.100
[NetDev]
Name=dns-shim
Kind=macvlan

[MACVLAN]
Mode=bridge
```

```ini /etc/systemd/network/dns-shim.network
# to handle local DNS lookups to 10.0.1.100
[Match]
Name=dns-shim

[Network]
IPForward=yes

[Address]
Address=10.0.1.103/32
Scope=link

[Route]
Destination=10.0.1.100/30
```

`ip` equivalents of the above settings:

```bash
ip link add dns-shim link enp5s0 type macvlan mode bridge # add macvlan shim
ip a add 10.0.1.103/32 dev dns-shim # assign host ip to shim defined in docker-compose.yml
ip link set dns-shim up # enable interface
ip route add 10.0.1.100/30 dev dns-shim # route macvlan subnet to shim interface
```

```bash
systemctl enable --now systemd-networkd
networkctl status

# for self-hosted dns resolver
ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
sed -r -i -e 's/#?DNSStubListener=yes/DNSStubListener=no/g' -e 's/#DNS=/DNS=10.0.1.100/g' /etc/systemd/resolved.conf
systemctl enable --now systemd-resolved
resolvectl status
resolvectl query ddg.gg
drill ddg.gg
```

If `networkctl` keeps showing `enp5s0` as `degraded`, run `ip addr add 10.0.1.2/24 dev enp5s0` to manually assign the static IP address as a workaround.

## finalize

```bash
exit # leave chroot
umount -R /mnt
reboot
```

## NTP

```bash
timedatectl set-ntp true
timedatectl status
```

## shell

```bash
pacman -S zsh
chsh -s /bin/zsh
git clone https://github.com/uetchy/dotfiles ~/.dotfiles
yay -S ruby pyenv exa antibody direnv fd ripgrep fzy peco ghq-bin hub neofetch tmux git-delta lazygit jq lostfiles ncdu htop rsync youtube-dl prettier tree age informant
usermod -aG informant
cd ~/.dotfiles
./dot link zsh -f
reload
```

## user

```bash
passwd # change root password
useradd -m -s /bin/zsh # add local user
passwd # change local user password
userdbctl # verify users
userdbctl group # verify groups

pacman -S sudo
echo "%sudo ALL=(ALL) NOPASSWD:/usr/bin/pacman" > /etc/sudoers.d/pacman # allow users in the sudo group to run pacman without a password (optional)
groupadd sudo
usermod -aG sudo # add local user to sudo group
visudo -c
```

## ssh

```bash
pacman -S openssh
vim /etc/ssh/sshd_config
systemctl enable --now sshd
```

On the host machine:

```bash
ssh-copy-id @
```

## AUR

```bash
git clone https://aur.archlinux.org/yay.git
cd yay
makepkg -si
```

## smartd

```bash
pacman -S smartmontools
systemctl enable --now smartd
smartctl -t short /dev/sda
smartctl -l selftest /dev/sda
```

## nvidia

```bash
pacman -S nvidia-lts # 'nvidia' for the 'linux' package
reboot
nvidia-smi # test runtime
```

## docker

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html

```bash
pacman -S docker docker-compose
yay -S nvidia-container-runtime
systemctl enable --now docker
```

```json /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m", // default: -1 (unlimited)
    "max-file": "3" // default: 1
  },
  "runtimes": {
    // for docker-compose
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

```bash
systemctl restart docker
usermod -aG docker

# to create mandatory device files on /dev
docker run --gpus all nvidia/cuda:10.2-cudnn7-runtime nvidia-smi

GPU_OPTS=(--gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools)
docker run --rm -it "${GPU_OPTS[@]}" nvidia/cuda:10.2-cudnn7-runtime nvidia-smi
docker run --rm -it "${GPU_OPTS[@]}" tensorflow/tensorflow:1.14.0-gpu-py3 bash

docker network create webproxy
```

### Use `journald` log driver in Docker Compose

```yaml
services:
  web:
    logging:
      driver: "journald"
      options:
        tag: "{{.ImageName}}/{{.Name}}/{{.ID}}" # default: "{{.ID}}"
```

- [Configure logging drivers | Docker Documentation](https://docs.docker.com/config/containers/logging/configure/)

# Additional setup

## fail2ban

```bash
pacman -S fail2ban
```

```ini /etc/fail2ban/filter.d/bad-auth.conf
[INCLUDES]
before = common.conf

[Definition]
failregex = .* client login failed: .+ client:\ 
ignoreregex =
```

```ini /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 10.0.1.0/24

[sshd]
enabled = true
port = 22,10122
bantime = 1h
mode = aggressive

# https://github.com/Mailu/Mailu/blob/master/docs/faq.rst#do-you-support-fail2ban
[mailu]
enabled = true
backend = systemd
journalmatch = CONTAINER_NAME=mail_front_1
filter = bad-auth
findtime = 1h
maxretry = 3
bantime = 1w
chain = DOCKER-USER
banaction = iptables-allports
```

```patch /etc/systemd/system/fail2ban.service
-After=network.target iptables.service firewalld.service ip6tables.service ipset.service nftables.service
+After=network.target iptables.service firewalld.service ip6tables.service ipset.service nftables.service docker.service
```

```bash
systemctl enable --now fail2ban
fail2ban-client status mailu
```

## telegraf

```bash
yay -S telegraf
```

```ini /etc/telegraf/telegraf.conf
# Global tags can be specified here in key="value" format.
[global_tags]

# Configuration for telegraf agent
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = ""
username = ""
password = ""

# Read metrics about cpu usage
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false

# Read metrics about disk usage by mount point
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]

# Get kernel statistics from /proc/stat
[[inputs.kernel]]

# Read metrics about memory usage
[[inputs.mem]]

# Get the number of processes and group them by status
[[inputs.processes]]

# Read metrics about system load & uptime
[[inputs.system]]

# Read metrics about network interface usage
[[inputs.net]]
interfaces = ["enp5s0"]

# Read metrics about docker containers
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
perdevice = false
total = true

[[inputs.fail2ban]]
interval = "15m"
use_sudo = true

# Pull statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
timeout = "30s"

[[inputs.http_response]]
interval = "5m"
urls = [
  "https://example.com"
]

# Monitor sensors, requires the lm_sensors package
[[inputs.sensors]]
interval = "60s"
remove_numbers = false
```

```ini /etc/sudoers.d/telegraf
Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
telegraf ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session
```

```bash
chmod 440 /etc/sudoers.d/telegraf
usermod -aG docker telegraf
telegraf -config /etc/telegraf/telegraf.conf -test
systemctl enable --now telegraf
```

## cfddns

Dynamic DNS for Cloudflare.
```bash
yay -S cfddns sendmail
```

```yml /etc/cfddns/cfddns.yml
token:
notification:
  enabled: true
  from: cfddns@localhost
  to: me@example.com
```

```ini /etc/cfddns/domains
example.com
```

```bash
systemctl enable --now cfddns
```

## backup

```bash
pacman -S borg
```

```ini /etc/backups/borg.service
[Unit]
Description=Borg Daily Backup Service

[Service]
Type=simple
Nice=19
IOSchedulingClass=2
IOSchedulingPriority=7
ExecStart=/etc/backups/run.sh
```

```ini /etc/backups/borg.timer
[Unit]
Description=Borg Daily Backup Timer

[Timer]
WakeSystem=false
OnCalendar=*-*-* 03:00
RandomizedDelaySec=10min

[Install]
WantedBy=timers.target
```

```bash /etc/backups/run.sh
#!/bin/bash -ue

# The udev rule is not terribly accurate and may trigger our service before
# the kernel has finished probing partitions. Sleep for a bit to ensure
# the kernel is done.
sleep 5

#
# Script configuration
#
export BORG_PASSPHRASE=""
MOUNTPOINT=/mnt/backup
TARGET=$MOUNTPOINT/borg

# Archive name schema
DATE=$(date --iso-8601)

#
# Create backups
#

# Options for borg create
BORG_OPTS="--stats --compression lz4 --checkpoint-interval 86400"

# No one can answer if Borg asks these questions; it is better to just fail quickly instead of hanging.
export BORG_RELOCATED_REPO_ACCESS_IS_OK=no
export BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=no

# Log Borg version
borg --version

echo "Starting backup for $DATE"

echo "# system"
borg create $BORG_OPTS \
  --exclude /root/.cache \
  --exclude /root/.pyenv \
  --exclude /root/.vscode-server \
  --exclude /root/.local/share/TabNine \
  --exclude 'sh:/home/*/.cache' \
  --exclude 'sh:/home/*/.cargo' \
  --exclude 'sh:/home/*/.pyenv' \
  --exclude 'sh:/home/*/.vscode-server' \
  --exclude 'sh:/home/*/.local/share/TabNine' \
  --one-file-system \
  $TARGET::'system-{now}' \
  /etc /boot /home /root /srv

echo "# data"
borg create $BORG_OPTS \
  --exclude 'sh:/mnt/data/nextcloud/appdata_*/preview' \
  --exclude 'sh:/mnt/data/nextcloud/appdata_*/dav-photocache' \
  $TARGET::'data-{now}' \
  /mnt/data

echo "# ftl"
borg create $BORG_OPTS \
  $TARGET::'ftl-{now}' \
  /mnt/ftl

echo "Start pruning"

BORG_PRUNE_OPTS_NORMAL="--list --stats --keep-daily 7 --keep-weekly 3 --keep-monthly 3"

borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'system-' $TARGET
borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'data-' $TARGET
borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'ftl-' $TARGET

echo "Completed backup for $DATE"

# Just to be completely paranoid
sync
```

```bash
ln -sf /etc/backups/borg.{service,timer} /etc/systemd/system/
systemctl enable --now borg.timer
```

## Kubernetes

```bash
pacman -S kubeadm kubelet kubectl
systemctl enable --now kubelet
kubeadm init --pod-network-cidr='10.244.0.0/16'
cp /etc/kubernetes/admin.conf ~/.kube/config
kubectl taint nodes --all node-role.kubernetes.io/master- # allow scheduling pods on the master node

# set up flannel network manager
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# set up nginx ingress controller
# TODO

kubectl cluster-info
kubectl get nodes
kubectl get pods -A
kubectl get cm -n kube-system kubeadm-config -o yaml
```

- [Kubernetes - ArchWiki](https://wiki.archlinux.org/index.php/Kubernetes)
- [Kubernetes Ingress Controller with NGINX Reverse Proxy and Wildcard SSL from Let's Encrypt - Shogan.tech](https://www.shogan.co.uk/kubernetes/kubernetes-ingress-controller-with-nginx-reverse-proxy-and-wildcard-ssl-from-lets-encrypt/)

## wildcard certs

```bash
pacman -S certbot certbot-dns-cloudflare
echo "dns_cloudflare_api_token = " > ~/.secrets/certbot/cloudflare.ini
chmod 600 ~/.secrets/certbot/cloudflare.ini
certbot certonly \
  --email y@uechi.io \
  --agree-tos \
  --dns-cloudflare \
  --dns-cloudflare-credentials ~/.secrets/certbot/cloudflare.ini \
  -d "*.uechi.io"
openssl x509 -in /etc/letsencrypt/live/uechi.io/fullchain.pem -text
certbot certificates
```

```ini /etc/systemd/system/certbot.service
[Unit]
Description=Let's Encrypt renewal

[Service]
Type=oneshot
ExecStart=/usr/bin/certbot renew --quiet --agree-tos --deploy-hook "docker exec nginx-proxy-le /app/signal_le_service"
```

```ini /etc/systemd/system/certbot.timer
[Unit]
Description=Twice daily renewal of Let's Encrypt's certificates

[Timer]
OnCalendar=0/12:00:00
RandomizedDelaySec=1h
Persistent=true

[Install]
WantedBy=timers.target
```

- [Certbot - ArchWiki](https://wiki.archlinux.org/index.php/Certbot)
- [certbot-dns-cloudflare documentation](https://certbot-dns-cloudflare.readthedocs.io/en/stable/)
- [docker-letsencrypt-nginx-proxy-companion/Standalone-certificates.md at master · nginx-proxy/docker-letsencrypt-nginx-proxy-companion](https://github.com/nginx-proxy/docker-letsencrypt-nginx-proxy-companion/blob/master/docs/Standalone-certificates.md)

## audio

```bash
pacman -S alsa-utils # may require a reboot
usermod -aG audio

# list devices as root
aplay -l
arecord -L
cat /proc/asound/cards

# test speaker
speaker-test -c2

# test mic
arecord -vv -Dhw:2,0 -fS32_LE mic.wav
aplay mic.wav

# gui mixer
alsamixer

# for Mycroft.ai
pacman -S pulseaudio pulsemixer
pulseaudio --start
pacmd list-cards
```

```conf /etc/pulse/default.pa
# INPUT/RECORD
load-module module-alsa-source device="default" tsched=1

# OUTPUT/PLAYBACK
load-module module-alsa-sink device="default" tsched=1

# Accept clients -- very important
load-module module-native-protocol-unix
load-module module-native-protocol-tcp
```

```conf /etc/asound.conf
pcm.mic {
  type hw
  card M96k
  rate 44100
  format S32_LE
}
pcm.speaker {
  type plug
  slave {
    pcm "hw:1,0"
  }
}
pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}
#defaults.pcm.card 1
#defaults.ctl.card 1
```

- [PulseAudio as a minimal unintrusive dumb pipe to ALSA](https://wiki.archlinux.org/title/PulseAudio/Examples#PulseAudio_as_a_minimal_unintrusive_dumb_pipe_to_ALSA)
- [SoundcardTesting - AlsaProject](https://www.alsa-project.org/main/index.php/SoundcardTesting)
- [Advanced Linux Sound Architecture/Troubleshooting - ArchWiki](https://wiki.archlinux.org/index.php/Advanced_Linux_Sound_Architecture/Troubleshooting#Microphone)
- [ALSA project - the C library reference: PCM (digital audio) plugins](https://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html)
- [Asoundrc - AlsaProject](https://www.alsa-project.org/wiki/Asoundrc)

## firewall

```bash
pacman -S firewalld
# TODO
```

See [Introduction to Netfilter – To Linux and beyond !](https://home.regit.org/netfilter-en/netfilter/).

# Maintenance

## quick checkups

```bash
htop # show task overview
systemctl --failed # show failed units
free -h # show memory usage
lsblk -f # show disk usage
networkctl status # show network status
userdbctl # show users
nvidia-smi # verify nvidia cards
ps aux | grep "defunct" # find zombie processes
```

## analyze logs

```bash
journalctl -p err -b-1 -r # show error logs from the previous boot in reverse order
journalctl -u sshd -f # tail logs from the sshd unit
journalctl --no-pager -n 25 -k # show the latest 25 kernel logs without a pager
journalctl --since=yesterday --until "2020-07-10 15:10:00" # show logs within a specific time range
journalctl CONTAINER_NAME=service_web_1 # show logs from the docker container named 'service_web_1'
journalctl _PID=2434 -e # filter logs by PID and jump to the end of the logs
journalctl -g 'timed out' # filter logs with a regular expression; an all-lowercase pattern matches case-insensitively
```

```
g - go to the first line
G - go to the last line
/ - search for the string
```

## force override installation

```bash
pacman -S --overwrite '*'
```

## fs issue checklist

```bash
smartctl -H /dev/sdd
# unmount before the following checks
e2fsck -C 0 -p /dev/sdd1 # preen
e2fsck -C 0 -cc /dev/sdd1 # badblocks
```

# Common Issues

## Longer SSH login (D-Bus glitch)

```bash
systemctl restart systemd-logind
systemctl restart polkit
```

- [A comprehensive guide to fixing slow SSH logins – JRS Systems: the blog](https://jrs-s.net/2017/07/01/slow-ssh-logins/)

## Annoying `systemd-homed is not available` log messages

Move `pam_unix` before `pam_systemd_home`.

```ini /etc/pam.d/system-auth
#%PAM-1.0

auth       required                    pam_faillock.so      preauth
# Optionally use requisite above if you do not want to prompt for the password
# on locked accounts.
auth       [success=2 default=ignore]  pam_unix.so          try_first_pass nullok
-auth      [success=1 default=ignore]  pam_systemd_home.so
auth       [default=die]               pam_faillock.so      authfail
auth       optional                    pam_permit.so
auth       required                    pam_env.so
auth       required                    pam_faillock.so      authsucc
# If you drop the above call to pam_faillock.so the lock will be done also
# on non-consecutive authentication failures.

account    [success=1 default=ignore]  pam_unix.so
-account   required                    pam_systemd_home.so
account    optional                    pam_permit.so
account    required                    pam_time.so

password   [success=1 default=ignore]  pam_unix.so          try_first_pass nullok shadow
-password  required                    pam_systemd_home.so
password   optional                    pam_permit.so

session    required                    pam_limits.so
session    required                    pam_unix.so
session    optional                    pam_permit.so
```

- [[solved] pam fails to find unit dbus-org.freedesktop.home1.service / Newbie Corner / Arch Linux Forums](https://bbs.archlinux.org/viewtopic.php?id=258297)

## Annoying systemd-journald-audit log

```ini /etc/systemd/journald.conf
Audit=no
```

## Missing `/dev/nvidia-{uvm*,modeset}`

This occurs after updating the Linux kernel.

- Run `docker run --rm --gpus all -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi` once.
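To avoid re-running that container by hand after every kernel upgrade, the same command could be wrapped in a oneshot unit that fires at boot. This is a sketch only; the unit name `nvidia-devices.service` is an assumption, not part of the original setup:

```ini /etc/systemd/system/nvidia-devices.service
[Unit]
Description=Recreate /dev/nvidia-* device files
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker run --rm --gpus all nvidia/cuda:10.2-cudnn7-runtime nvidia-smi

[Install]
WantedBy=multi-user.target
```

After a `systemctl daemon-reload`, enable it with `systemctl enable nvidia-devices.service`.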
# Useful links

- [General recommendations](https://wiki.archlinux.org/index.php/General_recommendations#Users_and_groups)
- [System maintenance](https://wiki.archlinux.org/index.php/System_maintenance)
- [Improving performance](https://wiki.archlinux.org/index.php/Improving_performance#Know_your_system)
- [General troubleshooting - ArchWiki](https://wiki.archlinux.org/title/General_troubleshooting)
- [Stress testing - ArchWiki](https://wiki.archlinux.org/title/Stress_testing#Stressing_memory)
- [udev - ArchWiki](https://wiki.archlinux.org/title/Udev#Debug_output)
- [[HOWTO] Repair Broken system, system without a kernel / Forum & Wiki discussion / Arch Linux Forums](https://bbs.archlinux.org/viewtopic.php?id=18066)
- [Archboot - ArchWiki](https://wiki.archlinux.org/title/Archboot)
- [Restoring with the Borg](https://blog.jamesthebard.net/restoring-with-the-borg/)
- [Restore with Borg | BorgBase Docs](https://docs.borgbase.com/restore/borg/)