diff --git a/source/_posts/2021/installing-arch-linux.md b/source/_posts/2021/installing-arch-linux.md
index 941235f..237c93a 100644
--- a/source/_posts/2021/installing-arch-linux.md
+++ b/source/_posts/2021/installing-arch-linux.md
@@ -39,6 +39,8 @@ quit
 ```bash
 mkfs.vfat -F 32 /dev/sda1 # EFI
 mkfs.ext4 /dev/sda2 # Arch
+
+e2fsck -cc -C 0 /dev/sda2 # fsck
 ```
 
 ## mount disk
@@ -52,22 +54,32 @@ mount /dev/sda1 /mnt/boot
 ## install base & Linux kernel
 
 ```bash
-reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose # optimize mirror list
-
 # choose between 'linux' or 'linux-lts'
-pacstrap /mnt base linux linux-firmware
-# base-devel need to be included as well?
+pacstrap /mnt base linux-lts linux-firmware
 genfstab -U /mnt >> /mnt/etc/fstab
 arch-chroot /mnt
 ```
 
 ```bash
-pacman -Syu # upgrade
-pacman -Qe # list explicitly installed pkgs
-pacman -Rs # remove pkg and its deps
-pacman -Qtd # list orphans
+pacman -S reflector
+reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose # optimize mirror list
+```
 
-pacman -S man-db man-pages git informant
+## install essentials
+
+```bash
+pacman -S vim man-db man-pages git base-devel
+```
+
+reflector --protocol https --latest 30 --sort rate --save /etc/pacman.d/mirrorlist --verbose
+
+## locale
+
+```bash
+ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
+hwclock --systohc
+vim /etc/locale.gen && locale-gen
+echo "LANG=en_US.UTF-8" > /etc/locale.conf
 ```
 
 ## add fstab entries
@@ -103,36 +115,20 @@ grub-mkconfig -o /boot/grub/grub.cfg
 
 - [GRUB/Tips and tricks - ArchWiki](https://wiki.archlinux.org/title/GRUB/Tips_and_tricks)
 
-## NTP
-
-```bash
-sed -i -e 's/#NTP=/NTP=0.arch.pool.ntp.org 1.arch.pool.ntp.org 2.arch.pool.ntp.org 3.arch.pool.ntp.org/' -e 's/#Fall/Fall/' /etc/systemd/timesyncd.conf
-systemctl enable --now systemd-timesyncd
-```
-
-## locale
-
-```bash
-ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
-hwclock --systohc
-vim /etc/locale.gen & locale-gen
-echo "LANG=en_US.UTF-8" > /etc/locale.conf
-```
-
 ## network
 
 ```bash
-hostnamectl set-hostname polka
+hostnamectl set-hostname takos
 hostnamectl set-chassis server
 ```
 
 ```ini /etc/hosts
 127.0.0.1 localhost
 ::1 localhost
-127.0.0.1 polka
+127.0.0.1 takos
 ```
 
-See https://systemd.network/systemd.network.html.
+See https://systemd.network/systemd.network.html and https://wiki.archlinux.org/title/Systemd-networkd.
 
 ```ini /etc/systemd/network/wired.network
 [Match]
@@ -185,42 +181,45 @@ ip route add 10.0.1.100/30 dev dns-shim # route macvlan subnet to shim interface
 ```bash
 systemctl enable --now systemd-networkd
 networkctl status
+ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
 
 # for self-hosted dns resolver
 sed -r -i -e 's/#?DNSStubListener=yes/DNSStubListener=no/g' -e 's/#DNS=/DNS=10.0.1.100/g' /etc/systemd/resolved.conf
 
-ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
 systemctl enable --now systemd-resolved
 resolvectl status
 resolvectl query ddg.gg
-drill @10.0.1.100 ddg.gg
-
-# FIXME
-pacman -S wpa_supplicant
-vim /etc/wpa_supplicant/wpa_supplicant.conf
-# ctrl_interface=/run/wpa_supplicant
-# update_config=1
-wpa_supplicant -B -i wlp8s0 -c /etc/wpa_supplicant/wpa_supplicant.conf
-wpa_cli # default control socket -> /var/run/wpa_supplicant
-modinfo iwlwifi
+drill ddg.gg
 ```
 
 If `networkctl` keep showing `enp5s0` as `degraded`, then run `ip addr add 10.0.1.2/24 dev enp5s0 ` to manually assign static IP address for the workaround.
 
-## firewall
+## finalize
 
 ```bash
-pacman -S firewalld
-# TODO
+exit # leave chroot
+umount -R /mnt
+reboot
 ```
 
-See also [Introduction to Netfilter – To Linux and beyond !](https://home.regit.org/netfilter-en/netfilter/)
+## NTP
+
+```bash
+timedatectl set-ntp true
+timedatectl status
+```
 
 ## shell
 
 ```bash
 pacman -S zsh
 chsh -s /bin/zsh
+git clone https://github.com/uetchy/dotfiles ~/.dotfiles
+yay -S ruby pyenv exa antibody direnv fd ripgrep fzy peco ghq-bin hub neofetch tmux git-delta lazygit jq lostfiles ncdu htop rsync youtube-dl prettier tree age informant
+usermod -aG informant
+cd ~/.dotfiles
+./dot link zsh -f
+reload
 ```
 
 ## user
@@ -228,14 +227,16 @@ chsh -s /bin/zsh
 ```bash
 passwd # change root passwd
 
-useradd -m -s /bin/zsh uetchy # add local user
-passwd uetchy # change local user password
+useradd -m -s /bin/zsh # add local user
+passwd # change local user password
 
 userdbctl # verify users
+userdbctl group # verify groups
 
 pacman -S sudo
 echo "%sudo ALL=(ALL) NOPASSWD:/usr/bin/pacman" > /etc/sudoers.d/pacman # allow users in sudo group to run pacman without password (optional)
-usermod -aG sudo uetchy # add local user to sudo group
+groupadd sudo
+usermod -aG sudo # add local user to sudo group
 
 visudo -c
 ```
@@ -250,7 +251,7 @@ systemctl enable --now sshd
 on the host machine:
 
 ```bash
-ssh-copy-id uetchy@10.0.1.2
+ssh-copy-id @
 ```
 
 ## AUR
@@ -261,22 +262,21 @@ cd yay
 makepkg -si
 ```
 
-## finalize
+## smartd
 
 ```bash
-exit # leave chroot
-umount -R /mnt
-reboot
-```
+pacman -S smartmontools
+systemctl enable --now smartd
 
-# Additional setup
+smartctl -t short /dev/sda
+smartctl -l selftest /dev/sda
+```
 
 ## nvidia
 
 ```bash
-pacman -S nvidia # 'nvidia-lts' for linux-lts
-cat /var/lib/modprobe.d/nvidia.conf # ensure having 'blacklist nouveau'
-
+pacman -S nvidia-lts # 'nvidia' for 'linux' package
+reboot
 nvidia-smi # test runtime
 ```
 
@@ -287,6 +287,7 @@ https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.
 ```bash
 pacman -S docker docker-compose
 yay -S nvidia-container-runtime
+systemctl enable --now docker
 ```
 
 ```json /etc/docker/daemon.json
@@ -296,7 +297,6 @@ yay -S nvidia-container-runtime
     "max-size": "10m", // default: -1 (unlimited)
     "max-file": "3" // default: 1
   },
-  "exec-opts": ["native.cgroupdriver=systemd"], // for kubernetes
   "runtimes": {
     // for docker-compose
     "nvidia": {
@@ -308,14 +308,18 @@ yay -S nvidia-container-runtime
 ```
 
 ```bash
-systemctl enable --now docker
+systemctl restart docker
 
-groupadd docker
-usermod -aG docker user
+usermod -aG docker
+
+# to create mandatory device files on /dev
+docker run --gpus all nvidia/cuda:10.2-cudnn7-runtime nvidia-smi
 
 GPU_OPTS=(--gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools)
 docker run --rm -it ${GPU_OPTS} nvidia/cuda:10.2-cudnn7-runtime nvidia-smi
 docker run --rm -it ${GPU_OPTS} tensorflow/tensorflow:1.14.0-gpu-py3 bash
+
+docker network create webproxy
 ```
 
 ### Use `journald` log driver in Docker Compose
@@ -331,22 +335,7 @@ services:
 
 - [Configure logging drivers | Docker Documentation](https://docs.docker.com/config/containers/logging/configure/)
 
-## Telegraf
-
-```bash
-yay -S telegraf
-vim /etc/telegraf/telegraf.conf
-```
-
-```ini /etc/sudoers.d/telegraf
-Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
-telegraf ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
-Defaults!FAIL2BAN !logfile, !syslog, !pam_session
-```
-
-```bash
-chmod 440 /etc/sudoers.d/telegraf
-```
+# Additional setup
 
 ## fail2ban
 
@@ -365,13 +354,12 @@ ignoreregex =
 
 ```ini /etc/fail2ban/jail.local
 [DEFAULT]
-bantime = 120m
 ignoreip = 127.0.0.1/8 10.0.1.0/24
 
 [sshd]
 enabled = true
 port = 22,10122
-maxretry = 3
+bantime = 1h
 mode = aggressive
 
 # https://github.com/Mailu/Mailu/blob/master/docs/faq.rst#do-you-support-fail2ban
@@ -382,9 +370,9 @@ journalmatch = CONTAINER_NAME=mail_front_1
 filter = bad-auth
 findtime = 1h
 maxretry = 3
-bantime = 3d
-banaction = iptables-allports
+bantime = 1w
 chain = DOCKER-USER
+banaction = iptables-allports
 ```
 
 ```patch /etc/systemd/system/fail2ban.service
@@ -397,6 +385,105 @@ systemctl enable --now fail2ban
 fail2ban-client status mailu
 ```
 
+## telegraf
+
+```bash
+yay -S telegraf
+```
+
+```ini /etc/telegraf/telegraf.conf
+# Global tags can be specified here in key="value" format.
+[global_tags]
+
+# Configuration for telegraf agent
+[agent]
+interval = "10s"
+round_interval = true
+metric_batch_size = 1000
+metric_buffer_limit = 10000
+collection_jitter = "0s"
+flush_interval = "10s"
+flush_jitter = "0s"
+precision = ""
+hostname = ""
+omit_hostname = false
+
+# Configuration for sending metrics to InfluxDB
+[[outputs.influxdb]]
+urls = ["http://127.0.0.1:8086"]
+database = ""
+username = ""
+password = ""
+
+# Read metrics about cpu usage
+[[inputs.cpu]]
+percpu = true
+totalcpu = true
+collect_cpu_time = false
+report_active = false
+
+# Read metrics about disk usage by mount point
+[[inputs.disk]]
+ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
+
+# Read metrics about disk IO by device
+[[inputs.diskio]]
+
+# Get kernel statistics from /proc/stat
+[[inputs.kernel]]
+
+# Read metrics about memory usage
+[[inputs.mem]]
+
+# Get the number of processes and group them by status
+[[inputs.processes]]
+
+# Read metrics about system load & uptime
+[[inputs.system]]
+
+# Read metrics about network interface usage
+[[inputs.net]]
+interfaces = ["enp5s0"]
+
+# Read metrics about docker containers
+[[inputs.docker]]
+endpoint = "unix:///var/run/docker.sock"
+perdevice = false
+total = true
+
+[[inputs.fail2ban]]
+interval = "15m"
+use_sudo = true
+
+# Pulls statistics from nvidia GPUs attached to the host
+[[inputs.nvidia_smi]]
+timeout = "30s"
+
+[[inputs.http_response]]
+interval = "5m"
+urls = [
+  "https://example.com"
+]
+
+# Monitor sensors, requires lm-sensors package
+[[inputs.sensors]]
+interval = "60s"
+remove_numbers = false
+```
+
+```ini /etc/sudoers.d/telegraf
+Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
+telegraf ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
+Defaults!FAIL2BAN !logfile, !syslog, !pam_session
+```
+
+```bash
+chmod 440 /etc/sudoers.d/telegraf
+usermod -aG docker telegraf
+telegraf -config /etc/telegraf/telegraf.conf -test
+systemctl enable --now telegraf
+```
+
 ## cfddns
 
 Dynamic DNS for Cloudflare.
@@ -407,27 +494,20 @@ yay -S cfddns sendmail
 
 ```yml /etc/cfddns/cfddns.yml
 token:
+notification:
+  enabled: true
+  from: cfddns@localhost
+  to: me@example.com
 ```
 
 ```ini /etc/cfddns/domains
-uechi.io
-datastore.uechi.io
+example.com
 ```
 
 ```
 systemctl enable --now cfddns
 ```
 
-## smart
-
-```bash
-pacman -S smartmontools
-systemctl enable --now smartd
-
-smartctl -t short /dev/sdc
-smartctl -l selftest /dev/sdc
-```
-
 ## backup
 
 ```bash
@@ -465,9 +545,6 @@ WantedBy=timers.target
 # The udev rule is not terribly accurate and may trigger our service before
 # the kernel has finished probing partitions. Sleep for a bit to ensure
 # the kernel is done.
-#
-# This can be avoided by using a more precise udev rule, e.g. matching
-# a specific hardware path and partition.
 sleep 5
 
 #
@@ -487,8 +564,7 @@ DATE=$(date --iso-8601)
 # Options for borg create
 BORG_OPTS="--stats --compression lz4 --checkpoint-interval 86400"
 
-# No one can answer if Borg asks these questions, it is better to just fail quickly
-# instead of hanging.
+# No one can answer if Borg asks these questions, it is better to just fail quickly instead of hanging.
 export BORG_RELOCATED_REPO_ACCESS_IS_OK=no
 export BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=no
 
@@ -499,8 +575,6 @@ echo "Starting backup for $DATE"
 
 echo "# system"
 borg create $BORG_OPTS \
-  --exclude /var/cache \
-  --exclude /var/lib/docker/devicemapper \
   --exclude /root/.cache \
   --exclude /root/.pyenv \
   --exclude /root/.vscode-server \
@@ -511,33 +585,26 @@ borg create $BORG_OPTS \
   --exclude 'sh:/home/*/.vscode-server' \
   --exclude 'sh:/home/*/.local/share/TabNine' \
   --one-file-system \
-  $TARGET::'{hostname}-system-{now}' \
-  / /boot
+  $TARGET::'system-{now}' \
+  /etc /boot /home /root /srv
 
 echo "# data"
 borg create $BORG_OPTS \
   --exclude 'sh:/mnt/data/nextcloud/appdata_*/preview' \
   --exclude 'sh:/mnt/data/nextcloud/appdata_*/dav-photocache' \
-  $TARGET::'{hostname}-data-{now}' \
+  $TARGET::'data-{now}' \
   /mnt/data
 
-echo "# archive"
-borg create $BORG_OPTS \
-  $TARGET::'{hostname}-archive-{now}' \
-  /mnt/archive
-
 echo "# ftl"
 borg create $BORG_OPTS \
-  $TARGET::'{hostname}-ftl-{now}' \
+  $TARGET::'ftl-{now}' \
   /mnt/ftl
 
 echo "Start pruning"
-BORG_PRUNE_OPTS_NORMAL="--list --stats --keep-daily 7 --keep-weekly 3 --keep-monthly 2"
-BORG_PRUNE_OPTS_LESS="--list --stats --keep-daily 3 --keep-weekly 1 --keep-monthly 1"
-borg prune $BORG_PRUNE_OPTS_NORMAL --prefix '{hostname}-system-' $TARGET
-borg prune $BORG_PRUNE_OPTS_NORMAL --prefix '{hostname}-archive-' $TARGET
-borg prune $BORG_PRUNE_OPTS_LESS --prefix '{hostname}-data-' $TARGET
-borg prune $BORG_PRUNE_OPTS_LESS --prefix '{hostname}-ftl-' $TARGET
+BORG_PRUNE_OPTS_NORMAL="--list --stats --keep-daily 7 --keep-weekly 3 --keep-monthly 3"
+borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'system-' $TARGET
+borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'data-' $TARGET
+borg prune $BORG_PRUNE_OPTS_NORMAL --prefix 'ftl-' $TARGET
 
 echo "Completed backup for $DATE"
 
@@ -621,7 +688,7 @@ WantedBy=timers.target
 
 ```bash
 pacman -S alsa-utils # maybe requires reboot
-usermod -aG audio uetchy
+usermod -aG audio
 
 # list devices as root
 aplay -l
@@ -685,27 +752,62 @@ pcm.!default {
 - [ALSA project - the C library reference: PCM (digital audio) plugins](https://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html)
 - [Asoundrc - AlsaProject](https://www.alsa-project.org/wiki/Asoundrc)
 
-# Maintenance
-
-## system healthcheck
+## firewall
 
 ```bash
+pacman -S firewalld
+# TODO
+```
+
+See [Introduction to Netfilter – To Linux and beyond !](https://home.regit.org/netfilter-en/netfilter/).
+
+# Maintenance
+
+## quick checkups
+
+```bash
+htop # show task overview
 systemctl --failed # show failed units
 free -h # show memory usage
 lsblk -f # show disk usage
 networkctl status # show network status
 userdbctl # show users
 nvidia-smi # verify nvidia cards
-htop # show task overview
-neofetch # show system info
+ps aux | grep "defunct" # find zombie processes
 ```
 
-## analyzing logs
+## analyze logs
 
 ```bash
 journalctl -p err -b-1 -r # show error logs from previous boot in reverse order
+journalctl -u sshd -f # tail logs from sshd unit
+journalctl --no-pager -n 25 -k # show latest 25 logs from the kernel without pager
+journalctl --since=yesterday --until "2020-07-10 15:10:00" # show logs within specific time range
 journalctl CONTAINER_NAME=service_web_1 # show error from docker container named 'service_web_1'
-journalctl -u docker -f # tail docker logs
+journalctl _PID=2434 -e # filter logs based on PID and jump to the end of the logs
+journalctl -g 'timed out' # filter logs based on regular expression. if the pattern is all lowercase, matching is case insensitive
+```
+
+```
+g - go to the first line
+G - go to the last line
+/ - search for the string
+```
+
+## force override installation
+
+```bash
+pacman -S --overwrite '*'
+```
+
+## fs issue checklist
+
+```bash
+smartctl -H /dev/sdd
+
+# unmount the partition before running these
+e2fsck -C 0 -p /dev/sdd1 # preen
+e2fsck -C 0 -cc /dev/sdd1 # badblocks
 ```
 
 # Common Issues
@@ -764,12 +866,17 @@ Audit=no
 
 This occurs after updating linux kernel.
 
-1. Reinstall `nvidia-container-runtime`.
-2. Run `docker --rm --gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi` once.
+- Run `docker run --rm --gpus all -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi` once.
 
 # Useful links
 
 - [General recommendations](https://wiki.archlinux.org/index.php/General_recommendations#Users_and_groups)
 - [System maintenance](https://wiki.archlinux.org/index.php/System_maintenance)
 - [Improving performance](https://wiki.archlinux.org/index.php/Improving_performance#Know_your_system)
-- [Benchmarking - ArchWiki](https://wiki.archlinux.org/index.php/Benchmarking)
+- [General troubleshooting - ArchWiki](https://wiki.archlinux.org/title/General_troubleshooting)
+- [Stress testing - ArchWiki](https://wiki.archlinux.org/title/Stress_testing#Stressing_memory)
+- [udev - ArchWiki](https://wiki.archlinux.org/title/Udev#Debug_output)
+- [[HOWTO] Repair Broken system, system without a kernel / Forum & Wiki discussion / Arch Linux Forums](https://bbs.archlinux.org/viewtopic.php?id=18066)
+- [Archboot - ArchWiki](https://wiki.archlinux.org/title/Archboot)
+- [Restoring with the Borg](https://blog.jamesthebard.net/restoring-with-the-borg/)
+- [Restore with Borg | BorgBase Docs](https://docs.borgbase.com/restore/borg/)