In this blog post I will explore the process of setting up a minimal Rust project for embedded bare-metal development. The focus is on getting up and running, including a debugging flow for instruction stepping that is fully integrated into my editor of choice (Emacs).

As of 2025, many popular microcontrollers such as the ESP32 already have a big Rust community and easy templates to get up and running with Rust. Even less popular devices already have basic infrastructure in place, though setting up a project for them is a bit more of an exploratory experience. Still, a lot of the heavy lifting has already been done. I still remember when I first started doing embedded Rust development in 2019: there was little to build upon, so getting started involved bootstrapping a peripheral access crate and starting to write a hardware abstraction layer. That is to say, within the last six years the ecosystem for embedded Rust projects has evolved impressively.

The device this blog post is about is the CH32V203, a RISC-V microcontroller developed by WCH. Searching for “ch32-hal rust” on Google quickly finds a HAL with embassy support: ch32-hal. Great!

Let us look at the examples folder. Within the folder for the device, some basic examples are provided:

examples/ch32v203/src/bin
├── adc.rs
├── blinky.rs
├── flash_sections.rs
├── sdi_print.rs
├── spi-lcd-st7735.rs
└── uart.rs

That looks promising. Apart from blinky, the hello world of embedded programming, there are some more complex examples for driving an LCD display, doing analog measurements and getting up and running with UART.

// uart.rs

#![no_std]
#![no_main]
#![feature(type_alias_impl_trait)]
#![feature(impl_trait_in_assoc_type)]

use ch32_hal::usart;
use embassy_executor::Spawner;
use embassy_time::{Duration, Timer};
use hal::gpio::{AnyPin, Level, Output, Pin};
use hal::usart::UartTx;
use {ch32_hal as hal, panic_halt as _};

#[embassy_executor::main(entry = "qingke_rt::entry")]
async fn main(spawner: Spawner) -> ! {
    let p = hal::init(Default::default());

    let mut led = Output::new(p.PB8, Level::Low, Default::default());

    let mut cfg = usart::Config::default();
    let mut uart = UartTx::new_blocking(p.USART1, p.PA9, cfg).unwrap();

    loop {
        Timer::after_millis(1000).await;

        uart.blocking_write(b"hello world from embassy main\r\n");

        led.toggle();
    }
}

The UART example is a good starting point for a small project: there is an entry point from which asynchronous tasks can be spawned, and printing program output via a serial console is enabled. But editing examples within the HAL repository is not the way forward. Time to set up a small Rust project. The examples folder also contains most of what is required for this. The README does not contain a lot of information apart from which pins are the serial pins and links to two different development boards. But there are Cargo.toml, build.rs, .cargo/config.toml and memory.x files. The main HAL crate directory also contains a rust-toolchain.toml file. Let’s take these pieces and put them together in a standalone project.

ch32-project
├── .cargo
│   └── config.toml       # configure default rust target and flashing commands
├── Cargo.lock
├── Cargo.toml            # project definition and dependencies
├── README.md
├── build.rs              # build script to link using the device's memory layout
├── memory.x              # memory layout and default handler config
├── rust-toolchain.toml   # configure default `nightly` compiler (the HAL examples rely on nightly features)
└── src
    └── main.rs

A few small adjustments are required for this to work. Within the Cargo.toml, the reference to the device's HAL needs to be changed from a path dependency to a git dependency, since this code obviously does not live within the HAL repository anymore. If the HAL is part of a larger crate workspace, multiple such adjustments might be required. The main.rs here is just the UART example from above. Since the ch32x devices use the RISC-V instruction set, there is no need to install a custom Rust compiler or other shenanigans. And since this is bare-metal code, there are no system libraries that we would need to link against. So as long as the riscv32imc-unknown-none-elf target is installed via rustup, we are good to go and can compile the code using cargo build --release. On my first attempt I omitted the --release flag and ran into a linker error because the resulting binary was too big to fit into the flash of the device. This can be solved by adding some optimizations to the dev profile in Cargo.toml, for example by setting opt-level under [profile.dev].
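The two commands involved can be wrapped in a small shell function. This is only a sketch: the function name build_firmware is made up for illustration, and it assumes rustup and cargo are available on the PATH.

```bash
# Hypothetical helper: make sure the RISC-V target is installed,
# then build the project in release mode.
build_firmware() {
    rustup target add riscv32imc-unknown-none-elf &&
        cargo build --release
}
```

Calling build_firmware from the project root should leave the ELF file under target/riscv32imc-unknown-none-elf/release/.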

Now that we have a binary, it is time to get it onto the dev board. Oh no, instructions on how to accomplish this are thin, and nothing can be found in the README. In cases like this it is worth looking into the .cargo/config.toml:

[build]
target = "riscv32imc-unknown-none-elf"

[target."riscv32imc-unknown-none-elf"]
rustflags = [
#  "-C", "link-arg=-Tlink.x",
]
# runner = "riscv64-unknown-elf-gdb -q -x openocd.gdb"
# runner = "riscv-none-embed-gdb -q -x openocd.gdb"
# runner = "gdb -q -x openocd.gdb"
# runner = "wlink -v flash"

runner = "wlink -v flash --enable-sdi-print --watch-serial --erase"
# runner = "wlink -v flash"

There are multiple example runners configured, with only one being active. Runners are commands used by cargo run to run the program after compilation has finished. Oftentimes the developers of a HAL change the runner command to the one that works for their setup and comment the others out. In this case a program called wlink is used to flash the target device. This is a Rust program that uses the proprietary WCH-Link USB debugger to flash the program to the device. When I started tinkering with this I did not yet possess a WCH-Link debugger (more on that later). For reference, the dev board I am using is the BluePill-Plus-CH32, which can be flashed via USB when put into flashing mode using the RST and BOOT buttons. So how do I flash via USB? A little bit of searching and I found wchisp, an open source reimplementation of WCHISPTool, the official tool of the MCU manufacturer. It is marked as still work in progress, but flashing my development board works like a charm.
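Flashing via USB can then be reduced to a one-liner. The sketch below assumes wchisp's flash subcommand and that the board is already in flashing mode; the function name flash_usb and the default path (the release build output of this project) are my own choices for illustration.

```bash
# Hypothetical helper: flash an ELF file over USB using wchisp.
# The board has to be in flashing mode before this is called.
flash_usb() {
    local elf="${1:-target/riscv32imc-unknown-none-elf/release/ch32-project}"
    wchisp flash "$elf"
}
```

Without an argument the function flashes the release build; any other firmware file can be passed as the first argument.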

At this point I have the code running on the device and can see the LED flashing, and I get serial output when I attach an external serial-to-USB converter to the serial pins. I could now just start developing the project code and use print-line debugging if required. But depending on the project that might be undesirable. And if your development machine only has a limited number of USB ports, having an additional serial converter attached only to receive debug output might seem excessive. In this case I want to go further and get actual state introspection capabilities. To do this a hardware debug adapter is required. It is a device that is attached to a normal host system via USB and connected to an MCU via a debug interface. There are multiple debug interfaces found in the wild, so depending on the device you need a compatible debug adapter for this to work. Common debug interfaces include JTAG and, in the ARM world, SWD. Sadly neither of those is used by the ch32x family of devices. These devices expose a debug interface called SDI. I really did not find a lot of information about this interface, and as far as I can tell the only debugger that can use it is the proprietary WCH-Link device mentioned earlier. Now there might be more information out there, but there seem to be multiple protocols called SDI and the ones I found did not seem to fit what I am looking for. Besides, I want to get stuff done and not obsess about protocols. So now I do need to get my hands on one of them 🙄.

Using a proprietary debug adapter and a protocol that is not that common has some follow-up challenges as well, because support for this debug adapter is not upstreamed into common open source tools for on-chip debugging, like openocd! The docs for probe-rs do mention wlink, but I am more familiar with openocd (though exploring probe-rs is on my endless list of things I should look into). So for now I will stick to openocd + gdb. This is also the setup supported by MounRiver Studio, the official IDE for C development on WCH devices.

To integrate this nicely into my development flow I opted to set up the tools in a containerized fashion. This allows me to work on multiple machines while avoiding manual toolchain setup annoyances, and maybe even makes it easier for other people to collaborate. So first I need to write a container file that installs everything we need:

 1FROM debian:bookworm
 2
 3RUN apt update -y && apt upgrade -y && \
 4    apt install git libjaylink-dev libusb-1.0-0 unzip curl libhidapi-hidraw0 xz-utils -y
 5
 6RUN cd /root && \
 7    curl -L -o mrs-toolchain.tar.xz "https://github.com/ch32-riscv-ug/MounRiver_Studio_Community_miror/releases/download/1.92-toolchain/MRS_Toolchain_Linux_x64_V1.92.tar.xz" && \
 8    mkdir mrs-toolchain && \
 9    tar -xvf mrs-toolchain.tar.xz -C mrs-toolchain --strip-components=1 && \
10    mv mrs-toolchain/OpenOCD/bin/openocd /usr/local/bin && \
11    mv mrs-toolchain/OpenOCD/share/openocd /usr/local/share && \
12    rm -rf mrs-toolchain mrs-toolchain.tar.xz && \
13    # Use up to date xpack toolchains for gdb
14    curl -L -o xpack-riscv-toolchain.tar.gz "https://github.com/xpack-dev-tools/riscv-none-elf-gcc-xpack/releases/download/v14.2.0-3/xpack-riscv-none-elf-gcc-14.2.0-3-linux-x64.tar.gz" && \
15    mkdir xpack-toolchain && \
16    tar -xvf xpack-riscv-toolchain.tar.gz -C xpack-toolchain --strip-components=1 && \
17    mv xpack-toolchain/bin/* /usr/local/bin && \
18    mv xpack-toolchain/lib/ /usr/local && \
19    mv xpack-toolchain/lib64/ /usr/local && \
20    mv xpack-toolchain/libexec /usr/local && \
21    mv xpack-toolchain/riscv-none-elf /usr/local && \
22    rm -rf xpack-toolchain xpack-riscv-toolchain.tar.gz
23
24RUN mkdir -p /root/.config/gdb && echo "set auto-load safe-path /" >> /root/.config/gdb/gdbinit
25
26ENTRYPOINT [ "/usr/bin/bash" ]

There are multiple things going on here:

I need an openocd with support for the WCH-Link debugger and its SDI debug protocol. While I did find some forks of openocd that claim to have support, I could not get them to compile correctly. So instead I opted for installing the binary version from MounRiver Studio directly. Luckily there is a community organization on GitHub that provides tarball downloads of the tools.
I need a RISC-V compatible gdb. The MounRiver Studio tools do include gdb, but its version is below 14, which is the minimum version of gdb with DAP support. DAP (Debug Adapter Protocol) is a close cousin of LSP (Language Server Protocol) and enables editor integration of debugging capabilities. It is a lot more comfortable to set breakpoints in my editor and step through the code than to write gdb scripts and run them from my terminal. So I opted to install the xpack version of the RISC-V development tools, which includes a more recent gdb. The command on line 24 allows gdb to run .gdbinit scripts from any directory on startup. Since gdb is running in a container and only has access to the parts of the system exposed to it, I do not see a lot of potential problems with enabling this feature.
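The version requirement can be checked from a script before wiring gdb into an editor. The following is a minimal sketch: the function name gdb_supports_dap is made up, and it assumes that the version number is the last word on the first line of the gdb --version output.

```bash
# Hypothetical helper: succeeds if the given `gdb --version` output
# reports a major version of at least 14 (needed for DAP support).
gdb_supports_dap() {
    local first_line version major
    first_line="$(printf '%s\n' "$1" | head -n 1)"
    version="${first_line##* }"   # last word, e.g. "14.2"
    major="${version%%.*}"        # part before the first dot
    case "$major" in
        ''|*[!0-9]*) return 1 ;;  # not a plain number
    esac
    [ "$major" -ge 14 ]
}
```

It would be used as gdb_supports_dap "$(gdb --version)" and only looks at the major version, so it does not check whether gdb was built with Python support.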

Next I need to be able to run these tools comfortably within their container from my shell or my editor. Enter some wrapper scripts (a small shoutout to an awesome friend of mine who introduced me to this approach and showed me that bash does have its uses 😉):

ch32-project/bin
├── build-wch-tools-container.sh
├── gdb
└── openocd

These scripts live within a bin directory within my repository. This allows me to do export PATH=$PWD/bin:$PATH from my project directory before opening my editor. This way these scripts will be used instead of the openocd or gdb installed on my system.

#!/usr/bin/env bash

set -euo pipefail

CONTAINER_NAME="localhost/wch-dev-tools:latest"
CONTAINER_TOOLS_BASEDIR="$(dirname "$(readlink -f "$0")")"

pushd "$CONTAINER_TOOLS_BASEDIR"
podman build -t "$CONTAINER_NAME" -f "../wch-tools.Containerfile" .
popd

The first script is just used to build the container image initially and give it a name. This way I do not need to host the image on Docker Hub, though I still need to pull the base image from there initially. I like to use podman instead of docker, though the process would probably look exactly the same using docker. The more interesting part is the wrapper script used to run the containerized tools.

 1#!/usr/bin/env bash
 2
 3set -euo pipefail
 4
 5CONTAINER_IMAGE="localhost/wch-dev-tools:latest"
 6CONTAINER_TOOLS_BASEDIR="$(dirname "$(readlink -f "$0")")"
 7
 8function _fatal {
 9	echo -e "\e[31mERROR\e[0m   $*" 1>&2
10	exit 1
11}
12
13declare -a PODMAN_ARGS=(
14	"--rm" "-i" "--log-driver=none"
15    "--network=host"
16	"-v" "$PWD:$PWD:rw"
17	"-w" "$PWD"
18)
19
20for device in /dev/bus/usb/*/*; do
21    if udevadm info "$device" | grep -q "ID_VENDOR=wch.cn" && \
22       udevadm info "$device" | grep -q "ID_MODEL=WCH-Link"; then
23        DEBUGGER_DEV_PATH="$device"
24        break
25    fi
26done
27
28if [[ -z "${DEBUGGER_DEV_PATH:-}" ]]; then
29    echo "Could not find hardware debugger … Exiting!" 1>&2
30    exit 1
31else
32    # add the WCH-Link debugger to the podman devices
33    PODMAN_ARGS+=("--device=$DEBUGGER_DEV_PATH")
34fi
35
36[[ -t 1 ]] && PODMAN_ARGS+=("-t")
37
38if ! podman image exists "$CONTAINER_IMAGE"; then
39    #attempt to build container
40    "$CONTAINER_TOOLS_BASEDIR/build-wch-tools-container.sh" 1>&2 ||
41        _fatal "failed to build local image, cannot continue! … please ensure you have an internet connection"
42fi
43
44podman run "${PODMAN_ARGS[@]}" --entrypoint openocd "$CONTAINER_IMAGE" "$@"

The script above is the wrapper script used to run openocd. Most of what this script does is gather the arguments to pass to podman in a variable and then check whether the container image already exists. If it does not exist, it is built using the previously discussed script. Then podman is invoked with all the required arguments, starting openocd as the entrypoint and passing any command line arguments along. The interesting part here is which arguments podman receives. In lines 13-18 the PODMAN_ARGS variable is declared and filled with the default parameters:

Specifying --network=host lets the container share the network namespace of the host, enabling it to communicate with other services running on the host. For openocd it would also have been possible to just publish the port that openocd opens for gdb to connect to using --publish, so this option is a bit overkill in this case.
The project directory is mounted into the container; it is assumed that the command is invoked from the project root directory. The working directory is also set to it, so running openocd behaves the same as running it from the host system directly.
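The narrower alternative to host networking could look like the sketch below. It is not what my wrapper script does: the function name build_podman_args is made up, 3333 is simply the port gdb connects to in my setup, and openocd inside the container would then have to listen on all interfaces instead of only on localhost.

```bash
# Hypothetical variant of PODMAN_ARGS that publishes only the GDB port
# (-p is the short form of --publish) instead of using --network=host.
build_podman_args() {
    local gdb_port="${1:-3333}"
    PODMAN_ARGS=(
        "--rm" "-i" "--log-driver=none"
        "-p" "${gdb_port}:${gdb_port}"
        "-v" "$PWD:$PWD:rw"
        "-w" "$PWD"
    )
}
```

After calling build_podman_args, the array can be passed to podman run exactly like in the wrapper script.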

The tricky part is finding the WCH-Link debugger device itself and passing access to it to the container. This happens in lines 20-34: using udevadm info, the properties of all attached USB devices are checked for the vendor and model name of the debug adapter. If no debugger device is found there is no point in continuing, so in that case the script exits with an error. When using podman it is necessary to have set up udev rules giving your user access to the device, otherwise openocd will not be able to use the device and will fail with a permission error. This would also be required when not containerizing the setup, unless you explicitly run openocd as root.
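The property check could also be factored into a small function that reads the udevadm info output once from stdin, so udevadm only has to run once per device. This is a sketch with a made-up function name (is_wch_link); the matched strings are the same ones the wrapper script greps for.

```bash
# Hypothetical helper: reads `udevadm info` output on stdin and succeeds
# if both the vendor and the model property match the WCH-Link debugger.
is_wch_link() {
    local props
    props="$(cat)"
    printf '%s\n' "$props" | grep -qF 'ID_VENDOR=wch.cn' &&
        printf '%s\n' "$props" | grep -qF 'ID_MODEL=WCH-Link'
}
```

Inside the loop the condition would then become: if udevadm info "$device" | is_wch_link; then … fi.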

#!/usr/bin/env bash

set -euo pipefail

CONTAINER_IMAGE="localhost/wch-dev-tools:latest"
CONTAINER_TOOLS_BASEDIR="$(dirname "$(readlink -f "$0")")"

# Print an error message to stderr and abort.
function _fatal {
    echo -e "\e[31mERROR\e[0m   $*" 1>&2
    exit 1
}

declare -a PODMAN_ARGS=(
    "--rm" "-i" "--log-driver=none"
    "--network=host"
    "--pid=host"
    "-v" "$PWD:$PWD:rw"
    "-w" "$PWD"
)

# Only allocate a TTY when stdout is a terminal.
[[ -t 1 ]] && PODMAN_ARGS+=("-t")

if ! podman image exists "$CONTAINER_IMAGE"; then
    # attempt to build the container image
    "$CONTAINER_TOOLS_BASEDIR/build-wch-tools-container.sh" 1>&2 ||
        _fatal "failed to build local image, cannot continue! … please ensure you have an internet connection"
fi

podman run "${PODMAN_ARGS[@]}" --entrypoint riscv-none-elf-gdb-py3 "$CONTAINER_IMAGE" "$@"

The wrapper script for gdb looks similar but is actually less complicated, because all that really needs to happen is to start the correct version of gdb. In this case the --network=host option is mandatory, since otherwise the gdb container would not be able to connect to the openocd session. Apart from that I added the --pid=host option to disable process ID isolation in the container. The reason is that protocols like DAP or LSP tend to communicate process IDs, and some implementations do not like it if the communicated process ID does not exist. The reason for using riscv-none-elf-gdb-py3 is that the DAP support of gdb is implemented in Python, so builds without Python support cannot be used for DAP.

Now that the wrapper scripts can transparently be used instead of the native tools, the next step is to configure the tools to work together as expected. For this, an openocd and a gdb config file need to be written.

set _CHIPNAME ch32v203
set _TARGETNAME $_CHIPNAME.cpu

#bindto 0.0.0.0

adapter driver wlinke
adapter speed 6000
transport select sdi

sdi newtap $_CHIPNAME cpu -irlen 5 -expected-id 0x00001
target create $_TARGETNAME.0 wch_riscv -chain-position $_TARGETNAME
$_TARGETNAME.0 configure -work-area-phys 0x20000000 -work-area-size 10000 -work-area-backup 1
set _FLASHNAME $_CHIPNAME.flash

flash bank $_FLASHNAME wch_riscv 0x00000000 0 0 0 $_TARGETNAME.0

init

I cobbled this together from the openocd documentation and the WCH examples found in the MounRiver Studio distribution of openocd. An important note here: if you are not using --network=host when spawning the container, the bindto instruction needs to be uncommented. The reason is that if the container runs in its own network namespace, openocd listening only on localhost within that namespace is not reachable from the host.

target extended-remote :3333
set remotetimeout 2000

file target/riscv32imc-unknown-none-elf/release/ch32-project

monitor reset halt

With the .gdbinit file in place, starting openocd and afterwards gdb works as expected. The last step is to configure my editor to start gdb when initiating debugging. I will not configure it to start openocd beforehand, mostly because it is not worth the extra hassle, and manually starting openocd is not that big of a deal to me.

I am using Doom Emacs, which sets up dape as the plugin for DAP support. It brings with it a lot of pre-configured debugging targets, but for using gdb with Rust and connecting to openocd a custom configuration is required.

(after! dape
  (add-to-list
   'dape-configs
   `(gdb-dap-openocd
     ensure (lambda (config)
              (dape-ensure-command config)
              (let* ((default-directory
                      (or (dape-config-get config 'command-cwd)
                          default-directory))
                     (command (dape-config-get config 'command))
                     (output (shell-command-to-string (format "%s --version" command)))
                     (version (save-match-data
                                (when (string-match "GNU gdb \\(?:(.*) \\)?\\([0-9.]+\\)" output)
                                  (string-to-number (match-string 1 output))))))
                (unless (>= version 14.1)
                  (user-error "Requires gdb version >= 14.1"))))
     modes ()
     command-cwd dape-command-cwd
     command "gdb"
     command-args ("--interpreter=dap")
     :request nil
     :program nil
     :args []
     :stopAtBeginningOfMainSubprogram nil)))

This is quite unspectacular; it is mostly just copied from the normal dape configuration for gdb, but with all default arguments set to nil. The reason is that when using the :program argument, dape would run into an error that terminates the gdb session. Just setting all options to nil and letting .gdbinit handle loading symbols and connecting to openocd just works.

And that is it. Now when I set a breakpoint in my editor and start a debugger session, I can step through the code and inspect the state instruction by instruction. I still have to flash the program before using this, but that is not a big hurdle to overcome.

Anyway, I hope you enjoyed this little post. If you have thoughts or questions, feel free to reach out to me. (Note: I should probably delete the link to my X account because I am no longer using it … though I never really got into X, formerly Twitter, anyway.)

I hope to find the time to write some more little posts on embedded Rust programming as this project continues. Happy hacking 😉

In this article, I want to show how to migrate an existing Linux server to NixOS — in my case the CoreOS/Flatcar Linux installation on my Network Attached Storage (NAS) PC.

I will show in detail what the previous CoreOS setup looked like (lots of systemd units starting Docker containers), how I migrated it into an intermediate state (using Docker on NixOS) just to get things going, and finally how I migrated all units from Docker to native NixOS modules step-by-step.

If you haven’t heard of NixOS, I recommend you read the first page of the NixOS website to understand what NixOS is and what sort of things it makes possible.

The target audience of this blog post is people interested in trying out NixOS for the use-case of a NAS, who like seeing examples to understand how to configure a system.

You can apply these examples by first following my blog post “How I like to install NixOS (declaratively)”, then making your way through the sections that interest you. If you prefer seeing the full configuration, skip to the conclusion.

Context/History

Over the last decade, I used a number of different operating systems for my NAS needs. Here’s an overview of the 2 NAS systems storage2 and storage3:

Year	storage2	storage3	Details (blog post)
2013	Debian on qnap	Debian on qnap	Wake-On-LAN with Debian on a qnap TS-119P2+
2016	CoreOS on PC	CoreOS on PC	Gigabit NAS (running CoreOS)
2023	CoreOS on PC	Ubuntu+ZFS on PC	My all-flash ZFS NAS build
2025	NixOS on PC	Ubuntu+ZFS on PC	→ you are here ←
?	NixOS on PC	NixOS+ZFS on PC	Converting more PCs to NixOS seems inevitable ;)

My NAS Software Requirements

(This post is only about software! For my usage patterns and requirements regarding hardware selection, see “Design Goals” in my “My all-flash ZFS NAS build” post (2023).)
Remote management: I really like the model of having the configuration of my network storage builds version-controlled and managed on my main PC. It’s a nice property that I can regain access to my backup setup by re-installing my NAS from my PC within minutes.
Automated updates, with easy rollback: Updating all my installations manually is not my idea of a good time. Hence, automated updates are a must — but when the update breaks, a quick and easy path to recovery is also a must.
- CoreOS/Flatcar achieved that with an A/B updating scheme (update failed? boot the old partition), whereas NixOS achieves that with its concept of a “generation” (update failed? select the old generation), which is finer-grained.

Why migrate from CoreOS/Flatcar to NixOS?

When I started using CoreOS, Docker was pretty new technology. I liked that using Docker containers allowed you to treat services uniformly — ultimately, they all expose a port of some sort (speaking HTTP, or Postgres, or…), so you got the flexibility to run much more recent versions of software on a stable OS, or older versions in case an update broke something.

Over a decade later, Docker is established tech. People nowadays take for granted the various benefits of the container approach.

So, here’s my list of reasons why I wasn’t satisfied with Flatcar Linux anymore.

R1. cloud-init is deprecated

The CoreOS cloud-init project was deprecated at some point in favor of Ignition, which is clearly more powerful, but also more cumbersome to get started with as a hobbyist. As far as I can tell, I must host my config at some URL that I then provide via a kernel parameter. The old way of just copying a file seems to no longer be supported.

Ignition also seems less convenient in other ways: YAML is no longer supported, only JSON, which I don’t enjoy writing by hand. Also, the format seems to change quite a bit.

As a result, I never made the jump from cloud-init to Ignition, and it’s not good to be reliant on a long-deprecated way to use your OS of choice.

R2. Container Bitrot

At some point, I did an audit of all my containers on the Docker Hub and noticed that most of them were quite outdated. For a while, Docker Hub offered automated builds based on a Dockerfile obtained from GitHub. However, automated builds now require a subscription, and I will not accept a subscription just to use my own computers.

R3. Dependency on a central service

If Docker at some point ceases operation of the Docker Hub, I am unable to deploy software to my NAS. This isn’t a very hypothetical concern: In 2023, Docker Hub announced the end of organizations on the Free tier and then backpedaled after community backlash.

Who knows how long they can still provide free services to hobbyists like myself.

R4. Could not try Immich on Flatcar

The final nail in the coffin was when I noticed that I could not try Immich on my NAS system! Modern web applications like Immich need multiple Docker containers (for Postgres, Redis, etc.) and hence only offer Docker Compose as a supported way of installation.

Unfortunately, Flatcar does not include Docker Compose.

I was not in the mood to re-package Immich for non-Docker-Compose systems on an ongoing basis, so I decided that a system on which I can neither run software like Immich directly, nor even run Docker Compose, is not sufficient for my needs anymore.

Reason Summary

With all of the above reasons, I would have had to set up automated container builds, run my own central registry and would still be unable to run well-known Open Source software like Immich.

Instead, I decided to try NixOS again (after a 10 year break) because it seems like the most popular declarative solution nowadays, with a large community and large selection of packages.

How does NixOS compare for my situation?

Same: I also need to set up an automated job to update my NixOS systems.
- I already have such a job for updating my gokrazy devices.
- Docker push is asynchronous: After a successful push, I still need extra automation for pulling the updated containers on the target host and restarting the affected services, whereas NixOS includes all of that.
Better: There is no central registry. With NixOS, I can push the build result directly to the target host via SSH.
Better: The corpus of available software in NixOS is much larger (including Immich, for example) and the NixOS modules generally seem to be expressed at a higher level of abstraction than individual Docker containers, meaning you can configure more features with fewer lines of config.
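Pushing a build to the target host over SSH could look roughly like the following sketch. It assumes a flake with one nixosConfigurations entry per host name, a user called michael on the target, and the --target-host and --use-remote-sudo options of nixos-rebuild (the latter may be named differently in newer releases). The helper name deploy_nixos is made up for illustration.

```bash
# Hypothetical helper: build the configuration locally and activate it
# on the remote host via SSH.
deploy_nixos() {
    local host="$1"
    nixos-rebuild switch \
        --flake ".#${host}" \
        --target-host "michael@${host}" \
        --use-remote-sudo
}
```

A call like deploy_nixos storage2 would then update the NAS without any registry in between.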

Prototyping in a VM

My NAS setup needs to work every day, so I wanted to prototype my desired configuration in a VM before making changes to my system. This is not only safer, it also allows me to discover any roadblocks, and what working with NixOS feels like without making any commitments.

I copied my NixOS configuration from a previous test installation (see “How I like to install NixOS (declaratively)”) and used the following command to build a VM image and start it in QEMU:

nix build .#nixosConfigurations.storage2.config.system.build.vm

export QEMU_NET_OPTS=hostfwd=tcp::2222-:22
export QEMU_KERNEL_PARAMS=console=ttyS0
./result/bin/run-nixplay-vm

The configuration instructions below can be tried out in this VM, and once you’re happy enough with what you have, you can repeat the steps on the actual machine to migrate.

Migrating

For the migration of my actual system, I defined the following milestones that should be achievable within a typical session of about an hour (after prototyping them in a VM):

M1. Install NixOS
M2. Set up remote disk unlock
M3. Set up Samba for access
M4. Set up SSH/rsync for backups
Everything extra is nice-to-have and could be deferred to a future session on another day.

In practice, this worked out exactly as planned: the actual installation of NixOS and setting up my config to milestone M4 took a little over one hour. All the other nice-to-haves were done over the following days and weeks as time permitted.

Tip: After losing data due to an installer bug in the 2000s, I have adopted the habit of physically disconnecting all data disks (= pulling out the SATA cable) when re-installing the system disk.

M1. Install NixOS

After following “How I like to install NixOS (declaratively)”, this is my initial configuration.nix:

{ modulesPath, lib, pkgs, ... }:

{
  imports =
    [
      (modulesPath + "/installer/scan/not-detected.nix")
      ./hardware-configuration.nix
      ./disk-config.nix
    ];

  # Adding michael as trusted user means
  # we can upgrade the system via SSH (see Makefile).
  nix.settings.trusted-users = [ "michael" "root" ];
  # Clean the Nix store every week.
  nix.gc = {
    automatic = true;
    dates = "weekly";
    options = "--delete-older-than 7d";
  };

  boot.loader.systemd-boot = {
    enable = true;
    configurationLimit = 10;
  };
  boot.loader.efi.canTouchEfiVariables = true;

  networking.hostName = "storage2";
  time.timeZone = "Europe/Zurich";

  # Use systemd for networking
  services.resolved.enable = true;
  networking.useDHCP = false;
  systemd.network.enable = true;

  systemd.network.networks."10-e" = {
    matchConfig.Name = "e*";  # enp9s0 (10G) or enp8s0 (1G)
    networkConfig = {
      IPv6AcceptRA = true;
      DHCP = "yes";
    };
  };

  i18n.supportedLocales = [
    "en_DK.UTF-8/UTF-8"
    "de_DE.UTF-8/UTF-8"
    "de_CH.UTF-8/UTF-8"
    "en_US.UTF-8/UTF-8"
  ];
  i18n.defaultLocale = "en_US.UTF-8";

  users.mutableUsers = false;
  security.sudo.wheelNeedsPassword = false;
  users.users.michael = {
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5secret"
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5key"
    ];

    isNormalUser = true;
    description = "Michael Stapelberg";
    extraGroups = [ "networkmanager" "wheel" ];
    initialPassword = "secret";  # XXX: change!
    shell = pkgs.zsh;
    packages = with pkgs; [];
  };

  environment.systemPackages = with pkgs; [
    git  # for checking out github.com/stapelberg/configfiles
    rsync
    zsh
    vim
    emacs
    wget
    curl
  ];

  programs.zsh.enable = true;

  services.openssh.enable = true;

  # This value determines the NixOS release from which the default
  # settings for stateful data, like file locations and database versions
  # on your system were taken. It‘s perfectly fine and recommended to leave
  # this value at the release version of the first install of this system.
  # Before changing this value read the documentation for this option
  # (e.g. man configuration.nix or on https://nixos.org/nixos/options.html).
  system.stateVersion = "25.05"; # Did you read the comment?
}

All following sections describe changes within this configuration.nix.

All devices in my home network obtain their IP address via DHCP. If I want to make an IP address static, I configure it accordingly on my router.

My NAS PCs have one specialty with regards to IP addressing: They are reachable via IPv4 and IPv6, and the IPv6 address can be derived from the IPv4 address.

Hence, I changed the systemd-networkd configuration from above such that it configures a static IPv6 address in a dynamically configured IPv6 network:

  systemd.network.networks."10-e" = {
    matchConfig.Name = "e*";  # enp9s0 (10G) or enp8s0 (1G)
    networkConfig = {
      IPv6AcceptRA = true;
      DHCP = "yes";
    };
    ipv6AcceptRAConfig = {
      Token = "::10:0:0:252";
    };
  };
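
To illustrate how such a token can be derived, here is a minimal sketch. It assumes the scheme is simply to reuse the four decimal octets of the IPv4 address as the last four groups of the IPv6 address, which matches the token above for the IPv4 address 10.0.0.252. The helper name ipv4ToToken is hypothetical and not part of my configuration:

```nix
let
  lib = (import <nixpkgs> { }).lib;

  # Hypothetical helper: "10.0.0.252" becomes "::10:0:0:252".
  # The octets are copied textually, not converted to hexadecimal.
  ipv4ToToken = ipv4: "::" + lib.concatStringsSep ":" (lib.splitString "." ipv4);
in
ipv4ToToken "10.0.0.252"
```

The expression evaluates to the same string that is used as Token in the configuration above.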

✅ This fulfills milestone M1.

M2. Set up remote disk unlock

To unlock my encrypted disks on boot, I have a custom systemd service unit that uses wget(1) and cryptsetup(8) to split the key file between the NAS and a remote server (= an attacker needs both pieces to unlock).

With CoreOS/Flatcar, my cloud-init configuration looked as follows:

coreos:
  units:
    - name: unlock.service
      command: start
      content: |
        [Unit]
        Description=unlock hard drive
        Wants=network.target
        After=systemd-networkd-wait-online.service
        Before=samba.service

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        # Wait until the host is actually reachable.
        ExecStart=/bin/sh -c "c=0; while [ $c -lt 5 ]; do /bin/ping6 -n -c 1 r.zekjur.net && break; c=$((c+1)); sleep 1; done"
        ExecStart=/bin/sh -c "[ -e \"/dev/mapper/S5SSNF0T205183F_crypt\" ] || (echo -n my_local_secret && wget --retry-connrefused --ca-directory=/dev/null --ca-certificate=/etc/ssl/certs/r.zekjur.net.crt -qO - https://r.zekjur.net:8443/nascrypto) | /sbin/cryptsetup --key-file=- luksOpen /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0T205183F S5SSNF0T205183F_crypt"
        ExecStart=/bin/sh -c "[ -e \"/dev/mapper/S5SSNJ0T205991B_crypt\" ] || (echo -n my_local_secret && wget --retry-connrefused --ca-directory=/dev/null --ca-certificate=/etc/ssl/certs/r.zekjur.net.crt -qO - https://r.zekjur.net:8443/nascrypto) | /sbin/cryptsetup --key-file=- luksOpen /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNJ0T205991B S5SSNJ0T205991B_crypt"
        ExecStart=/bin/sh -c "vgchange -ay"
        ExecStart=/bin/mount /dev/mapper/data-data /srv

write_files:
  - path: /etc/ssl/certs/r.zekjur.net.crt
    content: |
      -----BEGIN CERTIFICATE-----
      MIID8TCCAlmgAwIBAgIRAPWwvYWpoH+lGKv6rxZvC4MwDQYJKoZIhvcNAQELBQAw
      […]
      -----END CERTIFICATE-----

I converted it into the following NixOS configuration:

  systemd.services.unlock = {
    wantedBy = [ "multi-user.target" ];
    description = "unlock hard drive";
    wants = [ "network.target" ];
    after = [ "systemd-networkd-wait-online.service" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = "yes";
      ExecStart = [
        # Wait until the host is actually reachable.
        ''/bin/sh -c "c=0; while [ $c -lt 5 ]; do ${pkgs.iputils}/bin/ping -n -c 1 r.zekjur.net && break; c=$((c+1)); sleep 1; done"''

        ''/bin/sh -c "[ -e \"/dev/mapper/S5SSNF0T205183F_crypt\" ] || (echo -n my_local_secret && ${pkgs.wget}/bin/wget --retry-connrefused --ca-directory=/dev/null --ca-certificate=/etc/ssl/certs/r.zekjur.net.crt -qO - https://r.zekjur.net:8443/sdb2_crypt) | ${pkgs.cryptsetup}/bin/cryptsetup --key-file=- luksOpen /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0T205183F S5SSNF0T205183F_crypt"''

        ''/bin/sh -c "[ -e \"/dev/mapper/S5SSNJ0T205991B_crypt\" ] || (echo -n my_local_secret && ${pkgs.wget}/bin/wget --retry-connrefused --ca-directory=/dev/null --ca-certificate=/etc/ssl/certs/r.zekjur.net.crt -qO - https://r.zekjur.net:8443/sdc2_crypt) | ${pkgs.cryptsetup}/bin/cryptsetup --key-file=- luksOpen /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNJ0T205991B S5SSNJ0T205991B_crypt"''

        ''/bin/sh -c "${pkgs.lvm2}/bin/vgchange -ay"''
        ''/run/wrappers/bin/mount /dev/mapper/data-data /srv''
      ];
    };

  };
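
The two disk-specific ExecStart entries differ only in the disk serial number and in the key path on the remote server. As a possible refactoring (a sketch, not what the configuration above does), a small function could generate them. The name unlockCmd and its argument names are hypothetical:

```nix
{ pkgs, ... }:
let
  # Hypothetical helper (not from the original config): builds the unlock
  # command for one disk from its serial number and the remote key path.
  unlockCmd =
    { serial, keyPath }:
    ''/bin/sh -c "[ -e \"/dev/mapper/${serial}_crypt\" ] || (echo -n my_local_secret && ${pkgs.wget}/bin/wget --retry-connrefused --ca-directory=/dev/null --ca-certificate=/etc/ssl/certs/r.zekjur.net.crt -qO - https://r.zekjur.net:8443/${keyPath}) | ${pkgs.cryptsetup}/bin/cryptsetup --key-file=- luksOpen /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_${serial} ${serial}_crypt"'';
in
{
  # Only the two unlock commands are shown here.
  systemd.services.unlock.serviceConfig.ExecStart = map unlockCmd [
    { serial = "S5SSNF0T205183F"; keyPath = "sdb2_crypt"; }
    { serial = "S5SSNJ0T205991B"; keyPath = "sdc2_crypt"; }
  ];
}
```

In a real configuration, the generated entries would take the place of the two literal strings, while the ping loop and the vgchange call stay as they are.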

We’ll also need to store the custom TLS certificate file on disk. For that, we can use the environment.etc configuration:

  environment.etc."ssl/certs/r.zekjur.net.crt".text = ''
-----BEGIN CERTIFICATE-----
MIID8TCCAlmgAwIBAgIRAPWwvYWpoH+lGKv6rxZvC4MwDQYJKoZIhvcNAQELBQAw
[…]
-----END CERTIFICATE-----
'';

The references like ${pkgs.wget} will be replaced with a path to the Nix store (→ nix.dev documentation). On CoreOS/Flatcar, I was limited to using just the (minimal set of) software included in the base image, or I had to reach for Docker. On NixOS, we can use all packages available in nixpkgs.
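
As a minimal illustration of this interpolation, consider the following sketch. The unit name store-path-demo is made up for this example and is not part of my configuration:

```nix
{ pkgs, ... }:
{
  # Hypothetical demo unit, only to show how package references expand.
  systemd.services.store-path-demo = {
    description = "show how package references expand";
    serviceConfig = {
      Type = "oneshot";
      # Expands to something like /nix/store/<hash>-wget-<version>/bin/wget
      ExecStart = "${pkgs.wget}/bin/wget --version";
    };
  };
}
```

Because the string references pkgs.wget, the package becomes part of the system closure, so it does not need to be listed in environment.systemPackages for the unit to work.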

After deploying and rebooting, I can access my unlocked disk under /srv! 🎉

% df -h /srv
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/data-data   15T   14T  342G  98% /srv

When listing my files, I noticed that the group id was different between my old system and the new system. This can be fixed by explicitly specifying the desired group id:

  users.groups.michael = {
    gid = 1000;  # for consistency with storage3
  };

✅ M2 is complete.

M3. Set up Samba for access

Whereas I configured remote disk unlock at the systemd service level, for Samba I want to keep using Docker for now: I wanted to first transfer my old (working) Docker-based setups as they are, and only later convert them to Nix.

We enable the Docker NixOS module which sets up the daemons that Docker needs and whatever else is needed to make it work:

  virtualisation.docker.enable = true;

This is already sufficient for other services to use Docker, but I also want to be able to run the docker command interactively for debugging. Therefore, I added docker to systemPackages:

  environment.systemPackages = with pkgs; [
    git  # for checking out github.com/stapelberg/configfiles
    rsync
    zsh
    vim
    emacs
    wget
    curl
    docker
  ];

After deploying this configuration, I can run docker run -ti debian to verify things work.

The cloud-init version of samba looked like this:

coreos:
  units:
    - name: samba.service
      command: start
      content: |
        [Unit]
        Description=samba server
        After=docker.service unlock.mount
        Requires=docker.service unlock.mount

        [Service]
        Restart=always
        StartLimitInterval=0

        # Always pull the latest version (bleeding edge).
        ExecStartPre=-/usr/bin/docker pull stapelberg/docker-samba:latest

        # Set up samba users (cannot be done in the (public) Dockerfile because
        # users/passwords are sensitive information).
        ExecStartPre=-/usr/bin/docker kill smb
        ExecStartPre=-/usr/bin/docker rm smb
        ExecStartPre=-/usr/bin/docker rm smb-prep
        ExecStartPre=/usr/bin/docker run --name smb-prep stapelberg/docker-samba sh -c 'adduser --quiet --disabled-password --gecos "" --uid 29901 michael && sed -i "s,\\[global\\],[global]\\nserver multi channel support = yes\\naio read size = 1\\naio write size = 1,g" /etc/samba/smb.conf'
        ExecStartPre=/usr/bin/docker commit smb-prep smb-prepared
        ExecStartPre=/usr/bin/docker rm smb-prep
        ExecStartPre=/usr/bin/docker run --name smb-prep smb-prepared /bin/sh -c "echo \"secret\nsecret\n\" | tee - | smbpasswd -a -s michael"
        ExecStartPre=/usr/bin/docker commit smb-prep smb-prepared

        ExecStart=/usr/bin/docker run \
          -p 137:137 \
          -p 138:138 \
          -p 139:139 \
          -p 445:445 \
          --tmpfs=/run \
          -v /srv/data:/srv/data \
          --name smb \
          -t \
          smb-prepared \
            /usr/sbin/smbd --foreground --debug-stdout --no-process-group

We can translate this 1:1 to NixOS:

  systemd.services.samba = {
    wantedBy = [ "multi-user.target" ];
    description = "samba server";
    after = [ "unlock.service" ];
    requires = [ "unlock.service" ];
    serviceConfig = {
      Restart = "always";
      StartLimitInterval = 0;
      ExecStartPre = [
        # Always pull the latest version.
        ''-${pkgs.docker}/bin/docker pull stapelberg/docker-samba:latest''

        # Set up samba users (cannot be done in the (public) Dockerfile because
        # users/passwords are sensitive information).
        ''-${pkgs.docker}/bin/docker kill smb''
        ''-${pkgs.docker}/bin/docker rm smb''
        ''-${pkgs.docker}/bin/docker rm smb-prep''
        ''-${pkgs.docker}/bin/docker run --name smb-prep stapelberg/docker-samba sh -c 'adduser --quiet --disabled-password --gecos "" --uid 29901 michael && sed -i "s,\\[global\\],[global]\\nserver multi channel support = yes\\naio read size = 1\\naio write size = 1,g" /etc/samba/smb.conf' ''
        ''-${pkgs.docker}/bin/docker commit smb-prep smb-prepared''
        ''-${pkgs.docker}/bin/docker rm smb-prep''
        ''-${pkgs.docker}/bin/docker run --name smb-prep smb-prepared /bin/sh -c "echo \"secret\nsecret\n\" | tee - | smbpasswd -a -s michael"''
        ''-${pkgs.docker}/bin/docker commit smb-prep smb-prepared''
      ];

      ExecStart = ''-${pkgs.docker}/bin/docker run \
           -p 137:137 \
           -p 138:138 \
           -p 139:139 \
           -p 445:445 \
           --tmpfs=/run \
           -v /srv/data:/srv/data \
           --name smb \
           -t \
           smb-prepared \
             /usr/sbin/smbd --foreground --debug-stdout --no-process-group
             '';
    };
  };
}

✅ Now I can manage my files over the network, which completes M3!

M4. Set up SSH/rsync for backups

For backing up data, I use rsync over SSH. I restrict this SSH access to run only rsync commands by using rrsync (in a Docker container). To configure the SSH authorized_keys(5), we set:

  users.users.root.openssh.authorizedKeys.keys = [
    ''command="${pkgs.docker}/bin/docker run --log-driver none -i -e SSH_ORIGINAL_COMMAND -v /srv/backup/midna:/srv/backup/midna stapelberg/docker-rsync /srv/backup/midna" ssh-rsa AAAAB3Npublickey root@midna''
  ];

✅ A successful test backup run completes milestone M4!

Nice-to-haves

N1. Prometheus Node Exporter

I like to monitor all my machines with Prometheus (and Grafana). For network connectivity and authentication, I use the Tailscale mesh VPN.

To install Tailscale, I enable its NixOS module and make the tailscale command available:

  services.tailscale.enable = true;
  environment.systemPackages = with pkgs; [ tailscale ];

After deploying, I run sudo tailscale up and open the login link in my browser.

The Prometheus Node Exporter can also easily be enabled through its NixOS module:

  services.prometheus.exporters.node = {
    enable = true;
    listenAddress = "storage2.example.ts.net";
  };

However, this isn’t reliable yet: When Tailscale’s startup takes a while during system boot, the Node Exporter might burn through its entire restart budget when it cannot listen on the Tailscale IP address yet. We can enable indefinite restarts for the service to eventually come up:

  systemd.services."prometheus-node-exporter" = {
    startLimitIntervalSec = 0;
    serviceConfig = {
      Restart = "always";
      RestartSec = 1;
    };
  };

N2. Reliable mounting

While migrating my setup, I noticed that calling mount(8) from unlock.service directly is not reliable, and it’s better to let systemd manage the mounting:

  fileSystems."/srv" = {
    device = "/dev/mapper/data-data";
    fsType = "ext4";
    options = [
      "nofail"
      "x-systemd.requires=unlock.service"
    ];
  };

Afterwards, I could just remove the mount(8) call from unlock.service:

@@ -247,7 +247,10 @@ fry/U6A=
         ''/bin/sh -c "${pkgs.lvm2.bin}/bin/vgchange -ay"''
-        ''/run/wrappers/bin/mount /dev/mapper/data-data /srv''
+        # Let systemd mount /srv based on the fileSystems./srv
+        # declaration to prevent race conditions: mount
+        # might not succeed while the fsck is still in progress,
+        # for example, which otherwise makes unlock.service fail.
       ];
     };

In systemd services, I can now depend on the /srv mount unit:

  systemd.services.jellyfin = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };

N3. nginx-healthz

To save power, I turn off my NAS machines when they are not in use.

My backup orchestration uses Wake-on-LAN to start a wakeup and needs to wait until the NAS is fully booted up and has mounted its /srv mount before it can start backup jobs.

For this purpose, I have configured a web server (without any files) that depends on the /srv mount. So, once the web server responds to HTTP requests, we know /srv is mounted.

The cloud-init config looked as follows:

coreos:
  units:
    - name: healthz.service
      command: start
      content: |
        [Unit]
        Description=nginx for /srv health check
        Wants=network.target
        After=srv.mount
        Requires=srv.mount
        StartLimitInterval=0

        [Service]
        Restart=always
        ExecStartPre=/bin/sh -c 'systemctl is-active docker.service'
        ExecStartPre=/usr/bin/docker pull nginx:1
        ExecStartPre=-/usr/bin/docker kill nginx-healthz
        ExecStartPre=-/usr/bin/docker rm -f nginx-healthz
        ExecStart=/usr/bin/docker run \
            --name nginx-healthz \
            --publish 10.0.0.252:8200:80 \
            --log-driver=journald \
            nginx:1

The Docker version (ported from Flatcar Linux) looks like this:

  systemd.services.healthz = {
    description = "nginx for /srv health check";
    wants = [ "network.target" ];
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
    startLimitIntervalSec = 0;
    serviceConfig = {
      Restart = "always";
      ExecStartPre = [
        ''/bin/sh -c 'systemctl is-active docker.service' ''
        ''-${pkgs.docker}/bin/docker pull nginx:1''
        ''-${pkgs.docker}/bin/docker kill nginx-healthz''
        ''-${pkgs.docker}/bin/docker rm -f nginx-healthz''
      ];

      ExecStart = [
        ''-${pkgs.docker}/bin/docker run \
            --name nginx-healthz \
            --publish 10.0.0.252:8200:80 \
            --log-driver=journald \
            nginx:1
        ''
      ];
    };
  };

This configuration gets a lot simpler when migrating it from Docker to NixOS:

  # Signal readiness on HTTP port 8200 once /srv is mounted:
  networking.firewall.allowedTCPPorts = [ 8200 ];
  services.caddy = {
    enable = true;
    virtualHosts."http://10.0.0.252:8200".extraConfig = ''
      respond "ok"
    '';
  };
  systemd.services.caddy = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };
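
On the side of the backup orchestration, waiting for this readiness signal could look roughly like the following sketch. It is not my actual orchestration code: the script name wait-for-nas, the retry count and the sleep interval are assumptions chosen for illustration, and seq is assumed to be available in $PATH:

```nix
{ pkgs }:

# Hypothetical helper: polls the health endpoint until /srv is mounted.
pkgs.writeShellScriptBin "wait-for-nas" ''
  set -eu
  # Try for roughly five minutes (60 attempts, 5 seconds apart).
  for _ in $(seq 1 60); do
    if ${pkgs.curl}/bin/curl --silent --fail --max-time 5 http://10.0.0.252:8200/ >/dev/null; then
      exit 0
    fi
    sleep 5
  done
  echo "NAS did not become ready in time" >&2
  exit 1
''
```

The script exits with status 0 as soon as the web server answers, so a backup job can simply run it first and abort if it fails.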

N4. NixOS Jellyfin

The Docker version (ported from Flatcar Linux) looks like this:

  networking.firewall.allowedTCPPorts = [ 4414 8096 ];
  systemd.services.jellyfin = {
    wantedBy = [ "multi-user.target" ];
    description = "jellyfin";
    after = [ "docker.service" "srv.mount" ];
    requires = [ "docker.service" "srv.mount" ];
    startLimitIntervalSec = 0;
    serviceConfig = {
      Restart = "always";
      ExecStartPre = [
        ''-${pkgs.docker}/bin/docker pull lscr.io/linuxserver/jellyfin:latest''
        ''-${pkgs.docker}/bin/docker rm jellyfin''
      ];
      ExecStart = [
        ''-${pkgs.docker}/bin/docker run \
          --rm \
          --net=host \
          --name=jellyfin \
          -e TZ=Europe/Zurich \
          -v /srv/jellyfin/config:/config \
          -v /srv/data/movies:/data/movies:ro \
          -v /srv/data/series:/data/series:ro \
          -v /srv/data/mp3:/data/mp3:ro \
          lscr.io/linuxserver/jellyfin:latest
        ''
      ];
    };
  };

As before, when using jellyfin from NixOS, the configuration gets simpler:

  services.jellyfin = {
    enable = true;
    openFirewall = true;
  };
  systemd.services.jellyfin = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };

For a while, I had also set up compatibility symlinks that map the old location (/data/movies, inside the Docker container) to the new location (/srv/data/movies), but I encountered strange issues in Jellyfin and ended up just re-initializing my whole Jellyfin state. Because the required configuration grew by a few lines, I found it neat to move it into its own file, so here is how to do that:

Remove the lines above from configuration.nix and move them into jellyfin.nix:

{
  config,
  lib,
  pkgs,
  modulesPath,
  ...
}:

{
  services.jellyfin = {
    enable = true;
    openFirewall = true;
    dataDir = "/srv/jellyfin";
    cacheDir = "/srv/jellyfin/config/cache";
  };
  systemd.services.jellyfin = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };
}

Then, in configuration.nix, add jellyfin.nix to the imports:

   imports = [
     ./hardware-configuration.nix
     ./jellyfin.nix
   ];

N5. NixOS samba

To use Samba from NixOS, I replaced my systemd.services.samba config from M3 with this:

  services.samba = {
    enable = true;
    openFirewall = true;
    settings = {
      "global" = {
        "map to guest" = "bad user";
      };
      "data" = {
        "path" = "/srv/data";
        "comment" = "public data";
        "read only" = "no";
        "create mask" = "0775";
        "directory mask" = "0775";
        "guest ok" = "yes";
      };
    };
  };
  system.activationScripts.samba_user_create = ''
      smb_password="secret"
      echo -e "$smb_password\n$smb_password\n" | ${lib.getExe' pkgs.samba "smbpasswd"} -a -s michael
    '';

Note: Setting the samba password in the activation script works for small setups, but if you want to keep your samba passwords out of the Nix store, you’ll need to use a different approach. On a different machine, I use sops-nix to manage secrets and found that refactoring the smbpasswd call like so works reliably:

let
  setPasswords = pkgs.writeShellScript "samba-set-passwords" ''
    set -euo pipefail
    for user in michael; do
        smb_password="$(cat /run/secrets/samba_passwords/$user)"
        echo -e "$smb_password\n$smb_password\n" | ${lib.getExe' pkgs.samba "smbpasswd"} -a -s $user
    done
  '';
in
 {
  # …
  services.samba = {
    # …as above…
  };

  systemd.services.samba-smbd.serviceConfig.ExecStartPre = [
    "${setPasswords}"
  ];

  sops.secrets."samba_passwords/michael" = {
    restartUnits = [ "samba-smbd.service" ];
  };
}

I also noticed that NixOS does not create a group for each user by default, but I am used to managing my permissions like that. We can easily declare a group like so:

  users.groups.michael = {
    gid = 1000; # for consistency with storage3
  };
  users.users.michael = {
    extraGroups = [
      "wheel" # Enable ‘sudo’ for the user.
      "docker"
      # By default, NixOS does not add users to their own group:
      # https://github.com/NixOS/nixpkgs/issues/198296
      "michael"
    ];
  };

N6. NixOS rrsync

The Docker version (ported from Flatcar Linux) looks like this:

  users.users.root.openssh.authorizedKeys.keys = [
    ''command="${pkgs.docker}/bin/docker run --log-driver none -i -e SSH_ORIGINAL_COMMAND -v /srv/backup/midna:/srv/backup/midna stapelberg/docker-rsync /srv/backup/midna" ssh-rsa AAAAB3Npublickey root@midna''
  ];

To use rrsync from NixOS, I changed the configuration like so:

  users.users.root.openssh.authorizedKeys.keys = [
    ''command="${pkgs.rrsync}/bin/rrsync /srv/backup/midna" ssh-rsa AAAAB3Npublickey root@midna''
  ];

N7. sync.pl script

The Docker version (ported from Flatcar Linux) looks like this:

  users.users.root.openssh.authorizedKeys.keys = [
    ''command="${pkgs.docker}/bin/docker run --log-driver none -i -e SSH_ORIGINAL_COMMAND -v /srv/data:/srv/data -v /root/.ssh:/root/.ssh:ro -v /etc/ssh:/etc/ssh:ro -v /etc/static/ssh:/etc/static/ssh:ro -v /nix/store:/nix/store:ro stapelberg/docker-sync",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3Npublickey sync@dr''
  ];

I wanted to stop managing the following Dockerfile to ship sync.pl:

FROM debian:stable

# Install full perl for Data::Dumper
RUN apt-get update \
    && apt-get install -y rsync ssh perl

ADD sync.pl /usr/bin/

ENTRYPOINT ["/usr/bin/sync.pl"]

To get rid of the Docker container, I translated the sync.pl file into a Nix expression that writes the sync.pl Perl script to the Nix store:

{ pkgs }:

# For string literal escaping rules (''${), see:
# https://nix.dev/manual/nix/2.26/language/string-literals#string-literals

# For writers.writePerlBin, see https://wiki.nixos.org/wiki/Nix-writers

pkgs.writers.writePerlBin "syncpl" { libraries = []; } ''
# This script is run via ssh from dornröschen.
use strict;
use warnings;
use Data::Dumper;

if (my ($destination) = ($ENV{SSH_ORIGINAL_COMMAND} =~ /^([a-z0-9.]+)$/)) {
    print STDERR "rsync version: " . `${pkgs.rsync}/bin/rsync --version` . "\n\n";
    my @rsync = (
        "${pkgs.rsync}/bin/rsync",
        "-e",
        "ssh",
        "--max-delete=-1",
        "--verbose",
        "--stats",
        # Intentionally not setting -X for my data sync,
        # where there are no full system backups; mostly media files.
        "-ax",
        "--ignore-existing",
        "--omit-dir-times",
        "/srv/data/",
        "''${destination}:/",
    );
    print STDERR "running: " . Dumper(\@rsync) . "\n";
    exec @rsync;
} else {
    print STDERR "Could not parse SSH_ORIGINAL_COMMAND.\n";
}
''

I can then reference this file by importing it in my configuration.nix and pointing it to the pkgs expression of my NixOS configuration:

{ modulesPath, lib, pkgs, ... }:

let
  syncpl = import ./syncpl.nix { pkgs = pkgs; };
in {
  imports = [ ./hardware-configuration.nix ];

  users.users.root.openssh.authorizedKeys.keys = [
    ''command="${syncpl}/bin/syncpl",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3Npublickey sync@dr''
  ];

  # For interactive usage (when debugging):
  environment.systemPackages = [ syncpl ];

  # …
}

This works, but is it the best approach? Here are some thoughts:

By managing this script in a Nix expression, I can no longer use my editor’s Perl support.
I could probably also keep sync.pl as a separate file and use string interpolation in my Nix expression to inject an absolute path to the rsync binary into the script.
Another alternative would be to add a wrapper script to my Nix expression which ensures that $PATH contains rsync and then the script wouldn’t need an absolute path anymore.
For small glue scripts like this one, I consider it easier to manage the contents “inline” in the Nix expression, because it means one fewer file in my config directory.
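
For comparison, the wrapper alternative could be sketched like this. It assumes that sync.pl is kept as a separate file next to the Nix expression and that it calls plain rsync (and ssh) via $PATH instead of using absolute store paths. I have not deployed this variant:

```nix
{ pkgs }:

# Sketch of the wrapper alternative: sync.pl stays a regular Perl file,
# and the wrapper puts the tools it needs into $PATH.
pkgs.writeShellApplication {
  name = "syncpl";
  runtimeInputs = [
    pkgs.perl
    pkgs.rsync
    pkgs.openssh
  ];
  text = ''
    exec perl ${./sync.pl} "$@"
  '';
}
```

With this approach, the editor’s Perl support keeps working for sync.pl, at the cost of one more file in the config directory.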

N8. Sharing configs

I want to configure all my NixOS systems such that my user settings are identical everywhere.

To achieve that, I can extract parts of my configuration.nix into a user-settings.nix and then declare an accompanying flake.nix that provides this expression as an output.
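
As a rough idea of what the flake.nix of such a shared repository might contain, here is a minimal sketch. The output name lib.userSettings matches how the module is referenced further down, but the file layout is an assumption and the published repository may be structured differently:

```nix
{
  description = "Shared NixOS settings (hypothetical sketch)";

  outputs =
    { self }:
    {
      # user-settings.nix is an ordinary NixOS module. It is exposed here
      # so that other flakes can list it under `modules`.
      lib.userSettings = import ./user-settings.nix;
    };
}
```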

After publishing these files in a git repository, I can reference said repository in my flake.nix:

{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-25.05";
    stapelbergnix.url = "github:stapelberg/nix";
  };

  outputs =
    {
      self,
      nixpkgs,
      stapelbergnix,
    }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = false;
      };
    in
    {
      nixosConfigurations.storage2 = nixpkgs.lib.nixosSystem {
        inherit system;
        inherit pkgs;
        modules = [
          ./configuration.nix
          stapelbergnix.lib.userSettings
          # Not on this machine; We have our own networking config:
          # stapelbergnix.lib.systemdNetwork
          # Use systemd-boot as bootloader
          stapelbergnix.lib.systemdBoot
        ];
      };
      formatter.${system} = pkgs.nixfmt-tree;
    };
}

Everything declared in the user-settings.nix can now be removed from configuration.nix!

N9. Trying immich!

One of the motivating reasons for switching away from CoreOS/Flatcar was that I couldn’t try Immich, so let’s give it a shot on NixOS:

  services.immich = {
    enable = true;
    host = "10.0.0.252";
    port = 2283;
    openFirewall = true;
    mediaLocation = "/srv/immich";
  };

  # Because /srv is a separate file system, we need to declare:
  systemd.services."immich-server" = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };
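
Since the same two lines now appear for several services, one way to avoid the repetition is to generate the attribute set. This is only a sketch: the name needsSrv is made up, and the list assumes the jellyfin, caddy and immich-server units from the sections above:

```nix
{ lib, ... }:
let
  # Hypothetical shared snippet: start the unit only once /srv is mounted.
  needsSrv = {
    unitConfig.RequiresMountsFor = [ "/srv" ];
    wantedBy = [ "srv.mount" ];
  };
in
{
  systemd.services = lib.genAttrs [
    "jellyfin"
    "caddy"
    "immich-server"
  ] (_name: needsSrv);
}
```

The NixOS module system merges these definitions with the ones that the respective service modules already provide.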

Conclusion

You can find the full configuration directory on GitHub.

I am pretty happy with this NixOS setup! Previously (with CoreOS/Flatcar), I could declaratively manage my base system, but had to manage tons of Docker containers in addition. With NixOS, I can declaratively manage everything (or as much as makes sense).

Custom configuration like my SSH+rsync-based backup infrastructure can be expressed cleanly, in one place, and structured at the desired level of abstraction/reuse.

If you’re considering managing at least one other system with NixOS, I would recommend it! One of my follow-up projects is to convert storage3 (my other NAS build) from Ubuntu Server to NixOS as well to cut down on manual management. Being able to just copy the entire config to set up another system, or try out an idea in a throwaway VM, is just such a nice workflow 🥰

…but if you have just a single system to manage, probably all of this is too complicated.

For many years, I ran a small Nextcloud instance in my home network. Overall I was happy with it, but over the last two years there were recurring problems with the Notes plugin, a feature that is very important to me.

A year ago, I remembered that back when “cloud” was still foreign to marketing speak (and everything else was better, too), I used gobby for this purpose. The client is a simple GTK-based text editor, which is available for Linux, BSD, macOS and Windows. The server is a slim daemon that does not need any special privileges as long as ports > 1024 are used. For a long time, the most important feature was that multiple clients can work on a file at the same time, with a chat to go along with it. Clients are assigned colors with which their input is highlighted, which made it transparent who made which changes to the text.

The server is configured in the home directory of the user running it, in ${HOME}/.config/infinoted.conf, or the configuration file can be specified via the --config-file argument. Here is the simple configuration I use in my home network, which saves the files (if changed) every 30 seconds and regularly backs them up as plain-text files in a parallel directory structure.

[infinoted]
key-file=/home/Benutzer/infinoted-key.pem
certificate-file=/home/Benutzer/infinoted-cert.pem
security-policy=require-tls
root-directory=/home/Benutzer/gobby
password=DasPlaintextPasswortKommtHierHin.
plugins=note-text;autosave;directory-sync

[autosave]
interval=30

[directory-sync]
directory=/home/Benutzer/gobby-plaintext-sync
interval=300

If you do not want to use the server on your own, you can set up user & password or certificate-based authentication instead of one central password. If the daemon should also be reachable from the internet, you of course need to extend the configuration (code) and also restrict the daemon and the access to the data more tightly at the OS level.

Managing the daemon via a systemd unit is uncomplicated. If you want to run multiple instances on one server, you can simply work with templates, provided you use a suitable naming scheme for the configuration files.

[Unit]
Description=Local infinoted for gobby texteditor.
After=network-online.target

[Service]
KillMode=process
User=Benutzer
Group=Benutzer
ExecStart=/usr/bin/infinoted-0.7

[Install]
WantedBy=multi-user.target

For one of my network storage PC builds, I was looking for an alternative to Flatcar Container Linux and tried out NixOS again (after an almost 10 year break). There are many ways to install NixOS, and in this article I will outline how I like to install NixOS on physical hardware or virtual machines: over the network and fully declaratively.

Introduction: Declarative?

The term declarative means that you describe what should be accomplished, not how. For NixOS, that means you declare what software you want your system to include (add to config option environment.systemPackages, or enable a module) instead of, say, running apt install.

A nice property of the declarative approach is that your system follows your configuration, so by reverting a configuration change, you can cleanly revert the change to the system as well.
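
As a tiny illustration of the declarative style, a configuration fragment like the following sketch states that the GNU hello package and an SSH server should be part of the system. The choice of hello is arbitrary and only serves as an example:

```nix
{ pkgs, ... }:
{
  # Declare what the system should contain...
  environment.systemPackages = [ pkgs.hello ];

  # ...and which services should run.
  services.openssh.enable = true;
}
```

Removing one of these lines and deploying again removes the corresponding package or service from the system.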

I like to manage declarative configuration files under version control, typically with Git.

When I originally set up my current network storage build, I chose CoreOS (later Flatcar Container Linux) because it was an auto-updating base system with a declarative cloud-init config.

Ways of installing NixOS

Graphical Installer: Only for Desktops

The NixOS manual’s “Installation” section describes a graphical installer (“for desktop users”, based on the Calamares system installer and added in 2022) and a manual installer.

With the graphical installer, it’s easy to install NixOS to disk: just confirm the defaults often enough and you’ll end up with a working system. But there are some downsides:

You need to manually enable SSH after the installation — locally, not via the network.
The graphical installer generates an initial NixOS configuration for you, but there is no way to inject your own initial NixOS configuration.

The graphical installer is clearly not meant for remote installation or automated installation.

Manual Installation

The manual installer on the other hand is too manual for my taste: expand “Example 2” and “Example 3” in the NixOS manual’s Installation summary section to get an impression. To be clear, the steps are very doable, but I don’t want to install a system this way in a hurry. For one, manual procedures are prone to mistakes under stress. And also, copy & pasting commands interactively is literally the opposite of writing declarative configuration files.

Network Installation: nixos-anywhere

Ideally, I would want to perform most of the installation from the comfort of my own PC, meaning the installer must be usable over the network. Also, I want the machine to come up with a working initial NixOS configuration immediately after installation (no manual steps!).

Luckily, there is a (community-provided) solution: nixos-anywhere. You take care of booting a NixOS installer, then run a single command and nixos-anywhere will SSH into that installer, partition your disk(s) and install NixOS to disk. Notably, nixos-anywhere is configured declaratively, so you can repeat this step any time.

(I know that nixos-anywhere can even SSH into arbitrary systems and kexec-reboot them into a NixOS installer, which is certainly a cool party trick, but I like the approach of explicitly booting an installer better as it seems less risky and more generally applicable/repeatable to me.)

Setup: Installing Nix

I want to use NixOS for one of my machines, but not (currently) on my main desktop PC.

Hence, I installed only the nix tool (for building, even without running NixOS) on Arch Linux:

% sudo pacman -S nix
% sudo groupadd -r nixbld
% for n in $(seq 1 24); do sudo useradd -c "Nix build user $n" \
    -d /var/empty -g nixbld -G nixbld -M -N -r -s "$(which nologin)" \
    nixbld$n; done

Now, running nix-shell -p hello should drop you in a new shell in which the GNU hello package is installed:

% export NIX_PATH=nixpkgs=channel:nixos-25.05
% nix-shell -p hello
hello

[nix-shell:/tmp]$ hello
Hello, world!

By the way, the Nix page on the Arch Linux wiki explains how to use nix to install packages, but that’s not what I am interested in: I only want to remotely manage NixOS systems.

Building your own installer

Previously, I said “you take care of booting a NixOS installer”, and that’s easy enough: write the ISO image to a USB stick and boot your machine from it (or select the ISO and boot your VM).

But before we can log in remotely via SSH, we need to manually set a password. I also need to SSH with the TERM=xterm environment variable because the termcap file of rxvt-unicode (my preferred terminal) is not included in the default NixOS installer environment. Similarly, my configured locales do not work and my preferred shell (Zsh) is not available.

Wouldn’t it be much nicer if the installer was pre-configured with a convenient environment?

With other Linux distributions, like Debian, Fedora or Arch Linux, I wouldn’t attempt to re-build an official installer ISO image. I’m sure their processes and tooling work well, but I am also sure it’s one extra thing I would need to learn, debug and maintain.

But building a NixOS installer is very similar to configuring a regular NixOS system: same configuration, same build tool. The procedure is documented in the official NixOS wiki.

I copied the customizations I would typically put into configuration.nix, imported the installation-cd-minimal.nix module from nixpkgs and put the result in the iso.nix file:

{ config, pkgs, ... }:

{
  imports = [
    <nixpkgs/nixos/modules/installer/cd-dvd/installation-cd-minimal.nix>
    <nixpkgs/nixos/modules/installer/cd-dvd/channel.nix>
  ];

  i18n.supportedLocales = [
    "en_DK.UTF-8/UTF-8"
    "de_DE.UTF-8/UTF-8"
    "de_CH.UTF-8/UTF-8"
    "en_US.UTF-8/UTF-8"
  ];
  i18n.defaultLocale = "en_US.UTF-8";

  security.sudo.wheelNeedsPassword = false;
  users.users.michael = {
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5secret"
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5key"
    ];

    isNormalUser = true;
    description = "Michael Stapelberg";
    extraGroups = [ "wheel" ];
    initialPassword = "SGZ3odMZIesxTuh2Y2pUaJA";  # random for this post
    shell = pkgs.zsh;
    packages = with pkgs; [];
  };

  environment.systemPackages = with pkgs; [
    git  # for checking out github.com/stapelberg/configfiles
    rsync
    zsh
    vim
    emacs
    wget
    curl
    rxvt-unicode  # for terminfo
    lshw
  ];

  programs.zsh.enable = true;
  services.openssh.enable = true;

  # This value determines the NixOS release from which the default
  # settings for stateful data, like file locations and database versions
  # on your system were taken. It‘s perfectly fine and recommended to leave
  # this value at the release version of the first install of this system.
  # Before changing this value read the documentation for this option
  # (e.g. man configuration.nix or on https://nixos.org/nixos/options.html).
  system.stateVersion = "25.05"; # Did you read the comment?
}

To build the ISO image, I set the NIX_PATH environment variable to point nix-build(1) to the iso.nix file and to select the upstream channel for NixOS 25.05:

% export NIX_PATH=nixos-config=$PWD/iso.nix:nixpkgs=channel:nixos-25.05
% nix-build '<nixpkgs/nixos>' -A config.system.build.isoImage

After about 1.5 minutes on my 2025 high-end Linux PC, the installer ISO can be found in result/iso/nixos-minimal-25.05.802216.55d1f923c480-x86_64-linux.iso (1.46 GB in size in my case).
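Writing the resulting ISO to a USB stick could look like the following sketch. The write_iso helper is hypothetical, /dev/sdX is a placeholder for your actual USB device, and dd overwrites whatever device you pass, so double-check the device name first:

```shell
# Write an installer ISO to a USB stick.
# DESTRUCTIVE: everything on the target device is overwritten.
write_iso() {
  iso="$1"
  dev="$2"
  # Refuse to run unless the ISO exists and the target is a block device.
  [ -f "$iso" ] || { echo "no such ISO: $iso" >&2; return 1; }
  [ -b "$dev" ] || { echo "not a block device: $dev" >&2; return 1; }
  sudo dd if="$iso" of="$dev" bs=4M conv=fsync status=progress
}

# Usage (replace /dev/sdX with your USB stick):
#   write_iso result/iso/nixos-minimal-*.iso /dev/sdX
```

The two checks are only there to catch typos such as passing a regular file as the target.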

Enabling Nix Flakes

Unfortunately, the nix project has not yet managed to enable the “experimental” new command-line interface (CLI) by default, despite it being available for 5+ years, so we need to create a config file and enable the modern nix-command interface:

% mkdir -p ~/.config/nix
% echo 'experimental-features = nix-command flakes' >> ~/.config/nix/nix.conf
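Because the echo command above appends to nix.conf, running it twice leaves a duplicate line. If you script this step, a guarded variant might look like the following sketch. The enable_flakes helper is my own naming, and it only guards against an exact duplicate of the line:

```shell
# Append the experimental-features line to nix.conf unless the exact
# same line is already present.
enable_flakes() {
  conf="${1:-$HOME/.config/nix/nix.conf}"
  line='experimental-features = nix-command flakes'
  mkdir -p "$(dirname "$conf")"
  touch "$conf"
  grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Usage:
#   enable_flakes                      # uses ~/.config/nix/nix.conf
#   enable_flakes /path/to/nix.conf    # or an explicit file
```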

How can you tell old from new? The old commands are hyphenated (nix-build), the new ones are separated by a blank space (nix build).

You’ll notice I also enabled Nix flakes, which I use so that my nix builds are hermetic and pinned to a certain revision of nixpkgs and any other nix modules I want to include in my build. I like to compare flakes to version lock files in other programming environments: the idea is that building the system in 5 months will yield the same result as it does today.

To verify that flakes work, run nix shell (not nix-shell):

% nix shell nixpkgs#hello
/tmp 2 % hello
Hello, world!

(Re-)Installation Steps

For reference, when I create a new VM for NixOS in Proxmox, the most important setting is bios=ovmf (= UEFI boot, which is not the default), so that I can use the same boot loader configuration on physical machines as in VMs.

Before we can boot our (unsigned) installer, we need to enter the UEFI setup and disable Secure Boot. Note that Proxmox, for example, enables Secure Boot by default.

Then, boot the custom installer ISO on the target system, and ensure ssh michael@nixos.lan works without prompting for a password.
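One way to check the “without prompting for a password” part non-interactively is OpenSSH’s BatchMode option, which makes ssh fail instead of asking for a password. A minimal sketch, with check_installer_ssh being a hypothetical helper name:

```shell
# Succeeds only if key-based login works; with BatchMode=yes, ssh never
# falls back to an interactive password prompt.
check_installer_ssh() {
  ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage:
#   check_installer_ssh michael@nixos.lan && echo "key login works"
```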

Declare a flake.nix with the following content:

{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-25.05";

    disko.url = "github:nix-community/disko";
    # Use the same version as nixpkgs
    disko.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs =
    {
      nixpkgs,
      disko,
      ...
    }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = false;
      };
    in
    {
      nixosConfigurations.zammadn = nixpkgs.lib.nixosSystem {
        inherit system;
        inherit pkgs;
        modules = [
          disko.nixosModules.disko
          ./configuration.nix
        ];
      };
      formatter.${system} = pkgs.nixfmt-tree;
    };
}

Declare your disk config in disk-config.nix:

{ lib, ... }:

{
  disko.devices = {
    disk = {
      main = {
        device = lib.mkDefault "/dev/sda";
        type = "disk";
        content = {
          type = "gpt";
          partitions = {
            ESP = {
              type = "EF00";
              size = "500M";
              content = {
                type = "filesystem";
                format = "vfat";
                mountpoint = "/boot";
                mountOptions = [ "umask=0077" ];
              };
            };
            root = {
              size = "100%";
              content = {
                type = "filesystem";
                format = "ext4";
                mountpoint = "/";
              };
            };
          };
        };
      };
    };
  };
}
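Since disko will wipe the configured device, it seems worth confirming that /dev/sda really is the disk you mean before installing. The following sketch lists the whole disks as seen by the booted installer; list_target_disks is a hypothetical helper, and it assumes the SSH access to the installer described above:

```shell
# List whole disks (no partitions) on the remote installer, so that the
# device name in disk-config.nix can be compared against reality.
list_target_disks() {
  ssh "$1" lsblk -d -o NAME,SIZE,MODEL,TYPE
}

# Usage:
#   list_target_disks michael@nixos.lan
```

If the target uses a different name (an NVMe disk, for instance), adjust the device attribute in disk-config.nix accordingly.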

Declare your desired NixOS config in configuration.nix:

{ modulesPath, lib, pkgs, ... }:

{
  imports =
    [
      (modulesPath + "/installer/scan/not-detected.nix")
      ./hardware-configuration.nix
      ./disk-config.nix
    ];

  # Adding michael as trusted user means
  # we can upgrade the system via SSH (see Makefile).
  nix.settings.trusted-users = [ "michael" "root" ];
  # Clean the Nix store every week.
  nix.gc = {
    automatic = true;
    dates = "weekly";
    options = "--delete-older-than 7d";
  };

  boot.loader.systemd-boot = {
    enable = true;
    configurationLimit = 10;
  };
  boot.loader.efi.canTouchEfiVariables = true;

  networking.hostName = "zammadn";
  time.timeZone = "Europe/Zurich";

  # Use systemd for networking
  services.resolved.enable = true;
  networking.useDHCP = false;
  systemd.network.enable = true;

  systemd.network.networks."10-e" = {
    matchConfig.Name = "e*";  # enp9s0 (10G) or enp8s0 (1G)
    networkConfig = {
      IPv6AcceptRA = true;
      DHCP = "yes";
    };
  };

  i18n.supportedLocales = [
    "en_DK.UTF-8/UTF-8"
    "de_DE.UTF-8/UTF-8"
    "de_CH.UTF-8/UTF-8"
    "en_US.UTF-8/UTF-8"
  ];
  i18n.defaultLocale = "en_US.UTF-8";

  users.mutableUsers = false;
  security.sudo.wheelNeedsPassword = false;
  users.users.michael = {
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5secret"
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5key"
    ];

    isNormalUser = true;
    description = "Michael Stapelberg";
    extraGroups = [ "networkmanager" "wheel" ];
    initialPassword = "install";  # TODO: change!
    shell = pkgs.zsh;
    packages = with pkgs; [];
  };

  environment.systemPackages = with pkgs; [
    git  # for checking out github.com/stapelberg/configfiles
    rsync
    zsh
    vim
    emacs
    wget
    curl
  ];

  programs.zsh.enable = true;

  services.openssh.enable = true;

  # This value determines the NixOS release from which the default
  # settings for stateful data, like file locations and database versions
  # on your system were taken. It‘s perfectly fine and recommended to leave
  # this value at the release version of the first install of this system.
  # Before changing this value read the documentation for this option
  # (e.g. man configuration.nix or on https://nixos.org/nixos/options.html).
  system.stateVersion = "25.05"; # Did you read the comment?
}

…and lock the flake:

% nix flake lock
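To see which revisions ended up pinned, nix flake metadata prints the locked inputs. For a quick look without nix, the following sketch pulls the rev fields straight out of flake.lock. It assumes the pretty-printed JSON layout that nix writes, and the locked_revs helper is my own naming:

```shell
# Print every pinned git revision recorded in a flake.lock file.
# Assumes lines of the form:   "rev": "<hex digits>",
locked_revs() {
  grep -o '"rev": "[0-9a-f]*"' "${1:-flake.lock}" | cut -d'"' -f4
}

# Usage:
#   locked_revs            # reads ./flake.lock
#   locked_revs some/other/flake.lock
```

This prints one revision per input (here: nixpkgs and disko), but not which input each revision belongs to.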

Using nixos-anywhere, fetch the hardware-configuration.nix from the installer and install NixOS to disk:

% nix run github:nix-community/nixos-anywhere -- \
  --flake .#zammadn \
  --generate-hardware-config nixos-generate-config ./hardware-configuration.nix \
  --target-host michael@nixos.lan

After about one minute, my VM was installed and rebooted!

Tip: Last month, I had to temporarily pin to the latest released version (1.9.0) because of issue nixos-anywhere#510 like so:

% nix run github:nix-community/nixos-anywhere/1.9.0 -- \
  […same as above…]

Full nixos-anywhere installation transcript, if you’re curious

% nix run github:nix-community/nixos-anywhere -- \
  --flake .#wiki \
  --generate-hardware-config nixos-generate-config ./hardware-configuration.nix \
  --target-host michael@10.25.0.87
Warning: Identity file /tmp/tmp.BT4E7i6eqJ/nixos-anywhere not accessible: No such file or directory.
### Uploading install SSH keys ###
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/tmp/tmp.BT4E7i6eqJ/nixos-anywhere.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.

Number of key(s) added: 1

Now try logging into the machine, with: "ssh -i /tmp/tmp.BT4E7i6eqJ/nixos-anywhere -o 'IdentitiesOnly=no' -o 'ConnectTimeout=10' -o 'IdentitiesOnly=yes' -o 'UserKnownHostsFile=/dev/null' -o 'StrictHostKeyChecking=no' 'michael@10.25.0.87'"
and check to make sure that only the key(s) you wanted were added.

### Gathering machine facts ###
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
### Generating hardware-configuration.nix using nixos-generate-config ###
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
~/machines/wiki ~/machines/wiki
~/machines/wiki
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
warning: Git tree '/home/michael/machines' is dirty
warning: Git tree '/home/michael/machines' is dirty
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
Connection to 10.25.0.87 closed.
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
### Formatting hard drive with disko ###
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
umount: /mnt: not mounted
++ realpath /dev/sda
+ disk=/dev/sda
+ lsblk -a -f
NAME   FSTYPE   FSVER            LABEL                      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0  squashfs 4.0                                                                                    0   100% /nix/.ro-store
loop1                                                                                                           
loop2                                                                                                           
loop3                                                                                                           
loop4                                                                                                           
loop5                                                                                                           
loop6                                                                                                           
loop7                                                                                                           
sda                                                                                                             
├─sda1 vfat     FAT16                                       83DA-E750                                           
└─sda2 ext4     1.0                                         b136d6fd-d060-4b61-90fb-a8c1f9492f6e                
sr0    iso9660  Joliet Extension nixos-minimal-25.05-x86_64 1980-01-01-00-00-00-00                     0   100% /iso
+ lsblk --output-all --json
++ dirname /nix/store/fpwn44vygjj6bfn8s1jj9p8yh6jhfxni-disk-deactivate/disk-deactivate
+ bash -x
+ jq -r -f /nix/store/fpwn44vygjj6bfn8s1jj9p8yh6jhfxni-disk-deactivate/zfs-swap-deactivate.jq
+ lsblk --output-all --json
+ bash -x
++ dirname /nix/store/fpwn44vygjj6bfn8s1jj9p8yh6jhfxni-disk-deactivate/disk-deactivate
+ jq -r --arg disk_to_clear /dev/sda -f /nix/store/fpwn44vygjj6bfn8s1jj9p8yh6jhfxni-disk-deactivate/disk-deactivate.jq
+ set -fu
+ wipefs --all -f /dev/sda1
/dev/sda1: 8 bytes were erased at offset 0x00000036 (vfat): 46 41 54 31 36 20 20 20
/dev/sda1: 1 byte was erased at offset 0x00000000 (vfat): eb
/dev/sda1: 2 bytes were erased at offset 0x000001fe (vfat): 55 aa
+ wipefs --all -f /dev/sda2
/dev/sda2: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
++ type zdb
++ zdb -l /dev/sda
++ sed -nr 's/ +name: '\''(.*)'\''/\1/p'
+ zpool=
+ [[ -n '' ]]
+ unset zpool
++ lsblk /dev/sda -l -p -o type,name
++ awk 'match($1,"raid.*") {print $2}'
+ md_dev=
+ [[ -n '' ]]
+ wipefs --all -f /dev/sda
/dev/sda: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 8 bytes were erased at offset 0xc7ffffe00 (gpt): 45 46 49 20 50 41 52 54
/dev/sda: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
+ dd if=/dev/zero of=/dev/sda bs=440 count=1
1+0 records in
1+0 records out
440 bytes copied, 0.000306454 s, 1.4 MB/s
+ lsblk -a -f
NAME  FSTYPE   FSVER            LABEL                      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0 squashfs 4.0                                                                                    0   100% /nix/.ro-store
loop1                                                                                                          
loop2                                                                                                          
loop3                                                                                                          
loop4                                                                                                          
loop5                                                                                                          
loop6                                                                                                          
loop7                                                                                                          
sda                                                                                                            
sr0   iso9660  Joliet Extension nixos-minimal-25.05-x86_64 1980-01-01-00-00-00-00                     0   100% /iso
++ mktemp -d
+ disko_devices_dir=/tmp/tmp.YvWbz8ZKHk
+ trap 'rm -rf "$disko_devices_dir"' EXIT
+ mkdir -p /tmp/tmp.YvWbz8ZKHk
+ destroy=1
+ device=/dev/sda
+ imageName=main
+ imageSize=2G
+ name=main
+ type=disk
+ device=/dev/sda
+ efiGptPartitionFirst=1
+ type=gpt
+ blkid /dev/sda
+ sgdisk --clear /dev/sda
Creating new GPT entries in memory.
The operation has completed successfully.
+ sgdisk --align-end --new=1:0:+500M --partition-guid=1:R --change-name=1:disk-main-ESP --typecode=1:EF00 /dev/sda
The operation has completed successfully.
+ partprobe /dev/sda
+ udevadm trigger --subsystem-match=block
+ udevadm settle --timeout 120
+ sgdisk --align-end --new=2:0:-0 --partition-guid=2:R --change-name=2:disk-main-root --typecode=2:8300 /dev/sda
The operation has completed successfully.
+ partprobe /dev/sda
+ udevadm trigger --subsystem-match=block
+ udevadm settle --timeout 120
+ device=/dev/disk/by-partlabel/disk-main-ESP
+ extraArgs=()
+ declare -a extraArgs
+ format=vfat
+ mountOptions=('umask=0077')
+ declare -a mountOptions
+ mountpoint=/boot
+ type=filesystem
+ blkid /dev/disk/by-partlabel/disk-main-ESP
+ grep -q TYPE=
+ mkfs.vfat /dev/disk/by-partlabel/disk-main-ESP
mkfs.fat 4.2 (2021-01-31)
+ device=/dev/disk/by-partlabel/disk-main-root
+ extraArgs=()
+ declare -a extraArgs
+ format=ext4
+ mountOptions=('defaults')
+ declare -a mountOptions
+ mountpoint=/
+ type=filesystem
+ blkid /dev/disk/by-partlabel/disk-main-root
+ grep -q TYPE=
+ mkfs.ext4 /dev/disk/by-partlabel/disk-main-root
mke2fs 1.47.2 (1-Jan-2025)
Discarding device blocks: done                            
Creating filesystem with 12978688 4k blocks and 3245872 inodes
Filesystem UUID: 57975635-9165-4895-93ea-72053294a185
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (65536 blocks): done
Writing superblocks and filesystem accounting information: done   

+ set -efux
+ destroy=1
+ device=/dev/sda
+ imageName=main
+ imageSize=2G
+ name=main
+ type=disk
+ device=/dev/sda
+ efiGptPartitionFirst=1
+ type=gpt
+ destroy=1
+ device=/dev/sda
+ imageName=main
+ imageSize=2G
+ name=main
+ type=disk
+ device=/dev/sda
+ efiGptPartitionFirst=1
+ type=gpt
+ device=/dev/disk/by-partlabel/disk-main-root
+ extraArgs=()
+ declare -a extraArgs
+ format=ext4
+ mountOptions=('defaults')
+ declare -a mountOptions
+ mountpoint=/
+ type=filesystem
+ findmnt /dev/disk/by-partlabel/disk-main-root /mnt/
+ mount /dev/disk/by-partlabel/disk-main-root /mnt/ -t ext4 -o defaults -o X-mount.mkdir
+ destroy=1
+ device=/dev/sda
+ imageName=main
+ imageSize=2G
+ name=main
+ type=disk
+ device=/dev/sda
+ efiGptPartitionFirst=1
+ type=gpt
+ device=/dev/disk/by-partlabel/disk-main-ESP
+ extraArgs=()
+ declare -a extraArgs
+ format=vfat
+ mountOptions=('umask=0077')
+ declare -a mountOptions
+ mountpoint=/boot
+ type=filesystem
+ findmnt /dev/disk/by-partlabel/disk-main-ESP /mnt/boot
+ mount /dev/disk/by-partlabel/disk-main-ESP /mnt/boot -t vfat -o umask=0077 -o X-mount.mkdir
+ rm -rf /tmp/tmp.YvWbz8ZKHk
Connection to 10.25.0.87 closed.
### Uploading the system closure ###
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
copying path '/nix/store/k64q0bbrf8kxvcx1zlvhphcshzqn2xg6-acl-2.3.2-man' from 'https://cache.nixos.org'...
copying path '/nix/store/3rnsaxgfam1df8zx6lgcjbzrxhcg1ibg-acl-2.3.2-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/ircpdw4nslfzmlpds59pn9qlak8gn81r-attr-2.5.2-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/mrxc0jlwhw95lgzphd78s6w33whhkfql-attr-2.5.2-man' from 'https://cache.nixos.org'...
copying path '/nix/store/qm7ybllh3nrg3sfllh7n2f6llrwbal58-bash-completion-2.16.0' from 'https://cache.nixos.org'...
copying path '/nix/store/3frg3li12mwq7g4fpmgkjv43x5bqad7d-bash-interactive-5.2p37-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/88cs6k2j021mh2ir1dzsl6m8vqgydyiw-bash-interactive-5.2p37-info' from 'https://cache.nixos.org'...
copying path '/nix/store/s3zz5nasd7qr894a8jrp6fy52pdrz2f1-bash-interactive-5.2p37-man' from 'https://cache.nixos.org'...
copying path '/nix/store/azy34jpyn6sskplqzpbcs6wgrajkkqy0-bind-9.20.9-man' from 'https://cache.nixos.org'...
copying path '/nix/store/g28l15mbdbig59n102zd0ardsfisiw32-binfmt_nixos.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/k5r6p8gvf18l9dd9kq1r22ddf7ykfim2-build-vms.nix' from 'https://cache.nixos.org'...
copying path '/nix/store/dxhfmzg1dhyag26r70xns91f8078vq82-alsa-firmware-1.2.4-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/d46ilc6gzd1piyjfm9sbrl7pq3b3k0hg-busybox-1.36.1' from 'https://cache.nixos.org'...
copying path '/nix/store/yq76x7ha0rv3mn9vxrar53zlkmxlkdas-bzip2-1.0.8-man' from 'https://cache.nixos.org'...
copying path '/nix/store/9wvnmd2mr2qr8civvznnfi6s773fjvfh-coreutils-full-9.7-info' from 'https://cache.nixos.org'...
copying path '/nix/store/innps8d9bl9jikd3nsq8bd5irgrlay6f-curl-8.13.0-man' from 'https://cache.nixos.org'...
copying path '/nix/store/6yiazrx84xj8m8xqal238g3mzglvwid2-dbus-1.14.10-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/4bys54210khcipi91d6ivfz4g5qx33kh-dbus-1.14.10-man' from 'https://cache.nixos.org'...
copying path '/nix/store/zh5iazbs69x4irfdml5fzbh9nm05spgb-dejavu-fonts-minimal-2.37' from 'https://cache.nixos.org'...
copying path '/nix/store/55wbmnssa48mi96pbaihz9wr4a44vxsd-diffutils-3.12-info' from 'https://cache.nixos.org'...
copying path '/nix/store/qxk9122p34qwivq20k154jflwxjjjxb3-dns-root-data-2025-04-14' from 'https://cache.nixos.org'...
copying path '/nix/store/nqzrl9jhqs4cdxk6bpx54wfwi14x470f-e2fsprogs-1.47.2-info' from 'https://cache.nixos.org'...
copying path '/nix/store/b0qk1rsi8w675h1514l90p55iacswy5i-e2fsprogs-1.47.2-man' from 'https://cache.nixos.org'...
copying path '/nix/store/33ka30bacgl8nm7g7rcf2lz4n3hpa791-etc-bash_logout' from 'https://cache.nixos.org'...
copying path '/nix/store/0sl4azq1vls6f7lfjpjgpn9gpmwxh3a5-etc-fuse.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/m4xpifh68ayw6pn7imyiah5q8i03ibzx-etc-host.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/qdhp1g45sqkz5limyh3pr54jr0vzrhyg-etc-lsb-release' from 'https://cache.nixos.org'...
copying path '/nix/store/61z4n7pkrbhhnahpvndvpc2iln06kcl3-etc-lvm-lvm.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/l75amyv04p2ldiz6iv5cmlm03m417yfd-etc-man_db.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/yb8n9alg0flvl93842savj8fk880a5s8-etc-modprobe.d-nixos.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/v5zxfkpwma99vvbnwh7pv3qjvv09q9mf-etc-netgroup' from 'https://cache.nixos.org'...
copying path '/nix/store/cb8dadanahyrgyh4yrd02j1pn4ipg3h1-etc-nscd.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/ixzrf8qqzdp889kffwhi5l1i5b906wm2-etc-nsswitch.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/mw3qf7jsf2cr6bdh2dwhsfaj46ddvdj4-etc-systemd-coredump.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/ysxak9fplmg53wd40z86bacviss02wxj-etc-resolvconf.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/w0027gbp2ppnzpakjqdsj04k1qnv8xai-etc-systemd-journald.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/73qr9mvgrkk9g351h1560rqblpv8bkli-etc-systemd-logind.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/w8r9xylr9a1bd2glfp4zdwxiq8z2bhxb-etc-systemd-networkd.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/i1v3l8mmgr1zni58zsdgrf19xz5wpihs-etc-systemd-oomd.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/42mdjpbx4dablvbkj4l75xfjjlhpyb7a-etc-systemd-resolved.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/g2d3zjbsa94jdqybcwbldzn3w98pwzhk-etc-systemd-sleep.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/1wi887sd535dk4l4s0w7hp822fdys18j-etc-systemd-system-preset-00-nixos.preset' from 'https://cache.nixos.org'...
copying path '/nix/store/r4cjphi2kzkyvkc33y7ik3h8z1l5zs2q-etc-systemd-timesyncd.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/n5y58mvq44mibwxkzzjb646v0nck9psd-etc-systemd-user-preset-00-nixos.preset' from 'https://cache.nixos.org'...
copying path '/nix/store/7d2j36mn359g17s2qaxsb7fjd2bm4s7p-etc-systemd-user.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/ziyrzq721iziyhvlchvg4zllcdr0rbd4-etc-zprofile' from 'https://cache.nixos.org'...
copying path '/nix/store/6zl92vca58p27i20dck95j27lvj5lv16-etc-zinputrc' from 'https://cache.nixos.org'...
copying path '/nix/store/y7y1v7l88mxkljbijs7nwzm1gcg9yrjw-extra-utils' from 'https://cache.nixos.org'...
copying path '/nix/store/kkbfwys01v37rxcrahc79mzw7bqqg1ha-X-Restart-Triggers-systemd-journald' from 'https://cache.nixos.org'...
copying path '/nix/store/15k9rkd7sqzwliiax8zqmbk9sxbliqmd-X-Restart-Triggers-systemd-journald-' from 'https://cache.nixos.org'...
copying path '/nix/store/08c95zkcyr5d4gcb2nzldf6a5l791zsl-fc-10-nixos-rendering.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/fnrpg6pljxzbwz5f2wbiayirb4z63rid-fc-52-nixos-default-fonts.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/hx4rm1z8sjh6s433sfxfjjwapr1r2lnm-X-Reload-Triggers-systemd-resolved' from 'https://cache.nixos.org'...
copying path '/nix/store/045cq354ckg28php9gf0267sa4qgywj9-X-Restart-Triggers-systemd-timesyncd' from 'https://cache.nixos.org'...
copying path '/nix/store/xj6dycqkvs35yla01gd2mmrrpw1d1606-fc-53-nixos-reject-type1.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/c1l35xhz88v0hz3bfnzwi7k3pirk89gx-fc-53-no-bitmaps.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/izcym87m13m4nhjbxr2b2fp0r6wpl1s6-fontconfig-2.16.0' from 'https://cache.nixos.org'...
copying path '/nix/store/b5qqfs0s3fslirivph8niwdxh0r0qm4g-fc-cache' from 'https://cache.nixos.org'...
copying path '/nix/store/yjab7vlimxzqpndjdqzann33i34x6pyy-findutils-4.10.0-info' from 'https://cache.nixos.org'...
copying path '/nix/store/sgxf74s67kbx0kx38hqjzpjbrygcnl81-fuse-2.9.9-man' from 'https://cache.nixos.org'...
copying path '/nix/store/3p531g8jpnfjl6y0f4033g3g2f14s32y-gawk-5.3.2-info' from 'https://cache.nixos.org'...
copying path '/nix/store/vp5ra8m1sg9p3xgnz3zd7mi5mp0vdy25-fuse-3.16.2-man' from 'https://cache.nixos.org'...
copying path '/nix/store/ndir5b1ag9pk4dyrpvhiidaqqg1xjdqm-gawk-5.3.2-man' from 'https://cache.nixos.org'...
copying path '/nix/store/6hqzbvz50bm87hcj4qfn51gh7arxj8a6-gcc-14.2.1.20250322-libgcc' from 'https://cache.nixos.org'...
copying path '/nix/store/7dfxlvdhr5g57b1v8gxwpa2gs7i9g3y5-git-2.49.0-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/ikhb97s6a22dn21lhxlzhambsmisrvff-gnugrep-3.11-info' from 'https://cache.nixos.org'...
copying path '/nix/store/hgx3ai0sm533zfd9iqi5nz5vwc50sprm-fc-00-nixos-cache.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/i65zra2i21y5khnsnvl0pvd5rkvw5qhl-gnused-4.9-info' from 'https://cache.nixos.org'...
copying path '/nix/store/10p1z2bqsw0c6r5c5f59yn4lnl82lqxi-gnutar-1.35-info' from 'https://cache.nixos.org'...
copying path '/nix/store/a9fcrsva5nw1y3nqdjfzva8cp4sj7l91-gzip-1.14-info' from 'https://cache.nixos.org'...
copying path '/nix/store/0i7mzq93m8p7253bxnh7ydahmjsjrabk-gzip-1.14-man' from 'https://cache.nixos.org'...
copying path '/nix/store/diprg8qwrk8zwx73bjnjzjvaccdq5z1g-hicolor-icon-theme-0.18' from 'https://cache.nixos.org'...
copying path '/nix/store/c7y25162xaplam12ysj17g5pwgs8vj99-hwdb.bin' from 'https://cache.nixos.org'...
copying path '/nix/store/j4gc8fk7wazgn2hqnh0m8b12xx6m1n75-iana-etc-20250108' from 'https://cache.nixos.org'...
copying path '/nix/store/dwv0wf3szv3ipgyyyrf1zxh4iqlckiip-inputrc' from 'https://cache.nixos.org'...
copying path '/nix/store/nvz2hs89yjb8znxf7zw2y1rl8g0zc24g-intel2200BGFirmware-3.1-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/8v0wnff8rpa64im6gkfwf702f0d13asb-iptables-1.8.11-man' from 'https://cache.nixos.org'...
copying path '/nix/store/ib4za959rmvhyvhfn0p6y25szq9agzvv-X-Restart-Triggers-systemd-networkd' from 'https://cache.nixos.org'...
copying path '/nix/store/7vswj657kcfyz8g0i5lgm17k28nw9b6q-keymap' from 'https://cache.nixos.org'...
copying path '/nix/store/1lcg48lg3yw873x21gybqzdmp06yqf0f-kmod-blacklist-31+20240202-2ubuntu8' from 'https://cache.nixos.org'...
copying path '/nix/store/m1arp7n5z5cqsv88l0gjazzfvkc8ia84-fontconfig-conf' from 'https://cache.nixos.org'...
copying path '/nix/store/q1f1r3hqs0h6gjkas71kzaafsnbipkp9-kmod-debian-aliases.conf-30+20230601-2' from 'https://cache.nixos.org'...
copying path '/nix/store/fanpm1fxx8x5wrizmddhqgqpxrw253bf-less-668-man' from 'https://cache.nixos.org'...
copying path '/nix/store/qlbfg75i4wz6sb2ipzh4n1k0p8gp4wjp-lessconfig' from 'https://cache.nixos.org'...
copying path '/nix/store/mhxn5kwnri3z9hdzi3x0980id65p0icn-lib.sh' from 'https://cache.nixos.org'...
copying path '/nix/store/fsbyh73wsjl7gfl2k4rvdc6y02ixljmk-libcap-2.75-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/5ja0hlyfnyvq1yyd2h8pzrmwwk9bgayy-libreelec-dvb-firmware-1.5.0-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/mdf936r0ahj70lqqc09147msz4yxi3hb-libressl-4.0.0-man' from 'https://cache.nixos.org'...
copying path '/nix/store/mddq2k6rmr77bz96j42y947wywcxin50-libcap-2.75-man' from 'https://cache.nixos.org'...
copying path '/nix/store/yypqcvqhnv8y4zpicgxdigp3giq81gzb-libunistring-1.3' from 'https://cache.nixos.org'...
copying path '/nix/store/ahfbv5byr6hiqfa2jl7pi4qh35ilvxzg-fontconfig-etc' from 'https://cache.nixos.org'...
copying path '/nix/store/6fmfvkxjq2q8hzvhmi5717i0zmwjkrpw-liburing-2.9' from 'https://cache.nixos.org'...
copying path '/nix/store/737acshv7jgp9jbg0cg9766m6izcwllh-link-units' from 'https://cache.nixos.org'...
copying path '/nix/store/303izw3zmxza3n01blxaa5a44abbqkkr-linux-6.12.30' from 'https://cache.nixos.org'...
copying path '/nix/store/8pncaz101prqwhvcrdfx0pbmv4ayq5bf-linux-firmware-20250509-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/hxrjrzngydk24ah8b5n8cl777n39y08b-linux-headers-6.12.7' from 'https://cache.nixos.org'...
copying path '/nix/store/c4inn6fkfc4flai72ym5470jp2va8b6c-linux-pam-1.6.1-man' from 'https://cache.nixos.org'...
copying path '/nix/store/x4a9ksmwqbhirjxn82cddvnhqlxfgw8l-linux-headers-static-6.12.7' from 'https://cache.nixos.org'...
copying path '/nix/store/hi41wm3spb6awigpdvkp1sqyj0gj67vf-linux-pam-1.6.1-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/m97qnhb417rmaiwwlw8qz2nvimgbmhxj-local-cmds' from 'https://cache.nixos.org'...
copying path '/nix/store/6yd58721msbknn6fs57w0j82v04vpzw6-locale.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/7kdkx4y7lbb15lb2qksw0nzal23mkhjy-login.defs' from 'https://cache.nixos.org'...
copying path '/nix/store/x4mjvy4h92qy7gzi3anp0xbsw9icn3qj-logrotate.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/li71ly6mmsc7m9rm1hl98m4ka508s52i-lvm2-2.03.31-man' from 'https://cache.nixos.org'...
copying path '/nix/store/1jj2lq1kzys105rqq5n1a2r4v59arz43-mailcap-2.1.54' from 'https://cache.nixos.org'...
copying path '/nix/store/qkvqycyhqc9g9vpyp446b5cx7hv1c5zi-man-db-2.13.0-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/gkbc6nv3h0hsp06kqk0p6s9911c2a1gg-mounts.sh' from 'https://cache.nixos.org'...
copying path '/nix/store/qdv28rq2xlj68lsgrar938dq38v2lh5b-multiuser.nix' from 'https://cache.nixos.org'...
copying path '/nix/store/6nkqdqzpa75514lhglgnjs5k4dklw4sb-libidn2-2.3.8' from 'https://cache.nixos.org'...
copying path '/nix/store/vqykgcs16rs0ny39wlqb2hihb19f5bc8-nano-8.4-info' from 'https://cache.nixos.org'...
copying path '/nix/store/6c69fcc0583xx7mqc4avszsv8dj1glfb-ncurses-6.5-man' from 'https://cache.nixos.org'...
copying path '/nix/store/smpby3mgssbggz941499y9x9r35w8cbh-nix-2.28.3-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/k9chrrif685hvkiqkc3fgfib19v2mh2y-nix-2.28.3-man' from 'https://cache.nixos.org'...
copying path '/nix/store/bik2ny1bj83jby10lvq912i9v5gzy8g3-nix-bash-completions-0.6.8' from 'https://cache.nixos.org'...
copying path '/nix/store/90asb028hphm9iqh2h0xk3c52j3117rf-nix-zsh-completions-0.5.1' from 'https://cache.nixos.org'...
copying path '/nix/store/lwhcdpa73h0p6z2hc8f5mqx6x03widq4-nixos-configuration-reference-manpage' from 'https://cache.nixos.org'...
copying path '/nix/store/0249ff3p72ggrd308l2yk9n700f95kir-nixos-manual-html' from 'https://cache.nixos.org'...
copying path '/nix/store/z8dgwwnab96n86v0fnr37mn107w26s1f-nixos-manual.desktop' from 'https://cache.nixos.org'...
copying path '/nix/store/gj6hz9mj23v01yvq1nn5f655jrcky1qq-nixos-option.nix' from 'https://cache.nixos.org'...
copying path '/nix/store/6fv8ayzjvgyl3rdhxp924zdhwvhz2iq6-nss-cacert-3.111' from 'https://cache.nixos.org'...
copying path '/nix/store/l7rjijvn6vx8njaf95vviw5krn3i9nnx-nss-cacert-3.111-p11kit' from 'https://cache.nixos.org'...
copying path '/nix/store/as6v2kmhaz3syhilzzi25p9mn0zi9y0b-other.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/d6kfv0rb15n92pi1jsjk65nd9264wja6-perl-5.40.0-man' from 'https://cache.nixos.org'...
copying path '/nix/store/v0r2ndk31k1lsj967qrywdwxb87zdil6-perl5.40.0-Digest-HMAC-1.04' from 'https://cache.nixos.org'...
copying path '/nix/store/l6b79dzj572yjifnwnrmjmf2r8qx1542-perl5.40.0-Encode-Locale-1.05' from 'https://cache.nixos.org'...
copying path '/nix/store/mri94g6brszrzi5spdp3yjqig0dix246-perl5.40.0-FCGI-ProcManager-0.28' from 'https://cache.nixos.org'...
copying path '/nix/store/vxmnihhgnkyd2yh1y6gsyrw7lzqyh0sn-perl5.40.0-File-Slurp-9999.32' from 'https://cache.nixos.org'...
copying path '/nix/store/jvy29fslpki9ygmipnawxkacs0gdpwbg-perl5.40.0-HTML-TagCloud-0.38' from 'https://cache.nixos.org'...
copying path '/nix/store/187sf67ng5l08pirjv1hcnvvsx6bg6vi-perl5.40.0-Authen-SASL-2.1700' from 'https://cache.nixos.org'...
copying path '/nix/store/q6gp62h0h2z2lx3qh318crhikwc86m2y-perl5.40.0-HTML-Tagset-3.20' from 'https://cache.nixos.org'...
copying path '/nix/store/1f3pkwqxmhglz59hdl9mizgaafrcxr2g-perl5.40.0-IO-HTML-1.004' from 'https://cache.nixos.org'...
copying path '/nix/store/6insghd7kklnnilycdmbwl71l1gi9nkb-perl5.40.0-IO-Stringy-2.113' from 'https://cache.nixos.org'...
copying path '/nix/store/cqa81jdkhwvkjnz810laxhd6faw8q917-perl5.40.0-JSON-4.10' from 'https://cache.nixos.org'...
copying path '/nix/store/gvjb0301bm7lc20cbbp6q4mznb3k09j3-perl5.40.0-LWP-MediaTypes-6.04' from 'https://cache.nixos.org'...
copying path '/nix/store/3fvhcxjgn3a4r6pkidwz9nd4cs84p6jv-perl5.40.0-Mozilla-CA-20230821' from 'https://cache.nixos.org'...
copying path '/nix/store/pimqpkya3wybrpcm17zk298gpivhps5j-perl5.40.0-Test-Needs-0.002010' from 'https://cache.nixos.org'...
copying path '/nix/store/wq3ij7g3r6jfkx61d3nbxrfmyw3f3bng-perl5.40.0-Test-RequiresInternet-0.05' from 'https://cache.nixos.org'...
copying path '/nix/store/j5agsmr85pb3waxmzxn2m79yb1i7hhmh-perl5.40.0-TimeDate-2.33' from 'https://cache.nixos.org'...
copying path '/nix/store/lyr4v74c0vw9j77fvr0d6dribm1lmfsr-perl5.40.0-Try-Tiny-0.31' from 'https://cache.nixos.org'...
copying path '/nix/store/1hb2dxywm239rfwgdrd55z090hb1zbg3-perl5.40.0-URI-5.21' from 'https://cache.nixos.org'...
copying path '/nix/store/5vc5pjg9yqxkxk855il2anp6jm5gkpa3-perl5.40.0-libnet-3.15' from 'https://cache.nixos.org'...
copying path '/nix/store/kqz08h7qzxq13n3r3dymsl3jafgxl60x-php.ini' from 'https://cache.nixos.org'...
copying path '/nix/store/97qlbk0b8y0xs2hpjs37rp3sq6bdh99w-perl5.40.0-Config-IniFiles-3.000003' from 'https://cache.nixos.org'...
copying path '/nix/store/nfwlyasnxxdbnpiziw2nixwkz9b5f7g3-publicsuffix-list-0-unstable-2025-03-12' from 'https://cache.nixos.org'...
copying path '/nix/store/30qhz45nwgfyns13ijq0nwrsjp8m7ypa-relaxedsandbox.nix' from 'https://cache.nixos.org'...
copying path '/nix/store/5h63p4i2p25ba728pi4fr6vdcxa1227j-rt5677-firmware-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/w5f538hq8zh4cxpjs3lx4jdhr2p6wvq8-rtl8192su-unstable-2016-10-05-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/xnfzahna7b6jb6m1vdczap4v103qmr6w-perl5.40.0-Test-Fatal-0.017' from 'https://cache.nixos.org'...
copying path '/nix/store/rb472zb1d7245j44iwm7xsnn9xkhv28r-rtl8761b-firmware-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/blgz4vzk56rbajaavr6kg437zr7jcabp-perl5.40.0-HTTP-Date-6.06' from 'https://cache.nixos.org'...
copying path '/nix/store/3zf0hlfxwam2pcpr28374plf3zwcbkr0-perl5.40.0-Net-HTTP-6.23' from 'https://cache.nixos.org'...
copying path '/nix/store/5i759vgj25fdy680l9v0sjhjg65q0q4h-perl5.40.0-WWW-RobotRules-6.02' from 'https://cache.nixos.org'...
copying path '/nix/store/zqzjsf740jc5jqrzidw7qzkrsrl95d2b-rxvt-unicode-unwrapped-9.31-terminfo' from 'https://cache.nixos.org'...
copying path '/nix/store/6602zq9jmd3r4772ajw866nkzn6gk1j0-sandbox.nix' from 'https://cache.nixos.org'...
copying path '/nix/store/rg5rf512szdxmnj9qal3wfdnpfsx38qi-setup-etc.pl' from 'https://cache.nixos.org'...
copying path '/nix/store/kw5mdls5m8iqzh620iwm6h42rjqcbj93-shadow-4.17.4-man' from 'https://cache.nixos.org'...
copying path '/nix/store/mvaibwlc8b5gfj13b3za7g5408hgjgwn-sof-firmware-2025.01.1-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/x51649mj5ppmj97qrgxwr0calf82m9a5-perl5.40.0-File-Listing-6.16' from 'https://cache.nixos.org'...
copying path '/nix/store/krfhnl4n5a9w201z5pzwgps9fgz8z5j5-perl5.40.0-HTTP-CookieJar-0.014' from 'https://cache.nixos.org'...
copying path '/nix/store/wcf11ld95pf7h1sn6nglgmrizbjlcw2f-sound-theme-freedesktop-0.8' from 'https://cache.nixos.org'...
copying path '/nix/store/2g8wdl6qgkpk9dhj9rir7zkf9nxnjqzw-source' from 'https://cache.nixos.org'...
copying path '/nix/store/ids7wg1swihwhh17qbdbpmbdx67k5w21-ssh-root-provision.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/2959xcdddldhls7wslkm7gv2xf5pki1x-strace-6.14-man' from 'https://cache.nixos.org'...
copying path '/nix/store/pilsssjjdxvdphlg2h19p0bfx5q0jzkn-strip.sh' from 'https://cache.nixos.org'...
copying path '/nix/store/p7r0byvn43583rx7rvvy2pj44yv5c1jj-stub-ld-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/v9ibkbvwc03ni062gh3ml4s0mswq0zfs-sudoers' from 'https://cache.nixos.org'...
copying path '/nix/store/xx25wf50ww3bci4dvhfj2mrgccdfinja-system-generators' from 'https://cache.nixos.org'...
copying path '/nix/store/k5k5dvfz26a0py2xljmhz9a08y42gkkv-system-shutdown' from 'https://cache.nixos.org'...
copying path '/nix/store/0iyxf2pmg0i16d4kxarqdfd3nqfa9mc5-systemd-257.5-man' from 'https://cache.nixos.org'...
copying path '/nix/store/qyihkwbhd70ynz380whj3bsxk1d2lyc4-tzdata-2025b' from 'https://cache.nixos.org'...
copying path '/nix/store/q7pmljnijxmihsz0lsn90b7l2yvncvwm-udev-rules' from 'https://cache.nixos.org'...
copying path '/nix/store/qzv2iqy6b9jl7x76pfcplqb81gs8sarx-unit--.slice' from 'https://cache.nixos.org'...
copying path '/nix/store/5wap65qygkmwxnaykp2k00xbip0203ah-unit-dbus.socket' from 'https://cache.nixos.org'...
copying path '/nix/store/djhz08ld7cqvi36v4by31mr560lbbgdy-unit-fs.target' from 'https://cache.nixos.org'...
copying path '/nix/store/yyjy3ni8amh8lmpgikv6qps1ygphhg9h-unit-fstrim.timer' from 'https://cache.nixos.org'...
copying path '/nix/store/68ymaa7yqz8b9c3m86awp9qrs3z5gmb9-unit-keys.target' from 'https://cache.nixos.org'...
copying path '/nix/store/3mpivh2pqa1bbyp8h3n2wk8s0fvhp2rg-unit-local-fs.target' from 'https://cache.nixos.org'...
copying path '/nix/store/3i90ba6lh4d8jd58kqgznxr53kzha657-unit-logrotate.timer' from 'https://cache.nixos.org'...
copying path '/nix/store/65pm1jd651q5891y7171sl2nsvnmh1a2-unit-multi-user.target' from 'https://cache.nixos.org'...
copying path '/nix/store/m2chlkrf4dhjcnq50x6qnjlfvhz9c60s-unit-network-local-commands.service-disabled' from 'https://cache.nixos.org'...
copying path '/nix/store/c1b80rjkrfis8704c9xxwl8chg0kpxd2-unit-nix-daemon.socket' from 'https://cache.nixos.org'...
copying path '/nix/store/fl6il46drw769y6z9h4b89yv1k55xps3-unit-nixos-fake-graphical-session.target' from 'https://cache.nixos.org'...
copying path '/nix/store/n4cwpsbmd30nhps87yic15rnxfvnlvaw-unit-phpfpm.target' from 'https://cache.nixos.org'...
copying path '/nix/store/8zmflchf01g3wlj9j6csfnd47j0lgzcg-unit-post-resume.target' from 'https://cache.nixos.org'...
copying path '/nix/store/842zkhkx2aa0zy94qws3346dnd1cm3h6-unit-remote-fs.target' from 'https://cache.nixos.org'...
copying path '/nix/store/9gkhxinv1884d1vy74rnkjd9vj2zn89p-unit-run-initramfs.mount' from 'https://cache.nixos.org'...
copying path '/nix/store/4aiwrxc5i77s856dgx6b7yvqnxbq8x0g-unit-run-wrappers.mount' from 'https://cache.nixos.org'...
copying path '/nix/store/gyxhzj5v8k01vwva1s476ny2zll2nvzm-unit-sysinit-reactivation.target' from 'https://cache.nixos.org'...
copying path '/nix/store/p1k14mysynvbwyclk1nfjyyvcnrv65bp-unit-system-phpfpm.slice' from 'https://cache.nixos.org'...
copying path '/nix/store/10wi26kk0cjrifnvdsyrl8w4987z4hsb-unit-system.slice' from 'https://cache.nixos.org'...
copying path '/nix/store/8g5vq29riss8693g7syg8n0bj2d7vc9l-unit-systemd-journald-audit.socket' from 'https://cache.nixos.org'...
copying path '/nix/store/g29nsjbhdlc1xzgl0a0cybqvy9mg895l-unit-systemd-networkd.socket' from 'https://cache.nixos.org'...
copying path '/nix/store/5wcg3gl5qzna3qn53id02sghbzfqa67z-unit-user-.slice' from 'https://cache.nixos.org'...
copying path '/nix/store/7sb1nkpf82nb5kj7qc4bbqkwj1l1mdv9-update-users-groups.pl' from 'https://cache.nixos.org'...
copying path '/nix/store/l19w3rl6k8767i9znna0rfkjvl5cz4kg-urxvt-autocomplete-all-the-things-1.6.0' from 'https://cache.nixos.org'...
copying path '/nix/store/ym3cf4rnxblhlpsxj2cd5wm8rp8pgfr7-urxvt-perls-2.3' from 'https://cache.nixos.org'...
copying path '/nix/store/na57lsanf2c453zdz1508wnzvbh9w4rg-urxvt-resize-font-2019-10-05' from 'https://cache.nixos.org'...
copying path '/nix/store/l8qxbnizariir6sncianl8q0i4a0zaya-urxvt-tabbedex-19.21' from 'https://cache.nixos.org'...
copying path '/nix/store/d8k53n8mmb8j1a6v4f3wvhhap8xwcssd-urxvt-theme-switch-unstable-2014-12-21' from 'https://cache.nixos.org'...
copying path '/nix/store/3fgvp4zddvbkkyviq5sajbl7wc7lmx5q-user-generators' from 'https://cache.nixos.org'...
copying path '/nix/store/k7bynf83k39pk9x6012vjrd6fll2wdqh-useradd' from 'https://cache.nixos.org'...
copying path '/nix/store/wcrrwx3yvbvwa1hryjpgcbysdf8glnix-util-linux-2.41-man' from 'https://cache.nixos.org'...
copying path '/nix/store/vq0m8mcigxkjfjdwrgzvizjam5vx669h-wireless-regdb-2025.02.20-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/rqxaqpliqlygv3hw53j4j7s54qj5hjri-vconsole.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/za53jjhjl1xajv3y1zpjvr9mh4w0c1ay-xgcc-14.2.1.20250322-libgcc' from 'https://cache.nixos.org'...
copying path '/nix/store/18w6fpxmn5px02bpfgk702bs9k7yj5ml-xorgproto-2024.1' from 'https://cache.nixos.org'...
copying path '/nix/store/l63r9kidyd8siydvr485g71fsql8s48b-xz-5.8.1-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/3drnnkrsdfrqdrdg425wda83k79nlmwp-xz-5.8.1-man' from 'https://cache.nixos.org'...
copying path '/nix/store/lxbkfad3nbyfx3hsc1ajlvqs3s67li6x-zd1211-firmware-1.5-zstd' from 'https://cache.nixos.org'...
copying path '/nix/store/wysmwjrvwx7gk5w6dxd0d1jwjbjj350a-zsh-5.9-doc' from 'https://cache.nixos.org'...
copying path '/nix/store/4y2v97rjk4mic266vzbvmlxjnjnisnmm-zsh-5.9-info' from 'https://cache.nixos.org'...
copying path '/nix/store/af0jrnzsydq1i28vcnkpgp0110ac2cj3-zsh-5.9-man' from 'https://cache.nixos.org'...
copying path '/nix/store/184bcjcc97x3klsz63fy29ghznrzkipg-zstd-1.5.7-man' from 'https://cache.nixos.org'...
copying path '/nix/store/cg9s562sa33k78m63njfn1rw47dp9z0i-glibc-2.40-66' from 'https://cache.nixos.org'...
copying path '/nix/store/8syylmkvnn7lg2nar9fddpp5izb4gh56-attr-2.5.2' from 'https://cache.nixos.org'...
copying path '/nix/store/a6w0pard602b6j7508j5m95l8ji0qvn6-aws-c-common-0.10.3' from 'https://cache.nixos.org'...
copying path '/nix/store/xy4jjgw87sbgwylm5kn047d9gkbhsr9x-bash-5.2p37' from 'https://cache.nixos.org'...
copying path '/nix/store/7a8gf62bfl22k4gy2cd300h7cvqmn9yl-brotli-1.1.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/6ycmjimp1h3z4xgf47jjxxmps9skbdw1-cpio-2.15' from 'https://cache.nixos.org'...
copying path '/nix/store/w762xfdg6qkyamncs8s33m182n45nmma-dav1d-1.5.1' from 'https://cache.nixos.org'...
copying path '/nix/store/pyfpxwjw1a7fj5j7n2czlk4g7lvzhvhy-dosfstools-4.2' from 'https://cache.nixos.org'...
copying path '/nix/store/2x51wvk10m9l014lyrfdskc3b360ifjp-ed-1.21.1' from 'https://cache.nixos.org'...
copying path '/nix/store/p9k7bd23v5yvmap9594f9x7hpvacdh32-expand-response-params' from 'https://cache.nixos.org'...
copying path '/nix/store/j0bzxly2rvcym1zkhn393adiqcwn8np6-expat-2.7.1' from 'https://cache.nixos.org'...
copying path '/nix/store/719j8zd8g3pa5605b7a6w5csln323b1x-fribidi-1.0.16' from 'https://cache.nixos.org'...
copying path '/nix/store/qlwqqqjdvska6nyjn91l9gkxjjw80a97-editline-1.17.1' from 'https://cache.nixos.org'...
copying path '/nix/store/zrnqzhcvlpiycqbswl0w172y4bpn0lb4-bzip2-1.0.8' from 'https://cache.nixos.org'...
copying path '/nix/store/fcyn0dqszgfysiasdmkv1jh3syncajay-gawk-5.3.2' from 'https://cache.nixos.org'...
copying path '/nix/store/7c0v0kbrrdc2cqgisi78jdqxn73n3401-gcc-14.2.1.20250322-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/qkzkz12l4q06lzbji0ifgynzrd44bpjs-gdbm-1.25-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/d9x4blp2xwsbamz8az3c54x7la08j6ln-giflib-5.2.2' from 'https://cache.nixos.org'...
copying path '/nix/store/1abbyfv3bpxalfjfgpmwg8jcy931bf76-bzip2-1.0.8-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/303islqk386z1w2g1ngvxnkl4glfpgrs-glibc-2.40-66-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/3mi59bgj22xx29dyss7jhmx3sgznd85m-acl-2.3.2' from 'https://cache.nixos.org'...
copying path '/nix/store/zhpgx7kcf8ii2awhk1lz6p565vv27jv5-attr-2.5.2-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/w4hr24l1bfj07b56vm3zrp0rzxsd3537-aws-c-compression-0.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/ifvslnvmvg3nb26yliprya6ja1kb5yaf-aws-c-sdkutils-0.2.1' from 'https://cache.nixos.org'...
copying path '/nix/store/26ddah1lva210rn57dzkan1dgjvj7dn4-aws-checksums-0.2.2' from 'https://cache.nixos.org'...
copying path '/nix/store/if83fp73ln7ksdnp1wkywvyv53b6fw3f-glibc-2.40-66-getent' from 'https://cache.nixos.org'...
copying path '/nix/store/dwwc14ppzkl0yphcgsz25xvi24c9d1zm-gmp-6.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/c341wfmk7r827k691yp5ynjnv5014xqf-audit-disable' from 'https://cache.nixos.org'...
copying path '/nix/store/rjlwg1dlbhkv2bhrq03m794xbhcwcgh6-audit-stop' from 'https://cache.nixos.org'...
copying path '/nix/store/1191qk37q1bxyj43j0y1l534jvsckyma-acl-2.3.2-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/padpqlhkvnr56a5j4ma5mlfrp46ibg7g-container-init' from 'https://cache.nixos.org'...
copying path '/nix/store/y7g9g1gfg1f6y3gm2h02i7hmjzv10f9q-dav1d-1.5.1-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/24gnm4vyck53sppsvlzcmknvz7jp8x0p-firewall-start' from 'https://cache.nixos.org'...
copying path '/nix/store/y7ljc4ir2hkwkr7lhgm9xj5hw3kw8275-firewall-stop' from 'https://cache.nixos.org'...
copying path '/nix/store/cab2yvnph1hfym998vdq0q4nr9zfndrs-gnum4-1.4.19' from 'https://cache.nixos.org'...
copying path '/nix/store/j2v7jjnczkj7ra7jsgq6kv3242a1l52x-getent-glibc-2.40-66' from 'https://cache.nixos.org'...
copying path '/nix/store/clbb2cvigynr235ab5zgi18dyavznlk2-gnused-4.9' from 'https://cache.nixos.org'...
copying path '/nix/store/wrxvqj822kz8746608lgns7h8mkpn79f-gnutar-1.35' from 'https://cache.nixos.org'...
copying path '/nix/store/pl3wb7v54542kdaj79dms8r2caqbn0nv-gpm-unstable-2020-06-17' from 'https://cache.nixos.org'...
copying path '/nix/store/afhkqb5a94zlwjxigsnwsfwkf38h21dk-gzip-1.14' from 'https://cache.nixos.org'...
copying path '/nix/store/677sx4qrmnmgk83ynn0sw8hqgh439g6b-json-c-0.18' from 'https://cache.nixos.org'...
copying path '/nix/store/4v64wga9rk0c919ip673j36g6ikx26ha-keyutils-1.6.3-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/bkm4ppw3rpyndsvy5r18fjpngg2730ip-libICE-1.1.2' from 'https://cache.nixos.org'...
copying path '/nix/store/psjc7gv2314bxncywpvsg76gvbk2dn00-libXau-1.0.12' from 'https://cache.nixos.org'...
copying path '/nix/store/aq5b44b37zp5dfwz5330pxqm699gs4g3-isl-0.20' from 'https://cache.nixos.org'...
copying path '/nix/store/hx0kbryivbs7qccnvpmr17y6x818dhxc-libXdmcp-1.1.5' from 'https://cache.nixos.org'...
copying path '/nix/store/mhhia7plis47fhrv713fmjibqal96w1g-libaio-0.3.113' from 'https://cache.nixos.org'...
copying path '/nix/store/1rlljm73ch98b2q9qqk8g0vhv2n9mya8-libapparmor-4.1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/qsyxh2zqqkqzaaa0v5scpjz364ksmj3m-libargon2-20190702' from 'https://cache.nixos.org'...
copying path '/nix/store/r25srliigrrv5q3n7y8ms6z10spvjcd9-glibc-2.40-66-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/wcjq2bl1vhvnc07xzl5m41jncf745yz4-firewall-reload' from 'https://cache.nixos.org'...
copying path '/nix/store/bh1hxs692a2fv806wkiprig10j5znd7c-libcap-2.75-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/142lbjxi74mv9nkb9k4831v2x8v5w5zv-bison-3.8.2' from 'https://cache.nixos.org'...
copying path '/nix/store/nsi5mszs52rj3hgkpa8cnc90nnqvl11a-boehm-gc-8.2.8' from 'https://cache.nixos.org'...
copying path '/nix/store/z98iwn19jjspfha4adjkp32r5nj56grw-bootspec-1.0.0' from 'https://cache.nixos.org'...
copying path '/nix/store/x9hwyp3ld0mdqs8jcghshihwjdxm114l-boehm-gc-8.2.8' from 'https://cache.nixos.org'...
copying path '/nix/store/fm2ky0fkkkici6zpf2s41c1lvkcpfbm5-db-4.8.30' from 'https://cache.nixos.org'...
copying path '/nix/store/10glq3a1jbsxv50yvcw1kxxz06vq856w-db-5.3.28' from 'https://cache.nixos.org'...
copying path '/nix/store/wzwlizg15dwh6x0h3ckjmibdblfkfdzf-flex-2.6.4' from 'https://cache.nixos.org'...
copying path '/nix/store/sdqvwr8gc74ms9cgf56yvy409xvl8hsf-gettext-0.22.5' from 'https://cache.nixos.org'...
copying path '/nix/store/kxhsmlrscry4pvbpwkbbbxsksmzg0gp0-gmp-with-cxx-6.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/nzg6zqsijbv7yc95wlfcdswx6bg69srq-gmp-with-cxx-6.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/088li1j480s9yv1736wiz7a26bxi405w-graphite2-1.3.14' from 'https://cache.nixos.org'...
copying path '/nix/store/s86p50hcjcp9phyv9gxd5hra8nwczvrk-groff-1.23.0' from 'https://cache.nixos.org'...
copying path '/nix/store/x4b392vjjza0kz7wxbhpji3fi8v9hr86-gtest-1.16.0' from 'https://cache.nixos.org'...
copying path '/nix/store/rw826fx75sw7jywfvay6z5a6cnj74l1g-icu4c-73.2' from 'https://cache.nixos.org'...
copying path '/nix/store/9hpylx077slqmzb5pz8818mxjws3appp-iputils-20240905' from 'https://cache.nixos.org'...
copying path '/nix/store/y4ygj0jgwmz5y8n7jg4cxgxv4lc1pwfy-jemalloc-5.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/jgxvk139zdfxi1wgdi9pjj1yhhgwvrff-lerc-4.0.0' from 'https://cache.nixos.org'...
copying path '/nix/store/ckwwqi6p7x3w64qdhx14avy2vf8a4wiq-libICE-1.1.2-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/x2wlg9cm3yrinz290r4v2fxpbpkw8gki-libcap-2.75' from 'https://cache.nixos.org'...
copying path '/nix/store/2bjcjfzxnwk3zjhkrxi3m762p8dv6f1s-libcap-ng-0.8.5' from 'https://cache.nixos.org'...
copying path '/nix/store/87fck6hm17chxjq7badb11mq036zbyv9-coreutils-9.7' from 'https://cache.nixos.org'...
copying path '/nix/store/dfznrcrr2raj9x4bdysvs896jfnx84ih-libcbor-0.12.0' from 'https://cache.nixos.org'...
copying path '/nix/store/jrd3xs0yvb2xssfqn38rfxhnzxz9827s-libcpuid-0.7.1' from 'https://cache.nixos.org'...
copying path '/nix/store/w53vh0qqs6l2xm4saglkxaj97gi50nr5-libdatrie-2019-12-20-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/amy9kqbm05wv18z5z66a3kprc2ccp390-libdeflate-1.23' from 'https://cache.nixos.org'...
copying path '/nix/store/yai7mpy5d4rw0jvflyxdf0vzjkiqxhv6-libevent-2.1.12' from 'https://cache.nixos.org'...
copying path '/nix/store/90c412b9wqhfny300rg5s2gpsbrqb31q-libffi-3.4.8' from 'https://cache.nixos.org'...
copying path '/nix/store/9z7wv6k9i38k83xpbgqcapaxhdkbaqhz-libgpg-error-1.51' from 'https://cache.nixos.org'...
copying path '/nix/store/vwj8664lvyx3svjp856baijyk17vv9lc-libidn-1.42' from 'https://cache.nixos.org'...
copying path '/nix/store/9f6bvnw1hxy79shw6lva854ck3cmi43j-libjpeg-turbo-3.0.4' from 'https://cache.nixos.org'...
copying path '/nix/store/56fi3kcbg9haxf5c1innrn2p9dx2da2j-libmd-1.1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/9hbdbr5hikxjb16ir40w2v24gbivv22x-libmnl-1.0.5' from 'https://cache.nixos.org'...
copying path '/nix/store/ygz5dcpzd7qkw44wpbd65rl6amwpxp5f-libnfnetlink-1.0.2' from 'https://cache.nixos.org'...
copying path '/nix/store/635dz3p1afjwym9snp2r9hm0vaznwngy-libnl-3.11.0' from 'https://cache.nixos.org'...
copying path '/nix/store/59j7x0s1zybrjhnq5cv1ksm0di4zyb4n-libpipeline-1.5.8' from 'https://cache.nixos.org'...
copying path '/nix/store/2sbq4hd9imczmbb5za1awq0gvg0cbrwr-libbsd-0.12.2' from 'https://cache.nixos.org'...
copying path '/nix/store/bxs5j3zhh35nwhyhwc3db724c7nzfl36-libpsl-0.21.5' from 'https://cache.nixos.org'...
copying path '/nix/store/q0dsazc8234b7imr9y4vv5rv09r58mqi-libptytty-2.0' from 'https://cache.nixos.org'...
copying path '/nix/store/f7y5q4jwja2z3i5zlylgbv5av6839a54-libnftnl-1.2.9' from 'https://cache.nixos.org'...
copying path '/nix/store/6wrjb93m2arv7adx6k2x9nlb0y7rmgpi-libnetfilter_conntrack-1.1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/kvycshxci0x434bcgnsvr9c0qgmsw6v5-libressl-4.0.0' from 'https://cache.nixos.org'...
copying path '/nix/store/a7zbljj0cwkbfzn22v6s2cbh39dj9hip-libseccomp-2.6.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/7h0sard22wnbz0jyz07w8y9y0fcs795r-diffutils-3.12' from 'https://cache.nixos.org'...
copying path '/nix/store/1wp5qqj9n3ccjvlbhpdlg9pp9dpc00ns-copy-extra-files' from 'https://cache.nixos.org'...
copying path '/nix/store/7y59hzi3svdj1xjddjn2k7km96pifcyl-findutils-4.10.0' from 'https://cache.nixos.org'...
copying path '/nix/store/rmrbzp98xrk54pdlm7cxhayj4344zw6h-libassuan-3.0.2' from 'https://cache.nixos.org'...
copying path '/nix/store/0dqmgjr0jsc2s75sbgdvkk7d08zx5g61-libgcrypt-1.10.3-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/9gzvhlrpxmkhggn32q7q9r38cfg6gasn-libsodium-1.0.20' from 'https://cache.nixos.org'...
copying path '/nix/store/zf61wng66ik05clni78571wfmfp5kqzq-libtasn1-4.20.0' from 'https://cache.nixos.org'...
copying path '/nix/store/z53ai32niqhghbqschnlvii5pmgg2gcx-libthai-0.1.29' from 'https://cache.nixos.org'...
copying path '/nix/store/np37flx1k0dj0j0xgxzkxs069sb5h4k3-libtool-2.5.4-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/1warn5bb3r7jwfkpdgr4npab3s63sivj-liburcu-0.15.2' from 'https://cache.nixos.org'...
copying path '/nix/store/9mcjnb75xq17mvr8ikm3sg5yhx6ga62r-libuv-1.50.0' from 'https://cache.nixos.org'...
copying path '/nix/store/sh1rrkag3x04p0gs80723iwfdwlysxf8-libvmaf-3.0.0' from 'https://cache.nixos.org'...
copying path '/nix/store/qn01pv62sbpzbsy0a6m0q23syrmkk3bv-libxcb-1.17.0' from 'https://cache.nixos.org'...
copying path '/nix/store/qizipyz9y17nr4w4gmxvwd3x4k0bp2rh-libxcrypt-4.4.38' from 'https://cache.nixos.org'...
copying path '/nix/store/1r4qwdkxwc1r3n0bij0sq9q4nvfraw6i-libpcap-1.10.5' from 'https://cache.nixos.org'...
copying path '/nix/store/xv0pc5nc41v5vi0lac1i2d353s3rqlkm-libxml2-2.13.8' from 'https://cache.nixos.org'...
copying path '/nix/store/39zbg3zrp77ima6ih51ihzlzmm1yj5vh-libyuv-1908' from 'https://cache.nixos.org'...
copying path '/nix/store/3mqzj6ndzyy2v86xm70d5hdd1nsl1y9f-lm-sensors-3.6.0' from 'https://cache.nixos.org'...
copying path '/nix/store/g3j7jsv3nsfnxkq98asi01n0fink0dk9-llhttp-9.2.1' from 'https://cache.nixos.org'...
copying path '/nix/store/iyh7nfcs7f249fzrbavqgxzwiy0z7xii-lowdown-1.3.2-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/51sr6m5fb8fff9vydnz7gkqyl5sjpixl-lz4-1.10.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/gpbn3j498s0909h5j8fb3h4is8dn8rll-lzo-2.10' from 'https://cache.nixos.org'...
copying path '/nix/store/zfb1cj0swnadhvfjvp0jm2zhgwiy927f-make-initrd-ng-0.1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/h4zr885cac368xv73qrhscbpc7irqly8-mcpp-2.7.2.1' from 'https://cache.nixos.org'...
copying path '/nix/store/g2gsgbka17hdr999v8k9yhkq825mb6zz-mkpasswd-5.6.0' from 'https://cache.nixos.org'...
copying path '/nix/store/mpvxc1dbpnk74345lk69dw497iqcjvj0-libX11-1.8.12' from 'https://cache.nixos.org'...
copying path '/nix/store/9nn8vbf2n55zkb7dh6ldxckbis3pkh30-libaom-3.11.0' from 'https://cache.nixos.org'...
copying path '/nix/store/3ccwi70k69wrxq6nxy6v3iwwvawgsw6m-libressl-4.0.0-nc' from 'https://cache.nixos.org'...
copying path '/nix/store/029cprg174i7c4gvn1lwmnm4vdl6k8df-libvmaf-3.0.0-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/wxkbp7kwvpxvjh28rigmf6lfq64zlsyj-iptables-1.8.11' from 'https://cache.nixos.org'...
copying path '/nix/store/2l8jg5lpi7084sc1q33jmpd7fph41n2g-libxcb-1.17.0-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/wyf93cvh25b2xg82mkjcpmwgcspk0ggr-mpdecimal-4.0.0' from 'https://cache.nixos.org'...
copying path '/nix/store/n52k1dccv0mipz1s4gkk45x64cmmcvrf-mpfr-4.2.2' from 'https://cache.nixos.org'...
copying path '/nix/store/iszvcck61smiji8gxmbf02z3gi8zr7i3-mtools-4.0.48' from 'https://cache.nixos.org'...
copying path '/nix/store/74is8yi7sy8q58xg806fy0ja99fswjva-libxslt-1.1.43' from 'https://cache.nixos.org'...
copying path '/nix/store/mhmg8c5dmx8qi63rlz347931br8bmq08-ncompress-5.0' from 'https://cache.nixos.org'...
copying path '/nix/store/vfmnmqsnfiiqmphy7ffh2zqynsxfck1q-ncurses-6.5' from 'https://cache.nixos.org'...
copying path '/nix/store/skd9hg5cdz7jwpq1wp38fvzab9y8p0m6-net-tools-2.10' from 'https://cache.nixos.org'...
copying path '/nix/store/m4yrdwg3zv50mw8hy2zni5dyy7ljlg7j-nettle-3.10.1' from 'https://cache.nixos.org'...
copying path '/nix/store/v7rzgm8p6p0ghg5mqcin4vbx6pcrvc0j-nghttp2-1.65.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/hbjkfqhx0anx8x2rs0d9kbfhy80jfc7n-nixos-build-vms' from 'https://cache.nixos.org'...
copying path '/nix/store/833wqy1r0qpp5h5vd4yiqm5f2rjjc7jg-node_exporter-1.9.1' from 'https://cache.nixos.org'...
copying path '/nix/store/ci5nyvrii461hnaw267c1zvna0sjfxif-npth-1.8' from 'https://cache.nixos.org'...
copying path '/nix/store/6czlz4s2n2lsvn6xqlfw59swc0z21n89-nsncd-1.5.1' from 'https://cache.nixos.org'...
copying path '/nix/store/82n465240j5a8ap7c60gqy3a6kwqv1rs-numactl-2.0.18' from 'https://cache.nixos.org'...
copying path '/nix/store/z8rlklqfzxq7azbzyp30938x7wh5zf3c-oniguruma-6.9.10-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/mb407pssv7zc7pfb4d910k6fshfagm6j-libmpc-1.3.1' from 'https://cache.nixos.org'...
copying path '/nix/store/zllk6n33p6mx8y9cf4vhs2brcbis3na4-libX11-1.8.12-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/gmirqf6vp6rskn2dhfyd7haphy6kjnvk-libXext-1.3.6' from 'https://cache.nixos.org'...
copying path '/nix/store/21aj13sj7jg5ld96s3q7nd40s1iwzfld-libXfixes-6.0.1' from 'https://cache.nixos.org'...
copying path '/nix/store/a5mrmf5bjmjfq2y90hsn8xnw3lb0cqil-libXpm-3.5.17' from 'https://cache.nixos.org'...
copying path '/nix/store/md8kapandyhs7bbw5s782aanw38p2kax-gnupg-2.4.7' from 'https://cache.nixos.org'...
copying path '/nix/store/pbg3xkihyscyx3978z0pfc0xixb10pf6-libXrender-0.9.12' from 'https://cache.nixos.org'...
copying path '/nix/store/9m6a4iv2nh6v4aga830r499s4arknsfb-p11-kit-0.25.5' from 'https://cache.nixos.org'...
copying path '/nix/store/8pviily4fgsl02ijm65binz236717wfs-openssl-3.4.1' from 'https://cache.nixos.org'...
copying path '/nix/store/4sfqx63v2k8argz2wmnbspr0rh49y1c1-libXi-1.8.2' from 'https://cache.nixos.org'...
copying path '/nix/store/x0kaspzb5jqvgp357bj27z6iq24ximfg-patch-2.7.6' from 'https://cache.nixos.org'...
copying path '/nix/store/ifbr2frwmyf8p0a260hn5vzg3cagww14-pcre-8.45' from 'https://cache.nixos.org'...
copying path '/nix/store/a9c6rz5183psp30q1nhkakis6ab4km4b-pcre2-10.44' from 'https://cache.nixos.org'...
copying path '/nix/store/pkxrqwd26nqr7gh9d4gi9wf7hj6rk29a-libXcursor-1.2.3' from 'https://cache.nixos.org'...
copying path '/nix/store/sdqzcyfy52y0vf10nfsxy3mhv9b2vmkv-jq-1.7.1' from 'https://cache.nixos.org'...
copying path '/nix/store/wlzslync0dv270mvi1f7c0s1hf4p27yf-pcre2-10.44' from 'https://cache.nixos.org'...
copying path '/nix/store/qqpgwzhpakcqaz6fiy95x19iydj471ca-pcsclite-2.3.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/pi7vpdqikh160rj4vyfh58x0z2hksgj7-libaom-3.11.0-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/r4bvdpg1761bqc4jxn4sqxr6ymbcdw8f-perl5.40.0-Clone-0.46' from 'https://cache.nixos.org'...
copying path '/nix/store/vqhxms7i64vb86p07i8q50x32yi9gv5c-perl5.40.0-FCGI-0.82' from 'https://cache.nixos.org'...
copying path '/nix/store/bafwfabi148bigqra4nc5w029zj7dx7c-perl5.40.0-TermReadKey-2.38' from 'https://cache.nixos.org'...
copying path '/nix/store/clh9w1vpiijlv9a1i6gjkvwziwqzsp78-php-calendar-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/bvh4vwr9dr5iaiz57igi5b4mryqnwpaa-php-bcmath-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/vim07ywfgdqz217qnmik9knbmm5glpcn-perl5.40.0-HTTP-Message-6.45' from 'https://cache.nixos.org'...
copying path '/nix/store/n9ch6ggimi6ri5vx62mqmzgrrkb3qfwg-jq-1.7.1-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/gwdxl7y6c54smp9x268diyjqwg1naylk-php-ctype-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/gqmr3gixlddz3667ba1iyqck3c0dkpvd-gnugrep-3.11' from 'https://cache.nixos.org'...
copying path '/nix/store/ybd0aamz6dwc51x1ab62b7qycccqb0z0-libselinux-3.8.1' from 'https://cache.nixos.org'...
copying path '/nix/store/b78nah09ykpmxky3br6fl5akjjwcg1g5-php-dom-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/mq46ybxbw3f7jcv07hlk06sa8cqwy4f8-php-exif-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/wfsrpgcf3mpl9np032nhj6rvz53y4la5-php-fileinfo-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/1f91m6wkjrm4v6317za4wklgqh6qraan-php-filter-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/j6rlrmd7jqk1902s98mjjxj8d563hv8q-perl5.40.0-HTML-Parser-3.81' from 'https://cache.nixos.org'...
copying path '/nix/store/cxqg93vhjddswn75f5bdzbkcpbix32gg-perl5.40.0-HTTP-Cookies-6.10' from 'https://cache.nixos.org'...
copying path '/nix/store/dfyds9allpyy0nwhr2j729jvkb49mrxn-perl5.40.0-HTTP-Daemon-6.16' from 'https://cache.nixos.org'...
copying path '/nix/store/z88zb5wamza6irc3lkz6aj52ga3q5sl3-libaom-3.11.0-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/3pd9kl0nnn22in35ad4p6v5zha8s24gj-perl5.40.0-HTTP-Negotiate-6.01' from 'https://cache.nixos.org'...
copying path '/nix/store/9mlfyarh1nzzzc0zsn6vf0axdjjbq2l4-gpm-unstable-2020-06-17' from 'https://cache.nixos.org'...
copying path '/nix/store/66ld17ifbjz63firjjv88aydxsc3rcs6-less-668' from 'https://cache.nixos.org'...
copying path '/nix/store/xs1qm9vidbfn1932z9csmnwdkrx4lch6-libedit-20240808-3.1' from 'https://cache.nixos.org'...
copying path '/nix/store/i9b4ix0ih1qnf2b4rj0sxjzkqzqhg7mk-php-ftp-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/bssap9z2zp2nbzzr6696dqqr6axac57g-php-gettext-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/fmf4095h2x5rxsgsyxz655049k2ibchl-perl5.40.0-CGI-4.59' from 'https://cache.nixos.org'...
copying path '/nix/store/pdknwq3rbhy1g6343m4v45y98zilv929-php-gmp-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/f3fc31rc8gnmbbz0naniaay6406y5xy8-php-iconv-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/sxi98visi8s3yk1p05ly3pljh683wg1f-php-intl-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/h57pwp22kkwjm3yqh3id3ms2jymc27rq-php-mbstring-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/jlzg258kgf0k3vcn54p91n43kb8afllk-php-mysqli-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/j0ljd9127519pkb01zbyxcf42kjhp2l8-aws-c-cal-0.8.0' from 'https://cache.nixos.org'...
copying path '/nix/store/axi2kqwlrr7lvkfj42p7mav2x7apffrq-coreutils-full-9.7' from 'https://cache.nixos.org'...
copying path '/nix/store/bmjb20jhxkq881f43pd264240sp677an-krb5-1.21.3-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/4qks83jh0avrs4111c6rlwn3llqlily0-ldns-1.8.4' from 'https://cache.nixos.org'...
copying path '/nix/store/bmckdjhp1cn78n4md1m55zglpqxwijj3-libtpms-0.10.0' from 'https://cache.nixos.org'...
copying path '/nix/store/d4knbwl8kbkaai01bp5pb0ph0xpb7bnz-perl5.40.0-Net-SSLeay-1.92' from 'https://cache.nixos.org'...
copying path '/nix/store/idgpi0g62yyq8plhrdc2ps2gcrkd44jz-dash-0.5.12' from 'https://cache.nixos.org'...
copying path '/nix/store/fipal54rj1drz2q567pacy6q2gsnm2hq-php-opcache-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/57csaam2bhhfzbhw2j70ylaxq25wj09g-php-openssl-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/lapy63xgm8gpjbxj55g4w74idmbnavzm-php-pcntl-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/n1lb8wdk0avd6f06fhiydwa2f4x91pz4-php-pdo-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/47kv3znq3rx6lhp5h3shs2zx0gd7r3zv-php-pdo_mysql-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/0gn4mrgjlx9dhffs19yshpdrhi9pcbyl-perl5.40.0-CGI-Fast-2.16' from 'https://cache.nixos.org'...
copying path '/nix/store/yqj36zvdh3nmv5fpyfsp5mr06h1n4npc-php-posix-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/7yi71ajqcpsdmz1qa6r8aprm6vgqj74s-php-session-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/fmj7d945ychgybv2bld5dnk4lzcm1m10-php-simplexml-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/w8cc84nvjzikcrgfc0pi8qap5wiq1cb8-php-soap-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/9rz570nz0d51y8r6dmjniqqbzgc4bnrg-php-sockets-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/svl7fda13ygkdyvisywihlsslrcqvbp8-php-sodium-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/70aw279ymnhp8dadzwfk4clh4f1m7wsn-php-sysvsem-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/3ym3wzyd6z19hfqzqwchzqbd9vzdk345-php-tokenizer-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/j04vms72z22650kvbx69b8qkpbgi5na6-php-xmlreader-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/n75k5rjvbc7gfp4zpajpab5rykajdcmy-perl5.40.0-IO-Socket-SSL-2.083' from 'https://cache.nixos.org'...
copying path '/nix/store/1c0xn8mx6ha6fcpaxi4p0q16lvr7zfrr-php-xmlwriter-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/r7sp55wajh5p7yh2ahgifr1c8jbqjgnl-pixman-0.44.2' from 'https://cache.nixos.org'...
copying path '/nix/store/dli16nly2z52s1mi1phbcgmhw7nkq7x6-pkg-config-0.29.2' from 'https://cache.nixos.org'...
copying path '/nix/store/6mnmfhfsz94zgsyskz7zanian98ssykf-bind-9.20.9-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/7v3h0hsyq17zl8wd7zpkzhif215ywagw-cyrus-sasl-2.1.28' from 'https://cache.nixos.org'...
copying path '/nix/store/004mzgs45wax9qlxrqpzhjbnzz049gsy-gsasl-2.2.2' from 'https://cache.nixos.org'...
copying path '/nix/store/039lh7h4fv88p1mxybhw35fz6y3y5mb3-libpq-17.5' from 'https://cache.nixos.org'...
copying path '/nix/store/xam5b9xk11767mz35dc9i5gcmy9ggsaf-popt-1.19' from 'https://cache.nixos.org'...
copying path '/nix/store/8vqqb9hp35whmp9fxd4c01z2zrdy8g5g-pre-switch-checks' from 'https://cache.nixos.org'...
copying path '/nix/store/1l2x502h3j9bkp2ln3axm9qp70ibg7a1-qrencode-4.1.1' from 'https://cache.nixos.org'...
copying path '/nix/store/hng18h8w0w3axygpknq9p9pn7yd0c1m5-rapidcheck-0-unstable-2023-12-14' from 'https://cache.nixos.org'...
copying path '/nix/store/971mpk4nqhqcxggx0yi60w9y1ya570bj-readline-8.2p13' from 'https://cache.nixos.org'...
copying path '/nix/store/6vw0k4y7zxrbl3sikwbmn8aflzyi923q-s2n-tls-1.5.17' from 'https://cache.nixos.org'...
copying path '/nix/store/mi0yqfw3ppyk0a4y6azvijaa4bmhg70y-system-sendmail-1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/v03zr9slrp64psxlpwh7gn0m5gcdglwm-systemd-minimal-libs-257.5' from 'https://cache.nixos.org'...
copying path '/nix/store/qih5jc5im739yjgdslbswyxmz8kslqdl-perl5.40.0-Net-SMTP-SSL-1.04' from 'https://cache.nixos.org'...
copying path '/nix/store/wcawvp0ilpqmmjfx8z6nbcsmcbpfa6i7-logrotate-3.22.0' from 'https://cache.nixos.org'...
copying path '/nix/store/4iqgxg1ixmnvf8cq6jagz6ipas0p4bg5-tbb-2021.11.0' from 'https://cache.nixos.org'...
copying path '/nix/store/c96bpmpg46wr7pq4ls8k56jrlysmz9nr-time-1.9' from 'https://cache.nixos.org'...
copying path '/nix/store/2crk9xnq5x9v7yf0r2nwkgj8qsmxr4ly-pkg-config-wrapper-0.29.2' from 'https://cache.nixos.org'...
copying path '/nix/store/pxflxwl6fa54jjc619fqdja5z4fn5p35-openldap-2.6.9' from 'https://cache.nixos.org'...
copying path '/nix/store/7rdy1m9afs7036hwhf1r8lw1c900bmfb-php-pdo_pgsql-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/hsnmgsywiz5izr59sm8q1fwcs64d8p85-php-pgsql-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/xa5nkrg7h2akk0962c3k9hxly104yq0k-tree-sitter-0.25.3' from 'https://cache.nixos.org'...
copying path '/nix/store/8y5hcryppj548yfx6akiw93qrw8zv6js-unbound-1.23.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/08inrjiy9snpmn77siczc0ncwpcbfv4v-unit-script-container_-post-start' from 'https://cache.nixos.org'...
copying path '/nix/store/8afssqln4ckx4ii555ly707i2xvk17xy-unit-script-container_-pre-start' from 'https://cache.nixos.org'...
copying path '/nix/store/xqmz2x2zmg6w76wl1b1kznv0b4x7dfr6-perl5.40.0-ExtUtils-PkgConfig-1.16' from 'https://cache.nixos.org'...
copying path '/nix/store/1q9lw4r2mbap8rsr8cja46nap6wvrw2p-bash-interactive-5.2p37' from 'https://cache.nixos.org'...
copying path '/nix/store/jp25r6a51rfhnapv9lp8p00f2nzmfxxz-bind-9.20.9-host' from 'https://cache.nixos.org'...
copying path '/nix/store/xzv4hkskh8di1mk7ik75rvbkyr7is882-guile-2.2.7' from 'https://cache.nixos.org'...
copying path '/nix/store/llcqfkdwbj1m1s4fbby82b49hffxqdb0-php-readline-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/a7j3s4lqfa5pfrxlddmmkxx3vjz6mjzf-aws-c-io-0.15.3' from 'https://cache.nixos.org'...
copying path '/nix/store/vj7inkvjyd3s0r30h4b5pq81f4jlkffr-tbb-2021.11.0-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/lqn8cpyf4nq8704p7k3wjbym51q87rh3-unit-script-post-resume-start' from 'https://cache.nixos.org'...
copying path '/nix/store/3wb1ngcfqajx6slx4c335lvb83js9csr-unit-script-pre-sleep-start' from 'https://cache.nixos.org'...
copying path '/nix/store/03pbln3nwbxc6ars4gwskgci3wj557yy-unit-script-prepare-kexec-start' from 'https://cache.nixos.org'...
copying path '/nix/store/zf9xnwy0r9mzm3biig8b56hgyahfhf6b-unit-script-sshd-keygen-start' from 'https://cache.nixos.org'...
copying path '/nix/store/2pvhq9kgqh5669qj6805vpasngivad8h-lvm2-2.03.31-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/3z98iawifra8xn74bmdda6xbwgr5z0lh-unit-script-systemd-timesyncd-pre-start' from 'https://cache.nixos.org'...
copying path '/nix/store/c5vpbrb9iiq9jynnx57f0h114qar1dkw-unixODBC-2.3.12' from 'https://cache.nixos.org'...
copying path '/nix/store/675r4l9rpmaxdanw0i48z4n7gzchngv7-util-linux-minimal-2.41-login' from 'https://cache.nixos.org'...
copying path '/nix/store/0jj4mmc0861dqz2h603v76rny65mjidx-vim-9.1.1336-xxd' from 'https://cache.nixos.org'...
copying path '/nix/store/dqcl4f3r1z7ck24rh9dw2i6506g7wky5-which-2.23' from 'https://cache.nixos.org'...
copying path '/nix/store/bbq0c28cvahc9236sp33swq4d3gqn2rc-xlsfonts-1.0.8' from 'https://cache.nixos.org'...
copying path '/nix/store/n50daiwz9v6ijhw0inflrbdddq50k3sq-aws-c-event-stream-0.5.0' from 'https://cache.nixos.org'...
copying path '/nix/store/7j95a3ykfjgagicfam6ga6gds2n45xc0-aws-c-http-0.9.2' from 'https://cache.nixos.org'...
copying path '/nix/store/kqrqlcdqs48qslxsqsnygdyx32w7lpwg-php-ldap-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/a232zjl9jnmhq56hfr5n5lz4qg5fpb83-xxHash-0.8.3' from 'https://cache.nixos.org'...
copying path '/nix/store/za3c1slqlz1gpm6ygzwnh3hd2f0lg31z-libblake3-1.8.2' from 'https://cache.nixos.org'...
copying path '/nix/store/mzvz45f54a0r0zjjygvlzn6pidfkkwj3-audit-4.0.3-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/s5cy3qgb3w0i1ylwm8dbsnk3m5jqxik4-m17n-db-1.8.10' from 'https://cache.nixos.org'...
copying path '/nix/store/7d0871d6pn8r51sbpclclg56mmrq761a-nix-info' from 'https://cache.nixos.org'...
copying path '/nix/store/xzfhjkn4am173n6klibs9ikvy1l08hfg-nixos-firewall-tool' from 'https://cache.nixos.org'...
copying path '/nix/store/i5323bb72x07y56d8z2iwb589g56k2y8-vim-9.1.1336' from 'https://cache.nixos.org'...
copying path '/nix/store/pa60s415p92gnhv5ffz1bmfgzzfvhvd8-xz-5.8.1' from 'https://cache.nixos.org'...
copying path '/nix/store/srby6wmvg7dp454pwb6qvaxdiri38sc1-zlib-1.3.1' from 'https://cache.nixos.org'...
copying path '/nix/store/k8asbblj2xn748rslklcll68b4ygh2am-zlib-ng-2.2.4' from 'https://cache.nixos.org'...
copying path '/nix/store/3p844hlrf7c7n8jpgp4y4kn9y4jffn4i-php-pdo_odbc-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/myzfn9vnyglhq3vj4wf99fi8qj98mqri-zlib-ng-2.2.4' from 'https://cache.nixos.org'...
copying path '/nix/store/vcrjkcll3rnr95xjql8rz57gjlhh2267-zsh-5.9' from 'https://cache.nixos.org'...
copying path '/nix/store/d8vq999dg607ha6718fimpakacfax0gd-zstd-1.5.7' from 'https://cache.nixos.org'...
copying path '/nix/store/4rdbzw9g2vpyvs0b07pgmc1554pwdma4-aws-c-auth-0.8.1' from 'https://cache.nixos.org'...
copying path '/nix/store/k4xya9rihwkd175zxvcfnsqbzwrsgwmb-aws-c-mqtt-0.11.0' from 'https://cache.nixos.org'...
copying path '/nix/store/974a51073v6cb7cr5j0dazanxzmk9bxg-binutils-2.44-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/a7h3ly9qzh8wk1vsycpdk69xp82dl5ry-cracklib-2.10.0' from 'https://cache.nixos.org'...
copying path '/nix/store/p3sknfsxw0rjmxbbncal6830ic9bbaxv-audit-4.0.3-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/rrnlyc5y7gd5b0f91a89vbw1flhnlm73-file-5.46' from 'https://cache.nixos.org'...
copying path '/nix/store/9ds850ifd4jwcccpp3v14818kk74ldf2-gcc-14.2.1.20250322' from 'https://cache.nixos.org'...
copying path '/nix/store/yba197xwc8vvxv9wmcrs9bngmmgp5njb-gnutls-3.8.9' from 'https://cache.nixos.org'...
copying path '/nix/store/4f7ssdb8qgaajl4pr1s1p77r51qsrb8y-kexec-tools-2.0.29' from 'https://cache.nixos.org'...
copying path '/nix/store/fjnh5mgnlsahv2vsb8z1jh41ci924f7k-aws-c-s3-0.7.1' from 'https://cache.nixos.org'...
copying path '/nix/store/5f0bv68v1sjrp4pnr8c6p7k04271659w-libfido2-1.15.0' from 'https://cache.nixos.org'...
copying path '/nix/store/qvyvscqgr6vyqvmjdgxqa521myv5db0p-kmod-31' from 'https://cache.nixos.org'...
copying path '/nix/store/89bxhx3rhk6r4d5fvwaysrykpmvmgcnm-kmod-31-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/v63bxfiacw082c7ijshf60alvvrpfxsq-binutils-2.44' from 'https://cache.nixos.org'...
copying path '/nix/store/g91dviqva4rkkw8lw30zy3gj14c1p23s-libarchive-3.7.8-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/dm19r683p4f07v2js5jnfnja5l296gs6-aws-crt-cpp-0.29.4' from 'https://cache.nixos.org'...
copying path '/nix/store/jqhlcbmg1fsvc8w2w3ai9f9i8lzk7yfv-libgccjit-14.2.1.20250322' from 'https://cache.nixos.org'...
copying path '/nix/store/y3x4m9wy3a731ibvgvs194j10znc392m-libpng-apng-1.6.46' from 'https://cache.nixos.org'...
copying path '/nix/store/9642gi5dl4w9nkhab0l6xry685cg403c-libssh2-1.11.1' from 'https://cache.nixos.org'...
copying path '/nix/store/yn4y14blp0j4l9044jxzjzf9i11kpjsx-libzip-1.11.3' from 'https://cache.nixos.org'...
copying path '/nix/store/vrdwlbzr74ibnzcli2yl1nxg9jqmr237-linux-pam-1.6.1' from 'https://cache.nixos.org'...
copying path '/nix/store/7m1s3j4inc333vynaahynfgda1284iyh-m17n-lib-1.8.5' from 'https://cache.nixos.org'...
copying path '/nix/store/5i64l61if26whc3r9lzq6ycxpd2xnlgm-freetype-2.13.3' from 'https://cache.nixos.org'...
copying path '/nix/store/zr0hlr1hybxs08j44l38b8na1m8xpkms-libwebp-1.5.0' from 'https://cache.nixos.org'...
copying path '/nix/store/v578vkzh0qhzczjvrzf64lqb2c74d5pk-curl-8.13.0' from 'https://cache.nixos.org'...
copying path '/nix/store/hd1ys7pkiablfdgjvd1aq15k9jplsm2j-libgit2-1.9.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/1dxfw2zshri809ddyfqllvff3cfj96ma-libmicrohttpd-1.0.1' from 'https://cache.nixos.org'...
copying path '/nix/store/g81krn6p9fmyb2ymkd6d7cndjma3hzq0-etc-shells' from 'https://cache.nixos.org'...
copying path '/nix/store/2vd9h77mrciiff8ldj1260qd6dlylpvh-nano-8.4' from 'https://cache.nixos.org'...
copying path '/nix/store/9wlknpyvdm3n4sh6dkabs0za1n5nvfjn-aws-sdk-cpp-1.11.448' from 'https://cache.nixos.org'...
copying path '/nix/store/s2np0ri22gq9pq0fnv3yqjsbsbmw16xi-curl-8.13.0-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/cly4pxh7avd579girjmpxmx8z6ad4dyp-elfutils-0.192' from 'https://cache.nixos.org'...
copying path '/nix/store/ldn53xpxivf489d7z673c95fkihs5l8r-fontconfig-2.16.0-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/vam5p76i7kbh1pwhdvlrhb33wgyfzy6x-chfn.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/yr8x6yvh2nw8j8cqxana4kwn8qp9pjh2-chpasswd.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/d4zhdmcqi6z247436jqahvz8v1khrcbi-chsh.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/p83i191brxfj966zk8g7aljpb8ixqy1m-groupadd.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/xlxar4qknywb8i3rf27g8v85l6vxlh2j-groupdel.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/kkv7k9fmsfy5iljy4y2knirmrpkbplzs-groupmems.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/76gwx407dhh1ni9fn64h0yha3c1zwabp-groupmod.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/qzk9gj8jdl37xqiccxa98g442byp3rrq-libtiff-4.7.0' from 'https://cache.nixos.org'...
copying path '/nix/store/64zabz1hxymxbcvp78hp9kacrygnf9l9-fontconfig-2.16.0-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/ka4yf6hhsx1vlqkff4bvrvn27kbp28gg-mariadb-connector-c-3.3.5' from 'https://cache.nixos.org'...
copying path '/nix/store/gs6syc444yjbg5ivf36sn535chg6mkrx-libXft-2.3.9' from 'https://cache.nixos.org'...
copying path '/nix/store/bmmmy3sz3fmlxx64rlw1apm7ffywpyap-libpwquality-1.4.5-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/3qk4g71ciq7hf06nmy252qf4ng36g0s7-nginx-1.28.0' from 'https://cache.nixos.org'...
copying path '/nix/store/i0nlz4mcyxzxd96x5dv0zcy23z6xkvzy-openssh-10.0p2' from 'https://cache.nixos.org'...
copying path '/nix/store/s1c3kiwdwcff03bzbikya9bszz45mmkc-etc-nanorc' from 'https://cache.nixos.org'...
copying path '/nix/store/pn4js43jj8ag504ib4dyf5vd5ap2ilkg-libwebp-1.5.0' from 'https://cache.nixos.org'...
copying path '/nix/store/gg124glj125xfc8jzvkl6r47ph8nl6pw-passwd.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/fx0cjyvqjmfnbqxcd60bwaf36ak16q2q-pciutils-3.13.0' from 'https://cache.nixos.org'...
copying path '/nix/store/al9x8cr5xifp3qd2f5cdzh6z603kb5ps-perl-5.40.0' from 'https://cache.nixos.org'...
copying path '/nix/store/ls9jrqk9arnwrm3cmm1gd9wgllpn4b3b-php-curl-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/7hnpi2q3cxfzkzh7miv5rkl4b74gpzk4-php-imap-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/cq6kbdji0q5c3r2m0ikaaiip5z0p6318-php-mysqlnd-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/w0nkgg89ls4xvk49lnv483blmhq2ac9x-php-zip-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/h1z0wlyb4av929a6qkxblhndha0d6byn-php-zlib-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/7f3nwfvk0f32663rz1xn38cbsl66idx2-libbpf-1.5.0' from 'https://cache.nixos.org'...
copying path '/nix/store/mww31587ng38jw87pf1dv121ih27clf5-plocate-1.1.23' from 'https://cache.nixos.org'...
copying path '/nix/store/sb4ml8qjxcr2idzdgjcw2bz11p6nzff4-rsync-3.4.1' from 'https://cache.nixos.org'...
copying path '/nix/store/hhfm5fkvb1alg1np5a69m2qlcjqhr062-binutils-wrapper-2.44' from 'https://cache.nixos.org'...
copying path '/nix/store/7nln4vh5kbwba6q9d3ga77vk2vj72mdk-runuser-l.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/qb0d1xc8vxdr8s2bkp4q8msj8bhkvmg8-runuser.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/wgq5kj4qhi78sr70mwj3bgnmx4ya87fr-security-wrapper-unix_chkpwd-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/c0clf3w4bfkcg9fc7nl6bfhgivz24wvc-shishi-1.0.2' from 'https://cache.nixos.org'...
copying path '/nix/store/yfjzkkkyxcalyj7l1n4d4y6s81i65hmy-sqlite-3.48.0' from 'https://cache.nixos.org'...
copying path '/nix/store/x0ncvjhy2vgz174bhm8yycywwrjvgr9a-strace-6.14' from 'https://cache.nixos.org'...
copying path '/nix/store/ywy0hjiydvv561a5wds6ba7z059zj9im-sudo-1.9.16p2' from 'https://cache.nixos.org'...
copying path '/nix/store/qywjdkbap2h7g17qrzhi4nm243cqpx1f-sudo.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/dk55smr7wdjad151r7cv1pln0winqq9x-tcb-1.2' from 'https://cache.nixos.org'...
copying path '/nix/store/g4d8pli27k90s0n4nnm5nipxbyrcd9vl-useradd.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/71v1svlxdziiqy8qmzki3wsrg7yv7ybq-userdel.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/npkqihnvfw9cx3a1mzr59x23vkqql51g-sshd.conf-final' from 'https://cache.nixos.org'...
copying path '/nix/store/bxznmkg59a4s2p559fmbizc2qcgjr3ny-iproute2-6.14.0' from 'https://cache.nixos.org'...
copying path '/nix/store/bqa6kwd5ds2jrj76nch6ixdvzzcy4sxl-usermod.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/b895xnbwyfj1msj6ljcsvwfdhwqhd2vd-shadow-4.17.4' from 'https://cache.nixos.org'...
copying path '/nix/store/zx9qxw749wmla1fad93al7yw2mg1jvzf-vlock.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/z21r5fak3raias1zlc0grawnsrcq094x-X-Restart-Triggers-sshd' from 'https://cache.nixos.org'...
copying path '/nix/store/98zamhd8d0jq3skqwz28dlgph94mrqir-xz-5.8.1-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/l9xn7mbn0wh0z7swfcfj1n56byvcrisw-zstd-1.5.7-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/clfkfybsfi0ihp7hjkz4dkgphj7yy0l4-nix-2.28.3' from 'https://cache.nixos.org'...
copying path '/nix/store/yz7bsddsmyssnylilblxr8gxyaijfis7-php-pdo_sqlite-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/rmj2j70y96zfnl2bkgczc1jjxxp1gpc2-php-sqlite3-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/5c38fjjwfnlfjiiq62qyrr545q0n60ki-util-linux-2.41-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/r03ly1w54924k8fag1dhjl3yrllj6czd-util-linux-minimal-2.41-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/mmz4qa42fhacp04wfjhwlslnlfffyxjv-append-initrd-secrets' from 'https://cache.nixos.org'...
copying path '/nix/store/9r81a64smasyz3j7x3ah684hyzivmplx-kbd-2.7.1' from 'https://cache.nixos.org'...
copying path '/nix/store/324bqqlvdjbsixcbagdn8yjxc6zcj28a-security-wrapper-newgidmap-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/mvgsv5643miclpcpwzv43kibj5ydpxvl-security-wrapper-newgrp-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/n42x8ly03p2dyj6lqnmaynrcw8mg72d7-gss-1.0.4' from 'https://cache.nixos.org'...
copying path '/nix/store/xrdkznkvi79w8pp1cyhzi40prmxilw8y-security-wrapper-newuidmap-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/p7vixy3km13dwf3g4rkg9n3qwkj2vhik-security-wrapper-sg-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/2adjiqpm8p55hfhhrw3f1kvi340allma-security-wrapper-sudo-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/0b1qa8fm793qvcn8bvr5kg5jl4indh9y-security-wrapper-sudoedit-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/4hjw4c56ml09jbac2mzz38qc958d3fb2-shadow-4.17.4-su' from 'https://cache.nixos.org'...
copying path '/nix/store/m4w8d2h3v76anng7s9cv9c1iq9w6y2jj-cryptsetup-2.7.5' from 'https://cache.nixos.org'...
copying path '/nix/store/jf0v9bq4dlk56acbkpq6i84zwjg4g466-e2fsprogs-1.47.2' from 'https://cache.nixos.org'...
copying path '/nix/store/bkpj51fz88rbyjd60i6lrp0xdax1b24g-glib-2.84.1' from 'https://cache.nixos.org'...
copying path '/nix/store/170jn0hjz46hab3376z1fj79vmn0nynm-libSM-1.2.5' from 'https://cache.nixos.org'...
copying path '/nix/store/8w718rm43x7z73xhw9d6vh8s4snrq67h-python3-3.12.10' from 'https://cache.nixos.org'...
copying path '/nix/store/a885zzx9s5y8dxbfvahwdcwcx6pdzm9q-tpm2-tss-4.1.3' from 'https://cache.nixos.org'...
copying path '/nix/store/m2dkj8xcpcrymd4f4p46c3m59670cj9y-security-wrapper-su-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/d4icl77wfbz3y5py1yni18nmqwkrb4lr-libSM-1.2.5-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/2rxzdljx3dp4cgj1xlald496gdsjnwj8-libXt-1.3.1' from 'https://cache.nixos.org'...
copying path '/nix/store/agpxymqp96k4bksyz3bbzr5y8jgykf4p-util-linux-minimal-2.41-mount' from 'https://cache.nixos.org'...
copying path '/nix/store/sjsapivqvz7hs93rbh1blcd7p91yvzk1-console-env' from 'https://cache.nixos.org'...
copying path '/nix/store/jws80m7djgv03chq0ylw7vmv3vqsbvgg-util-linux-minimal-2.41-swap' from 'https://cache.nixos.org'...
copying path '/nix/store/2050009wgldpv3lxld3acz5pr6cr7x53-wget-1.25.0' from 'https://cache.nixos.org'...
copying path '/nix/store/77z9fh96318kyjmmidi558hyyssv00s8-bcache-tools-1.0.8' from 'https://cache.nixos.org'...
copying path '/nix/store/lg0d9891d12dl3n1nm68anmlf3wczf28-btrfs-progs-6.14' from 'https://cache.nixos.org'...
copying path '/nix/store/cx6fbilhj4nmq9dl8c8c73mimm08x60z-e2fsprogs-1.47.2-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/hlmmf01lhg62fpqhzispzs8rhzn7gg4p-libXmu-1.2.1' from 'https://cache.nixos.org'...
copying path '/nix/store/wfxr783my1pr6pnzd6x22dpi8amjwkkd-X-Restart-Triggers-reload-systemd-vconsole-setup' from 'https://cache.nixos.org'...
copying path '/nix/store/m506rljkkpxc4d0j0j41qjhldqrwxz4x-libXt-1.3.1-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/9qqln0vxf1g6ll2wpkdfa2cmpm4nn17y-libXaw-1.0.16' from 'https://cache.nixos.org'...
copying path '/nix/store/if9z6wmzmb07j63c02mvfkhn1mw1w5p4-systemd-257.5' from 'https://cache.nixos.org'...
copying path '/nix/store/ykzprjkb2l61gnlcm368vh8wnj7adwx6-systemd-minimal-257.5' from 'https://cache.nixos.org'...
copying path '/nix/store/mwan3006nzdq6ia8lw3hyk4vlc585g17-libXmu-1.2.1-dev' from 'https://cache.nixos.org'...
copying path '/nix/store/1nxchlxi7i0b1nhsyq732al8sm1blywm-util-linux-2.41-login' from 'https://cache.nixos.org'...
copying path '/nix/store/mhibs2q4f3mjpzwgm6wdk2c4d6vkaklv-Xaw3d-1.6.6' from 'https://cache.nixos.org'...
copying path '/nix/store/g51ca42mmgxzz7xngf0jzhwd4whi19lj-util-linux-2.41-mount' from 'https://cache.nixos.org'...
copying path '/nix/store/8m86a49g1p7fvqiazi5cdmb386z7w5zf-libotf-0.9.16' from 'https://cache.nixos.org'...
copying path '/nix/store/zma6jllb9xn22i98jy9n8mz3wld9njwk-util-linux-2.41-swap' from 'https://cache.nixos.org'...
copying path '/nix/store/mw6bvyrwv9mk36knn65r80zp8clnw9jl-util-linux-minimal-2.41-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/94jfyay8h0dwbakr69b91rsf8pdvah05-xauth-1.1.4' from 'https://cache.nixos.org'...
copying path '/nix/store/j3h72p435glzylc2qjny8lqd4gml03ym-xrdb-1.2.2' from 'https://cache.nixos.org'...
copying path '/nix/store/zpprhp27r6chnkfkb85wl42p33vsawj8-su.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/41qbms27n02859ja4sc7wsd9mfp3ward-cairo-1.18.2' from 'https://cache.nixos.org'...
copying path '/nix/store/8bsl1vrab2pwj8ilpzfn2iwzbrps8jgq-harfbuzz-10.2.0' from 'https://cache.nixos.org'...
copying path '/nix/store/j61kasrhqidgpj9l9zb1wvnizk1bsiqf-qemu-host-cpu-only-9.2.3-ga' from 'https://cache.nixos.org'...
copying path '/nix/store/cdr8agx3ffy086z30wiysnclrc5m8x69-gdk-pixbuf-2.42.12' from 'https://cache.nixos.org'...
copying path '/nix/store/b5xcfdnccfslm89c8kd3lajgb5drx3h4-shared-mime-info-2.4' from 'https://cache.nixos.org'...
copying path '/nix/store/y7gbjn3x388syav1bjzciia9ppia2zqw-urxvt-font-size-1.3' from 'https://cache.nixos.org'...
copying path '/nix/store/hb6l900n8qiaxg0zj6l20yy7bn9ghxp3-wmctrl-1.07' from 'https://cache.nixos.org'...
copying path '/nix/store/808xr68djvk0x3r754mi81yvm2yr9ppq-libavif-1.2.1' from 'https://cache.nixos.org'...
copying path '/nix/store/y5bgh0pyxzcgp90ywmgl9dk2m1j3hcbr-urxvt-perl-unstable-2015-01-16' from 'https://cache.nixos.org'...
copying path '/nix/store/z0kbsgnma0mijn5ssqfi3dk9z28bqlwj-pango-1.56.3' from 'https://cache.nixos.org'...
copying path '/nix/store/lzj596ffj1xk0r9v9l4gpgwg9w8jb0fr-check-mountpoints' from 'https://cache.nixos.org'...
copying path '/nix/store/5q42cwjbqj7ir7pvdqn411bbzr304g2j-etc-systemd-system.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/7c0l3jk0fszisqidxrc2bby99dv5d261-fuse-2.9.9-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/nmyh57dqf1v6l6swghywkrb63aqmzzh8-fuse-3.16.2' from 'https://cache.nixos.org'...
copying path '/nix/store/dh54wizfsivqa4ygx76jn49lpxkqbaf6-lvm2-2.03.31-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/zp3db1aj7gs7p73wkm9v76x36z641nsi-man-db-2.13.0' from 'https://cache.nixos.org'...
copying path '/nix/store/qcf53qls5h6jk0czdiwdwncfvfnvfmpb-gd-2.3.3' from 'https://cache.nixos.org'...
copying path '/nix/store/9z8dbw37f5b6nrmh07310g1b2kdcs8sf-nixos-enter' from 'https://cache.nixos.org'...
copying path '/nix/store/14spdmgq38vmzywkkm65s65ab6923y6p-librsvg-2.60.0' from 'https://cache.nixos.org'...
copying path '/nix/store/csx6axnwacbq8ypl375p10why1fc2z8p-security-wrapper-fusermount-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/j1d4jkh31x2yq5c8pibjifwcm5apa06l-fuse-3.16.2-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/j3jbm4d3hmz0nh4z3pqfy68zgil8immv-nixos-install' from 'https://cache.nixos.org'...
copying path '/nix/store/0r8953vg0n1b38d0jkk9lgbjfxvf8yc4-php-gd-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/7hg38dsdzfk0jnb9q3q77ql9q1chp4fz-nixos-option' from 'https://cache.nixos.org'...
copying path '/nix/store/mz9qpdl066bzg4n3rzb7x82dmx5jy386-security-wrapper-fusermount3-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/sm4b1vl7578rl2yiss62acs7ls7qinad-lvm2-2.03.31' from 'https://cache.nixos.org'...
copying path '/nix/store/gxsa0wrpl9r1vl2zp3s1vkhmdf8ia0ca-php-extra-init-8.1.32.ini' from 'https://cache.nixos.org'...
copying path '/nix/store/yi0knhi2qccafj49a8yd76rizllzx7bd-dbus-1.14.10-lib' from 'https://cache.nixos.org'...
copying path '/nix/store/rys6134aqazihxi4g5ayc0ky829v7mf0-dbus-1.14.10' from 'https://cache.nixos.org'...
copying path '/nix/store/75m2zly9vl6gvx3gc23y7hgjsbarqf7r-switch-to-configuration-0.1.0' from 'https://cache.nixos.org'...
copying path '/nix/store/6kldkgh0i8h6wwfi78nviki6a15h03bw-perl-5.40.0-env' from 'https://cache.nixos.org'...
copying path '/nix/store/rqy3y1p2c1acfnbhkxzpixdshnivqaxl-perl-5.40.0-env' from 'https://cache.nixos.org'...
copying path '/nix/store/zql0aksg8vpmaivh4ylkzg8ky4k1r3ms-perl-5.40.0-env' from 'https://cache.nixos.org'...
copying path '/nix/store/4ccfn37h8jfpppsi2i0rx0dx9c73qmsa-perl5.40.0-DBI-1.644' from 'https://cache.nixos.org'...
copying path '/nix/store/gf1gs0w896yg73wyphgwdzhwa08ryw3n-perl5.40.0-String-ShellQuote-1.04' from 'https://cache.nixos.org'...
copying path '/nix/store/p90lckzsmp16zh0rfx7pfc6ryf77y3c6-perl5.40.0-libwww-perl-6.72' from 'https://cache.nixos.org'...
copying path '/nix/store/20f0f68rsai61a7rkcy6zxl6c0vh1z41-perl5.40.0-urxvt-bidi-2.15' from 'https://cache.nixos.org'...
copying path '/nix/store/g5lvfibif6crcl82mmzwicq6xwv9dcvf-rxvt-unicode-unwrapped-9.31' from 'https://cache.nixos.org'...
copying path '/nix/store/kdrbnjhy3wchgbpkiz486k0qcv5z9a07-rxvt-unicode-vtwheel-0.3.2' from 'https://cache.nixos.org'...
copying path '/nix/store/xj5y2ng1jbpx99nzi2pjajs5pdjn07rg-security-wrapper-dbus-daemon-launch-helper-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/6v5a3nd0fxwddy5rlgl02hx7qmmb14ky-texinfo-interactive-7.2' from 'https://cache.nixos.org'...
copying path '/nix/store/s8lhl3z9z2jjaq1qschc4g0wd3dy91im-w3m-0.5.3+git20230121' from 'https://cache.nixos.org'...
copying path '/nix/store/50piw9b7b80vfjf9yny54zxfgjx3f3va-etc-ssh-ssh_config' from 'https://cache.nixos.org'...
copying path '/nix/store/dkx6gwpq53a80aya87fi1vs43pr42s91-etc-sysctl.d-60-nixos.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/27xwi66r3zx83cfr2p4nz4d3p8q5mvcd-htop-3.4.1' from 'https://cache.nixos.org'...
copying path '/nix/store/p6z4ag3v1a3bmdd7b2ga8n2s53r3rb7s-login.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/q9idyw3m487xfb10dyk4v773kcyzq2da-php-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/x3bxjpkcbfyzmy5695g1cchf04fbz8ca-procps-4.0.4' from 'https://cache.nixos.org'...
copying path '/nix/store/3wikk2w5zb68g0j90xqcqbn4dhq59910-nixos-generate-config' from 'https://cache.nixos.org'...
copying path '/nix/store/54a3gbciba6is1fvi29k291v04hkgihb-X-Restart-Triggers-systemd-sysctl' from 'https://cache.nixos.org'...
copying path '/nix/store/6pgj3ja7zvlahqbcycd43iyc4g498ki0-perl5.40.0-DBD-SQLite-1.74' from 'https://cache.nixos.org'...
copying path '/nix/store/qvznfa46sqccjdh8vlnpzpqfkqh58s2j-sshd.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/gsman0cwlms2l679bla5vgmf21jc5lvl-systemd' from 'https://cache.nixos.org'...
copying path '/nix/store/q8ycjc7hnjm71p4n106acywcdsjjpskl-systemd-user.pam' from 'https://cache.nixos.org'...
copying path '/nix/store/kyf94km34b9ydzy33gvrvdd893py5pc5-rxvt-unicode-9.31' from 'https://cache.nixos.org'...
copying path '/nix/store/bvvqvfbh0wq04di5f3lkrzjqy5pvq4w3-unit-script-container_-start' from 'https://cache.nixos.org'...
copying path '/nix/store/af291yai47szhz3miviwslzrjqky31xw-util-linux-2.41-bin' from 'https://cache.nixos.org'...
copying path '/nix/store/iqqhya38s39vgh1bk4v5sr6jvrmi5sg3-nixos-help' from 'https://cache.nixos.org'...
copying path '/nix/store/jrrzha35h0bxbp2h30nv4dpa0fk4qhgb-perl-5.40.0-env' from 'https://cache.nixos.org'...
copying path '/nix/store/rmnxnpxvm1wmlmgh5krgdf9wrym5ks99-tailscale-1.82.5' from 'https://cache.nixos.org'...
copying path '/nix/store/xc3zdwldi1bbsrvjjvix7s57s31hsv29-command-not-found' from 'https://cache.nixos.org'...
copying path '/nix/store/rhmacziivxfjs8chklcbm37p59wih6sw-nixos-help' from 'https://cache.nixos.org'...
copying path '/nix/store/7rjs1gm1377hsbd5yqg5bii3ay3f75q7-etc-bashrc' from 'https://cache.nixos.org'...
copying path '/nix/store/s674qd2b7v163k38imvnp3zafzh0585n-50-coredump.conf' from 'https://cache.nixos.org'...
copying path '/nix/store/m4qaar099vcj0dgq4xdvhlbc8z4v9m22-getty' from 'https://cache.nixos.org'...
copying path '/nix/store/rr6bdh3pdsvwjrm5wd32p2yzsz16q6z2-security-wrapper-mount-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/2q4yksm7gqgszl9axs95ylwakwk9yb8w-security-wrapper-umount-x86_64-unknown-linux-musl' from 'https://cache.nixos.org'...
copying path '/nix/store/gmydihdyaskbwkqwkn5w8yjh9nzjz56p-udev-path' from 'https://cache.nixos.org'...
copying path '/nix/store/yaz54h00w6qv85lw40g0s0dw3s4s53ws-unit-script-nixos-activation-start' from 'https://cache.nixos.org'...
copying path '/nix/store/9jp02i4p4lrxz51sxiyhz71shr9vb6bc-mount-pstore.sh' from 'https://cache.nixos.org'...
copying path '/nix/store/csm3q68n81162ykn3wibzh0fs4fm0dhk-nixos-container' from 'https://cache.nixos.org'...
copying path '/nix/store/cflf8pxlaapivg98457bwh0nh1hasf5h-nixos-rebuild' from 'https://cache.nixos.org'...
copying path '/nix/store/rajz07kxw9xj94bi90yy0m2ksgh3wprf-reload-container' from 'https://cache.nixos.org'...
copying path '/nix/store/c2c364mdd4qj6c51bjs6s3g4hb42c0ia-getty' from 'https://cache.nixos.org'...
copying path '/nix/store/298j97sm5jr2x5z8w5q8s3mzzpb3rjjw-unit-script-suid-sgid-wrappers-start' from 'https://cache.nixos.org'...
copying path '/nix/store/p1j5bc30pbq6bqpw2d676azqdv4whdi5-udev-rules' from 'https://cache.nixos.org'...
copying path '/nix/store/rrnq5j9c6j39k4pk9xkk4913h4zsqf5b-php-with-extensions-8.1.32' from 'https://cache.nixos.org'...
copying path '/nix/store/m2rqmjkvdd86lb4i8mi3rafxggf9l2py-X-Restart-Triggers-systemd-udevd' from 'https://cache.nixos.org'...
copying path '/nix/store/805a5wv1cyah5awij184yfad1ksmbh9f-git-2.49.0' from 'https://cache.nixos.org'...
copying path '/nix/store/wk8wmjhlak6vgc29clcfr1dpwv06j2hn-mailutils-3.18' from 'https://cache.nixos.org'...
copying path '/nix/store/kmwlm9nmvszrcacs69fj7hwpvd7wwb5w-emacs-30.1' from 'https://cache.nixos.org'...
copying path '/nix/store/y0aw1y9ggb4pyvhwk97whmwyjadivxny-linux-6.12.30-modules' from 'https://cache.nixos.org'...
copying path '/nix/store/1i4iz3z0b4f4qmbd9shs5slgfihs88vc-firmware' from 'https://cache.nixos.org'...
copying path '/nix/store/18f0xc0gid1ma6yjjx5afny9lnji3hf0-etc-modprobe.d-firmware.conf' from 'https://cache.nixos.org'...
### Installing NixOS ###
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
installing the boot loader...
setting up /etc...
Created "/boot/EFI".
Created "/boot/EFI/systemd".
Created "/boot/EFI/BOOT".
Created "/boot/loader".
Created "/boot/loader/keys".
Created "/boot/loader/entries".
Created "/boot/EFI/Linux".
Copied "/nix/store/if9z6wmzmb07j63c02mvfkhn1mw1w5p4-systemd-257.5/lib/systemd/boot/efi/systemd-bootx64.efi" to "/boot/EFI/systemd/systemd-bootx64.efi".
Copied "/nix/store/if9z6wmzmb07j63c02mvfkhn1mw1w5p4-systemd-257.5/lib/systemd/boot/efi/systemd-bootx64.efi" to "/boot/EFI/BOOT/BOOTX64.EFI".
Random seed file /boot/loader/random-seed successfully written (32 bytes).
Created EFI boot entry "Linux Boot Manager".
installation finished!
### Rebooting ###
Pseudo-terminal will not be allocated because stdin is not a terminal.
Warning: Permanently added '10.25.0.87' (ED25519) to the list of known hosts.
### Waiting for the machine to become unreachable due to reboot ###
kex_exchange_identification: read: Connection reset by peer
Connection reset by 10.25.0.87 port 22
### Done! ###
nix run github:nix-community/nixos-anywhere -- --flake .#wiki    --target-hos  6,58s user 2,98s system 17% cpu 55,063 total

Post-Installation Steps

Now that the declarative part of the system is in place, we need to take care of the stateful part.

In my case, the only stateful part that needs setting up is the Tailscale mesh VPN.

To set up Tailscale, I log in via SSH and run sudo tailscale up. Then, I add the new node to my network by following the link. Afterwards, in the Tailscale Machines console, I disable key expiration and add ACL tags.

Making Changes

Now, after I changed something in my configuration file, I use nixos-rebuild remotely to roll out the change to my NixOS system:

% nix run nixpkgs#nixos-rebuild -- \
  --target-host michael@zammadn \
  --use-remote-sudo \
  switch \
  --flake .#zammadn

Note that not all changes are fully applied as part of nixos-rebuild switch: while systemd services are generally restarted, newly required kernel modules are not automatically loaded (e.g. after enabling the edgetpu coral hardware accelerator in Frigate).

So, to be sure everything took effect, reboot your system after deploying changes.

One of the advantages of NixOS is that in the boot menu, you can select which generation of the system you want to run. If the latest change broke something, you can quickly reboot into the previous generation to undo that change. Of course, you can also undo the configuration change and deploy a new generation — whichever is more convenient in the situation.

Conclusion

With this article, I hope I could convey what I wish someone had told me when I started using Nix and NixOS:

Enable flakes and the new CLI.
Use nixos-anywhere to install remotely.
- Build a custom installer if you want, it’s easy!
Use nixos-rebuild’s builtin --target-host flag for remote deployment.

Where do you go from here?

Read through all documentation on nixos.org → Learn.
Here are a couple of posts from people in and around my bubble that I looked at for inspiration / reference, in no particular order:
- Michael Lynch wrote about setting up an Oracle Cloud VM with NixOS and about managing his Zig configuration.
- Yvan shared some use-cases on Mastodon.
- Nelson Elhage wrote about using Nix to test dozens of Python interpreters as part of his performance investigation into Python 3.14 tail-call interpreter performance.
- Vincent Bernat wrote about using Nix to build an SD card image for an ARM single board computer.
- Mitchell Hashimoto shared his extensive NixOS configs.
- Wolfgang has a YouTube video about using NixOS for his Home Server (→ his configs)
Contact your local Nix community! I recently attended the “Zero Hydra Failures” event of the Nix Zürich group and the kind people there were happy to talk about all things Nix :)

Turns out my previous attempt at this build had a faulty CPU! With the CPU replaced, the machine now is stable and fast! 🚀 In this article, I’ll go into a lot more detail about the component selection, but in a nutshell, I picked an Intel 285K CPU for low idle power, chose a 4TB SSD so I don’t have to worry about running out of storage quickly, and a capable nvidia graphics card to drive my Dell UP3218K 8K monitor.

Components

Which components did I pick for this build? Here’s the full list:

Price	Type	Article
140 CHF	Case	Fractal Define 7 Compact Black Solid
155 CHF	Power Supply	Corsair RM850x
233 CHF	Mainboard	ASUS PRIME Z890-P
620 CHF	CPU	Intel Core Ultra 9 285K
120 CHF	CPU fan	Noctua NH-D15 G2
39 CHF	Case fan	Noctua NF-A14 PWM (140 mm)
209 CHF	RAM	64 GB DDR5-6400 Corsair Vengeance (2 x 32GB)
280 CHF	Disk	4000 GB Samsung 990 Pro
554 CHF	GPU	MSI GeForce RTX 3060 Ti GAMING X TRIO

Total: 2350 CHF

…and the next couple of sections go into detail on how I selected these components.

Case

I have been a fan of Fractal cases for a couple of generations. In particular, I realized that the “Compact” series offers plenty of space even for large graphics cards and CPU coolers, so that’s now my go-to case: the Fractal Define 7 Compact (Black Solid).

My general requirements for a PC case are as follows:

No extra effort should be required for the case to be as quiet as possible.
The case should not have any sharp corners (no danger of injury!).
The case should provide just enough space for easy access to your components.
The more support the case has to encourage clean cable routing, the better.
USB3 front panel headers should be included.

I really like building components into the case and working with the case. There are no sharp edges, the mechanisms are a pleasure to use and the cable-management is well thought-out.

The only thing that wasn’t top-notch is that Fractal ships the case screws in sealed plastic packages that you need to cut open. I would have wished for a re-sealable plastic baggie so that one can keep the unused screws instead of losing them.

With this build, I have standardized all my PCs into Fractal Define 7 Compact Black cases!

Power Supply

I wanted to keep my options open regarding upgrading to an nvidia 50xx series graphics card at a later point. Those models have a TGP (“Total Graphics Power”) of 575 watts, so I needed a power supply that delivers enough power for the whole system even at peak power usage in all dimensions.

I ended up selecting the Corsair RM850x, which reviews favorably (“leader in the 850W gold category”) and was available at my electronics store of choice.

This was a good choice: the PSU indeed runs quiet, and I really like the power cables (e.g. the GPU cable) that they include: they are very flexible, which makes them easy to cable-manage.

One interesting realization was that it’s more convenient to not use the PSU’s 12VHPWR cable, but instead stick to the older 8-pin power connectors for the GPU in combination with a 12VHPWR-to-8-pin adapter. The reason is that the 12VHPWR connector’s locking mechanism is very hard to unlock, so when swapping out the GPU (as I had to do a number of times while trouble-shooting), unlocking an 8-pin connector is much easier…

SSD disk

I have been avoiding PCIe 5 SSDs so far because they consume a lot more power compared to PCIe 4 SSDs. While bulk streaming data transfer rates are higher on PCIe 5 SSDs, random transfers are not significantly faster. Most of my compute workload consists of random transfers, not large bulk transfers.

The power draw situation with PCIe 5 SSDs seems to be getting better lately, with the Phison E31T being the first controller that implements power saving. A disk that uses the E31T controller is the Corsair Force Series MP700 Elite. Unfortunately, said disk was unavailable when I ordered.

Instead, I picked the Samsung 990 Pro with 4 TB. I have had good experiences with the Samsung Pro series over the years (never had one die or degrade performance), and my previous 2 TB disk was starting to fill up, so the extra storage space is appreciated.

Onboard 2.5GbE Network Card

One annoying realization is that most mainboard vendors seem to have moved to 2.5 GbE (= 2.5 Gbit/s ethernet) onboard network cards. I would have been perfectly happy to play it safe and buy another Intel I225 1 GbE network card, as long as it just works with Linux.

In the 2.5 GbE space, the main players seem to be Realtek and Intel. Most mainboard vendors opted for Realtek as far as I could see.

Linux includes the r8169 driver for Realtek network cards, but whether the card will work out of the box depends on the exact revision of the network card! For example:

The AsRock Z890 Pro-A has rev 8125B. lshw: firmware=rtl8125b-2_0.0.2 07/13/20
The ASUS PRIME Z890-P has rev 8125D. lshw: firmware=rtl8125d-1_0.0.7 10/15/24

For revision 8125D, you need a recent-enough Linux version (6.13+) that includes commit “r8169: add support for RTL8125D”, accompanied by a recent-enough linux-firmware package.
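To check the kernel requirement from a script, one can compare a kernel release string (such as the output of uname -r) against 6.13. The following is a minimal sketch in Go. The helper name kernelAtLeast is made up for this illustration, it only looks at the major and minor numbers, and it says nothing about whether the linux-firmware package is recent enough.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kernelAtLeast reports whether a kernel release string such as
// "6.14.4-arch1-1" is at least wantMajor.wantMinor.
// Hypothetical helper for illustration; the patch level is ignored.
func kernelAtLeast(release string, wantMajor, wantMinor int) (bool, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected kernel release %q", release)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("parsing major version of %q: %v", release, err)
	}
	// The minor component may carry a suffix, so keep only the leading digits.
	minorStr := parts[1]
	end := 0
	for end < len(minorStr) && minorStr[end] >= '0' && minorStr[end] <= '9' {
		end++
	}
	minor, err := strconv.Atoi(minorStr[:end])
	if err != nil {
		return false, fmt.Errorf("parsing minor version of %q: %v", release, err)
	}
	if major != wantMajor {
		return major > wantMajor, nil
	}
	return minor >= wantMinor, nil
}

func main() {
	for _, release := range []string{"6.12.30", "6.13.0", "6.14.4-arch1-1"} {
		ok, err := kernelAtLeast(release, 6, 13)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: new enough for RTL8125D: %v\n", release, ok)
	}
}
```

Comparing numbers instead of strings matters here: a plain string comparison would sort "6.9" after "6.13".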

Even with the latest firmware, there is some concern around stability and ASPM support. See for example this ServerFault post by someone working on the r8169 driver. But, despite the Intel 1 GbE options being well-supported at this point, Intel’s 2.5 GbE options might not fare any better than the Realtek ones: I found reports of instability with Intel’s 2.5 GbE network cards.

That said, aside from the annoying firmware requirements, the Realtek 2.5 GbE card seems to work fine for me in practice.

Mainboard

Despite the suboptimal network card choice, I decided to stick to the ASUS PRIME series of mainboards, as I have had good experiences with those in my past few builds. Here are a couple of thoughts on the ASUS PRIME Z890-P mainboard I went with:

I like the quick-release PCIe mechanism: ASUS understood that people had trouble unlocking large graphics cards from their PCIe slot, so they added a lever-like mechanism that is easily reachable. In my couple of usages, this worked pretty well!
I wrote about slow boot times with my 2022 PC build that were caused by time-consuming memory training. On this ASUS board, I noticed that the board blinks the Power LED to signal that memory training is in progress. Very nice! It hadn’t occurred to me previously that the various phases of the boot could be signaled by different Power LED blinking patterns :)
- The downside of this feature is: While the machine is in suspend-to-RAM, the Power LED also blinks! This is annoying, so I might just disconnect the Power LED entirely.
The UEFI firmware includes what they call a Q-Dashboard: An overview of what is installed/connected in which slot. Quite nice.

One surprising difference between the two mainboards I tested was that the AsRock Z890 Pro-A does not seem to report the correct DIMM clock in lshw, whereas the ASUS does:

--- lshw-intel-285k-asrock.txt	2025-04-30 20:35:24 +0200
+++ lshw-intel-285k-asus.txt		2025-04-30 21:39:52 +0200
      *-firmware
           description: BIOS
-          vendor: American Megatrends International, LLC.
+          vendor: American Megatrends Inc.
           physical id: 0
-          version: 2.25
-          date: 03/24/2025
+          version: 1601
+          date: 02/07/2025
           size: 64KiB
-          capacity: 32MiB
+          capacity: 16MiB
           capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd acpi biosbootspecification uefi
[…]
      *-memory
           description: System Memory
-          physical id: 9
+          physical id: e
           slot: System board or motherboard
           size: 64GiB
         *-bank:0
-             description: DIMM [empty]
+             description: [empty]
              physical id: 0
              slot: Controller0-ChannelA-DIMM0
         *-bank:1
-             description: DIMM Synchronous 4800 MHz (0,2 ns)
+             description: DIMM Synchronous 6400 MHz (0,2 ns)
              product: CMK64GX5M2B6400C32
              vendor: Corsair
              physical id: 1
@@ -40,13 +42,13 @@
              slot: Controller0-ChannelA-DIMM1
              size: 32GiB
              width: 64 bits
-             clock: 505MHz (2.0ns)
+             clock: 2105MHz (0.5ns)
         *-bank:2
-             description: DIMM [empty]
+             description: [empty]
              physical id: 2
              slot: Controller0-ChannelB-DIMM0
         *-bank:3
-             description: DIMM Synchronous 4800 MHz (0,2 ns)
+             description: DIMM Synchronous 6400 MHz (0,2 ns)
              product: CMK64GX5M2B6400C32
              vendor: Corsair
              physical id: 3
@@ -54,7 +56,7 @@
              slot: Controller0-ChannelB-DIMM1
              size: 32GiB
              width: 64 bits
-             clock: 505MHz (2.0ns)
+             clock: 2105MHz (0.5ns)
[…]

I haven’t checked if there are measurable performance differences (e.g. if the XMP profile is truly active), but at least you now know to not necessarily trust what lshw can show you.

CPU fan

I am a long-time fan of Noctua’s products: This company makes silent fans with great cooling capacity that work reliably! For many years, I have swapped out all the fans of each of my PCs with Noctua fans, and it was always an upgrade. Highly recommended.

Hence, there was no question that I would pick the latest and greatest Noctua CPU cooler for this build: the Noctua NH-D15 G2. There are a couple of things to pay attention to with this cooler:

I decided to configure it with one fan instead of two fans: Using only one fan is the quietest option, yet still provides plenty of cooling capacity for this build.
There are 3 different versions that differ in how their base plate is shaped. Noctua recommends: “For LGA1851, we generally recommend the regular standard version with medium base convexity” (https://noctua.at/en/intel-lga1851-all-you-need-to-know)
With a height of 168 mm, this cooler fits well into the Fractal Define 7 Compact Black.

CPU and GPU: Idle Power vs. Peak Performance

CPU choice: Intel over AMD

Probably the point that raises most questions about this build is why I selected an Intel CPU over an AMD CPU. The primary reason is that Intel CPUs are so much better at power saving!

Let me explain: Most benchmarks online are for gamers and hence measure a usage curve that goes “start game, run PC at 100% resources for hours”. Of course, when you never let the machine idle, you would care about power efficiency: how much power do you need to use to achieve the desired result?

My use-case is software development, not gaming. My usage curve oscillates between “barely any usage because Michael is reading text” and “complete this compilation as quickly as possible with all the power available”. For me, I need both absolute power consumption at idle, and absolute performance to be best-of-class.

AMD’s CPUs offer great performance (the recently released Ryzen 9 9950X3D is even faster than the Intel Core Ultra 9 285K), and have great power efficiency, but poor power consumption at idle: With ≈35W of idle power draw, Zen 5 CPUs consume ≈3x as much power as Intel CPUs!

Intel’s CPUs offer great performance (like AMD), but excellent power consumption at idle.

Therefore, I can’t in good conscience buy an AMD CPU, but if you want a fast gaming-only PC or run an always-loaded HPC cluster with those CPUs, definitely go ahead :)

Graphics card: nvidia over AMD

I don’t necessarily recommend any particular nvidia graphics card, but I have had to stick to nvidia cards because they are the only option that works with my picky Dell UP3218K monitor.

From time to time, I try out different graphics cards. Recently, I got myself an AMD Radeon RX 9070 because I read that it works well with open source drivers.

While the Radeon RX 9070 works with my monitor (great!), it seems to consume 45W in idle, which is much higher than my nvidia cards, which idle at ≈ 20W. This is unacceptable to me: Aside from high power costs and wasting precious resources, the high power draw also means that my room will be hotter in summer and the fans need to spin faster and therefore louder.

People asked me on Social Media if this could be a measurement error (like, the card reporting inaccurate values), so I double-checked with a myStrom WiFi Switch and confirmed that with the Radeon card, the PC indeed draws 20-30W more from the wall socket.

Why Low Idle Power is so important

In the comments for my previous blog post about the first build of this machine not running stable, people were asking why it is worth it to optimize a few watts of power usage. People calculate what higher power usage might cost, put it in relation to the total cost of the components, and conclude that saving ≈10% of the price can’t possibly be worth the effort.
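For reference, the kind of back-of-the-envelope calculation people do looks roughly like the following Go sketch. The electricity price of 0.30 CHF per kWh and the assumption that the machine is powered on around the clock are placeholders for illustration, not measurements; plug in your own numbers.

```go
package main

import "fmt"

// annualKWh converts a constant extra power draw in watts into
// kilowatt-hours per year, given how many hours per day the machine runs.
func annualKWh(extraWatts, hoursPerDay float64) float64 {
	return extraWatts * hoursPerDay * 365 / 1000
}

func main() {
	// Assumptions for illustration only: a price of 0.30 CHF per kWh and a
	// machine that is powered on 24 hours per day.
	const pricePerKWh = 0.30
	const hoursPerDay = 24

	for _, watts := range []float64{10, 20, 30} {
		kwh := annualKWh(watts, hoursPerDay)
		fmt.Printf("%2.0f W extra: %6.1f kWh/year, ≈%6.2f CHF/year\n",
			watts, kwh, kwh*pricePerKWh)
	}
}
```

The calculation only covers the electricity bill. It says nothing about fan noise or room temperature, which is what the rest of this section is about.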

Let me try to illustrate the importance of low idle power with this anecdote: For one year, I was suffering from an nvidia driver bug that meant the GPU would not clock down to the most efficient power-saving mode (because of the high resolution of my monitor). The 10-20W of difference should have been insignificant. Yet, when the bug was fixed, I noticed how my PC got quieter (fans don’t need to spin up) and my room noticeably cooled down, which was great as it was peak temperatures in summer.

To me, having a whisper-quiet computing environment that does not heat up my room is a great, actual, real-life, measurable benefit. Not wasting resources and saving a tiny amount of money is a nice cherry on top.

Obviously all the factors are very dependent on your specific situation: Your house’s thermal behavior might differ from mine, your tolerance for noise (and/or baseline noise levels) might be different, you might put more/less weight on resource usage, etc.

Installation

UEFI setup

On the internet, I read that there was some issue related to the Power Limits that mainboards come with by default. Therefore, I did a UEFI firmware update immediately after getting the mainboard. I upgraded to version 1404 (2025/01/10) using the provided ZIP file (PRIME-Z890-P-ASUS-1404.zip) on an MS-DOS FAT-formatted USB stick with the EZ Flash tool in the UEFI firmware interface. Tip: do not extract the ZIP file, otherwise the EZ Flash tool cannot update the Intel ME firmware. Just put the ZIP file onto the USB disk as-is.

I verified that with this UEFI version, the Power Limit 1 (PL1) is 250W, and ICCMAX=347A, which are exactly the values that Intel recommends. Great!

I also enabled XMP and verified that memtest86 reported no errors.

Software setup: early adopter pains

To copy over the data from the old disk to the new disk, I wanted to boot a live Linux distribution (specifically, grml.org) and follow my usual procedure: boot with the old disk and the new (empty) disk, then use dd to copy the data. It’s nice and simple, hard to screw up.

Unfortunately, while grml 2024.12 technically does boot up, there are two big problems:

There is no network connectivity because the kernel and linux-firmware versions are too old.
- Kernel commit r8169: add support for RTL8125D is not included.
I could not get Xorg to work at all. Not with the Intel integrated GPU, nor with the nvidia dedicated GPU. Not with nomodeset or any of the other options in the grml menu. This wasn’t merely a convenience problem: I needed to use gparted (the graphical version) for its partition moving/resizing support.

Ultimately, it was easier to upgrade my old PC to Linux 6.13 and linux-firmware 20250109, then put in the new disk and copy over the installation.

TRIM your SSDs

SSD disks can degrade over time, so it is essential that the Operating System tells the SSD firmware about freed-up blocks (for wear leveling). When using full-disk encryption, all involved layers need to have TRIM support enabled.

I think I saw the effect of an incorrectly configured TRIM setup in action back in 2022, when I copied my data from a Force MP600 to a WD Black SN850, which unexpectedly took many hours!

To make sure my disk has a long and healthy life, I double-checked that both periodic and continuous TRIM are enabled on my Arch Linux system: The fstab(5) file contains the discard option (and mount(8) lists the discard option), and fstrim.service ran within the last week:

systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
fstrim[779617]: /boot: 10.1 GiB (10799427584 bytes) trimmed on /dev/nvme0n1p1
fstrim[779617]: /: 1.8 TiB (2018906263552 bytes) trimmed on /dev/mapper/cryptroot
systemd[1]: fstrim.service: Deactivated successfully.
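The continuous-TRIM half of that check can also be scripted. The following Go sketch lists the mount points in /proc/mounts whose options include discard. The function names hasDiscard and discardMounts are made up for this example, and the sketch only covers the filesystem mount option: it does not inspect the encryption layer and does not tell you whether fstrim.service ran.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hasDiscard reports whether a comma-separated mount option string
// (the fourth field of /proc/mounts or /etc/fstab) enables continuous TRIM.
func hasDiscard(options string) bool {
	for _, opt := range strings.Split(options, ",") {
		// Accept "discard" as well as variants such as "discard=async".
		if opt == "discard" || strings.HasPrefix(opt, "discard=") {
			return true
		}
	}
	return false
}

// discardMounts returns the mount points found in /proc/mounts-formatted
// content whose options include discard.
func discardMounts(mounts string) []string {
	var result []string
	for _, line := range strings.Split(mounts, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 4 {
			continue // blank or malformed line
		}
		if hasDiscard(fields[3]) {
			result = append(result, fields[1])
		}
	}
	return result
}

func main() {
	b, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for _, mountpoint := range discardMounts(string(b)) {
		fmt.Println("continuous TRIM enabled on", mountpoint)
	}
}
```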

Speaking of copying data: the transfer from my WD Black SN850 to my Samsung 990 PRO ran at 856 MB/s and took about 40 minutes in total.

Performance

Here are the total times for a couple of typical workloads I run:

Workload	12900K (2022)	285K (2025)
build Go 1.24.3 (`cd src; ./make.bash`)	≈35s	≈26s
gokrazy/rsync tests (`make test`)	≈0.5s	≈0.4s
gokrazy UEFI test (`go test ./integration/...`)	≈30s	≈10s
gokrazy Linux compile (`gokr-rebuild-kernel -cross=arm64`)	3m 13s	2m 7s

The performance boost is great! Building Linux kernels a whole minute faster is really nice.

Stability issues

In March, I published an article about how the first build of this machine was not stable, in which you can read in detail about the various crashes I ran into.

Now, in early May, I know for sure that the CPU was defective, after a lengthy trouble-shooting session in which I swapped out all the other parts of this PC, sent back the CPU and got a new one.

The CPU was the most annoying component to diagnose in this build because it’s an LGA 1851 socket and I don’t (yet) have any other machines that use that same socket. AMD’s approach of sticking to each socket for a longer time would have been better in this situation.

Stress testing

When I published my earlier blog post about the PC being unstable, I did not really know how to reliably trigger the issue. Some compute-intensive tasks like running a Django test suite seemed to trigger the issue. I suspect that the problem somehow got worse, because when I started stress testing the machine, suddenly it would crash every time when building a Linux kernel.

That got me curious to see if other well-known CPU stress testers like Prime95 would show problems, and indeed: within seconds, Prime95 would report errors.

I figured I would use Prime95 as a quick signal: if it reports errors, the machine is faulty. This typically happens within seconds of starting Prime95.

If Prime95 reported no errors, I would use Linux kernel compilation as a slow signal: if I can successfully build a kernel, the machine is likely stable enough.

The specific setup I used is to run ./mprime -m, hit N (do not participate in distributed computation projects), then Enter a few times to confirm the defaults. Eventually, Prime95 starts calculating, which pushes the CPU to 100% usage (see the dstat(1)-like output by my gokrazy/stat implementation) and draws the expected ≈300W of power from the wall:

photo of running Prime95 to stress-test a CPU

In addition, I ran MemTest86 for a few hours.

To be clear: I also successfully ran MemTest86 on the previous, unstable build, so only running MemTest86 is not good enough if you are dealing with a faulty CPU.

RMA timeline

Jan 15th: I receive the components for my new PC
- In January and February, the PC crashes occasionally.
Mar 4th: I switch back to my old PC and start writing my blog post
Mar 19th: I publish my blog post about the machine not being stable
- The online discussion does not result in any interesting tips or leads.
Mar 20th: I order the AsRock Z890 Pro-A mainboard to ensure the mainboard is OK
Mar 24th: the AsRock Z890 Pro-A arrives
Apr 5th (Sat): started an RMA for the CPU
- They ask me to send the CPU to orderflow, which is the merchant that fulfilled my order.
- Typically, I prefer buying directly at digitec, but many PC components seem to only be available from orderflow on digitec nowadays.
Apr 9th (Wed): package arrives at orderflow (digitec gave me a non-priority return label)
Apr 14th (Mon): I got the following mail from digitec’s customer support and had to explain that I have thoroughly diagnosed the CPU as defective (a link to my blog post was sufficient):

The merchant has registered this with the manufacturer, who has the following questions:

To make sure we understand you correctly: You tested the CPU on two different motherboards and the same problem persists?

Could you tell us the brand and model of the two motherboards you used?

Was the latest BIOS version used on both motherboards?

Did the problem exist from the beginning, or did it only appear later?

Was the processor overclocked? (Please note that overclocking voids the warranty.)

Apr 25th (Fri): orderflow hands the replacement CPU to Swiss Post
May 1st (Thu): the machine successfully passes stress tests; I start using it

In summary, I spent March without a working PC, but that was because I didn’t have much time to pursue the project. Then, I spent April without a working PC because RMA’ing an Intel CPU through digitec seems pretty slow. I would have wished for a little more trust and a replacement CPU right away.

Conclusion

What a rollercoaster and time sink this build was! I have never received a faulty-on-arrival CPU in my entire life before. How did the CPU I first received pass Intel’s quality control? Or did it pass QC, but was damaged in transport? I will probably never know.

From now on, I know to extensively stress test new PC builds for stability to detect such issues quicker. Should the CPU be faulty, unfortunately getting it replaced is a month-long process — it’s very annoying to have such a costly machine just gather dust for a month.

But, once the faulty component was replaced, this is my best PC build yet:

The case is the perfect size for the components and offers incredibly convenient access to all components throughout the entire lifecycle of this PC, including the troubleshooting period, and the later stages of its life when this PC will be rotated into its “lab machine” period before I sell it second-hand to someone who will hopefully use the machine for another few years.

The machine is quiet, draws little power (for such a powerful machine) and really packs a punch!

As usual, I run Linux on this PC and haven’t noticed any problems in my day-to-day usage. I use suspend-to-RAM multiple times a day without any issues.

I hope some of these details were interesting and useful to you in your own PC builds!

If you want to learn about which peripherals I use aside from my 8K monitor (e.g. the Kinesis Advantage keyboard, Logitech MX Ergo trackball, etc.), check out my post stapelberg uses this: my 2020 desk setup. I might publish an updated version at some point :)

I have recently started using the grobi program by Alexander Neumann again and was delighted to discover that it makes using my fiddly (but wonderful) Dell 32-inch 8K monitor (UP3218K) much more convenient: I get a signal more quickly than with my previous, sleep-based approach.

Previously, when my PC woke up from suspend-to-RAM, there were two scenarios:

① The monitor was connected. My sleep program would power on the monitor (if needed), sleep a little while and then run xrandr(1) to (hopefully) configure the monitor correctly.
② The monitor was not connected, for example because it was still connected to my work PC.

In scenario ②, or if the one-shot configuration attempt in scenario ① fails, I would need to SSH in from a different computer and run xrandr manually so that the monitor would show a signal:

% DISPLAY=:0 xrandr \
  --output DP-4 --mode 3840x4320 --panning 0x0+0+0 \
  --output DP-2 --right-of DP-4 --mode 3840x4320 --panning 0x0+3840+0

Automatic monitor configuration with grobi

I have now completely solved this problem by creating the following ~/.config/grobi.conf file:

rules:
  - name: UP3218K

    outputs_connected: [DP-2, DP-4]

    # DP-4 is left, DP-2 is right
    configure_row:
        - DP-4@3840x4320
        - DP-2@3840x4320

    # atomic instructs grobi to only call xrandr once and configure all the
    # outputs. This does not always work with all graphic cards, but is
    # needed to successfully configure the UP3218K monitor.
    atomic: true

…and installing / enabling grobi (on Arch Linux) using:

% sudo pacman -S grobi
% systemctl --user enable --now grobi

Whenever grobi detects that my monitor is connected (it listens for X11 RandR output change events), it will run xrandr(1) to configure the monitor resolution and positioning.

To check what grobi is seeing/doing, you can use:

% systemctl --user status grobi
% journalctl --user -u grobi

For example, on my system, I see:

grobi: 18:31:48.823765 outputs: [HDMI-0 (primary) DP-0 DP-1 DP-2 (connected) 3840x2160+ [DEL-16711-808727372-DELL UP3218K-D2HP805I043L] DP-3 DP-4 (connected) 3840x21>
grobi: 18:31:48.823783 new rule found: UP3218K
grobi: 18:31:48.823785 enable outputs: [DP-4@3840x4320 DP-2@3840x4320]
grobi: 18:31:48.823789 using one atomic call to xrandr
grobi: 18:31:48.823806 running command /usr/bin/xrandr xrandr --output DP-4 --mode 3840x4320 --output DP-2 --mode 3840x4320 --right-of DP-4
grobi: 18:31:49.285944 new RANDR change event received

Notably, the instructions for getting out of a bad state (no signal) are now to power off the monitor and power it back on again. This will result in RandR output change events, which will trigger grobi, which will run xrandr, which configures the monitor. Nice!

Why not autorandr?

No particular reason. I knew grobi.

If nothing else, grobi is written in Go, so it’s likely to keep working smoothly over the years.

Does grobi work on Wayland?

Probably not. There is no mention of Wayland over on the grobi repository.

Bonus: my Suspend-to-RAM setup

As a bonus, this section describes the other half of my monitor-related automation.

When I suspend my PC to RAM, I either want to wake it up manually later, for example by pressing a key on the keyboard or by sending a Wake-on-LAN packet, or I want it to wake up automatically each morning at 6:50 — that way, daily cron jobs have some time to run before I start using the computer.

To accomplish this, I use zleep, a wrapper program around rtcwake(8) and systemctl suspend that integrates with the myStrom switch smart plug to turn off power to the monitor entirely. This is worthwhile because the monitor draws 30W even in standby!

package main

import (
	"context"
	"flag"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"
	"os/exec"
	"time"
)

var (
	resume = flag.Bool("resume",
		false,
		"run resume behavior only (turn on monitor via smart plug)")

	noMonitor = flag.Bool("no_monitor",
		false,
		"disable turning off/on monitor")
)

func monitorPower(ctx context.Context, method, cmnd string) error {
	if *noMonitor {
		log.Printf("[monitor power] skipping because -no_monitor flag is set")
		return nil
	}
	log.Printf("[monitor power] command: %v", cmnd)
	u, err := url.Parse("http://myStrom-Switch-A46FD0/" + cmnd)
	if err != nil {
		return err
	}
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		req, err := http.NewRequest(method, u.String(), nil)
		if err != nil {
			return err
		}
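		// Bound each attempt to 5 seconds. Cancel right after the request
		// instead of deferring, so that retries do not pile up deferred calls.
		reqCtx, canc := context.WithTimeout(ctx, 5*time.Second)
		req = req.WithContext(reqCtx)
		resp, err := http.DefaultClient.Do(req)
		canc()
		if err != nil {
			log.Print(err)
			time.Sleep(1 * time.Second)
			continue
		}
		resp.Body.Close() // only the status code is needed
		if resp.StatusCode != http.StatusOK {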
			log.Printf("unexpected HTTP status code: got %v, want %v", resp.Status, http.StatusOK)
			time.Sleep(1 * time.Second)
			continue
		}
		log.Printf("[monitor power] request succeeded")
		return nil
	}
}

func nextWakeup(now time.Time) time.Time {
	midnight := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, time.Local)
	if now.Hour() < 6 {
		// wake up today
		return midnight.Add(6*time.Hour + 50*time.Minute)
	}

	// wake up tomorrow
	return midnight.Add(24 * time.Hour).Add(6*time.Hour + 50*time.Minute)
}

func runResume() error {
	// Retry for up to one minute to give the network some time to come up
	ctx, canc := context.WithTimeout(context.Background(), 1*time.Minute)
	defer canc()
	if err := monitorPower(ctx, "GET", "relay?state=1"); err != nil {
		log.Print(err)
	}
	return nil
}

func zleep() error {
	ctx := context.Background()

	now := time.Now().Truncate(1 * time.Second)
	wakeup := nextWakeup(now)
	log.Printf("now   : %v", now)
	log.Printf("wakeup: %v", wakeup)
	log.Printf("wakeup: %v (timestamp)", wakeup.Unix())

	// assumes hwclock is running in UTC (see timedatectl | grep local)

	// Power the monitor off in 15 seconds.
	// mode=on is intentional: https://api.mystrom.ch/#e532f952-36ea-40fb-a180-a57b835f550e
	// - the switch will be turned on (already on, so this is a no-op)
	// - the switch will wait for 15 seconds
	// - the switch will be turned off
	if err := monitorPower(ctx, "POST", "timer?mode=on&time=15"); err != nil {
		log.Print(err)
	}

	sleep := exec.Command("sh", "-c", fmt.Sprintf("sudo rtcwake -m no --verbose --utc -t %v && sudo systemctl suspend", wakeup.Unix()))
	sleep.Stdout = os.Stdout
	sleep.Stderr = os.Stderr
	fmt.Printf("running %v\n", sleep.Args)
	if err := sleep.Run(); err != nil {
		return fmt.Errorf("%v: %v", sleep.Args, err)
	}

	return nil
}

func main() {
	flag.Parse()
	if *resume {
		if err := runResume(); err != nil {
			log.Fatal(err)
		}
	} else {
		if err := zleep(); err != nil {
			log.Fatal(err)
		}
	}
}

To turn power to the monitor on after resuming, I placed the following shell script in /lib/systemd/system-sleep/zleep.sh:

#!/bin/sh

case "$1" in
	pre)	exit 0
		;;
	post)	/usr/local/bin/zleep -resume
		exit 0
		;;
	*)	exit 1
		;;
esac

Once power is on, grobi will detect and configure the monitor.

Here is the program in action:

2025/05/06 21:58:32 now   : 2025-05-06 21:58:32 +0200 CEST
2025/05/06 21:58:32 wakeup: 2025-05-07 06:50:00 +0200 CEST
2025/05/06 21:58:32 wakeup: 1746593400 (timestamp)
2025/05/06 21:58:32 [monitor power] command: timer?mode=on&time=15
2025/05/06 21:58:32 [monitor power] request succeeded
running [sh -c sudo rtcwake -m no --verbose --utc -t 1746593400 && sudo systemctl suspend]
Using UTC time.
	delta   = 0
	tzone   = 0
	tzname  = UTC
	systime = 1746561512, (UTC) Tue May  6 19:58:32 2025
	rtctime = 1746561512, (UTC) Tue May  6 19:58:32 2025
alarm 1746593400, sys_time 1746561512, rtc_time 1746561512, seconds 0
rtcwake: wakeup using /dev/rtc0 at Wed May  7 04:50:00 2025
suspend mode: no; leaving

In January I ordered the components for a new PC and expected that I would publish a successor to my 2022 high-end Linux PC 🐧 article. Instead, I am now sitting on a PC which regularly encounters crashes of the worst-to-debug kind, so I am publishing this article as a warning for others in case you wanted to buy the same hardware.

Components

Which components did I pick for this build? Here’s the full list:

Price	Type	Article
140 CHF	Case	Fractal Define 7 Compact Black Solid
155 CHF	Power Supply	Corsair RM850x
233 CHF	Mainboard	ASUS PRIME Z890-P
620 CHF	CPU	Intel Core Ultra 9 285K
120 CHF	CPU fan	Noctua NH-D15 G2
39 CHF	Case fan	Noctua NF-A14 PWM (140 mm)
209 CHF	RAM	64 GB DDR5-6400 Corsair Vengeance (2 x 32GB)
280 CHF	Disk	4000 GB Samsung 990 Pro
940 CHF	GPU	Inno3D GeForce RTX4070 Ti

Total: ≈1800 CHF, excluding the Graphics Card I re-used from a previous build.

…and the next couple of sections go into detail on how I selected these components.

Case

I really like building components into this case and working with it: there are no sharp edges, the mechanisms are a pleasure to use, and the cable management is well thought out.

Power Supply

I wanted to keep my options open regarding upgrading to an nVidia 50xx series graphics card at a later point. Those models have a TGP (“Total Graphics Power”) of 575 watts, so I needed a power supply that delivers enough power for the whole system even at peak power usage in all dimensions.

I ended up selecting the Corsair RM850x, which reviews favorably (“leader in the 850W gold category”) and was available at my electronics store of choice.

This was a good choice: the PSU indeed runs quiet, and I really like the power cables (e.g. the GPU cable) that they include: they are very flexible, which makes them easy to cable-manage.

SSD disk

Instead, I picked the Samsung 990 Pro with 4 TB. I have had good experiences with the Samsung Pro series over the years (never had one die or degrade in performance), and my previous 2 TB disk is starting to fill up, so the extra storage space is appreciated.

Mainboard

In the 2.5 GbE space, the main players seem to be Realtek and Intel. Most mainboard vendors opted for Realtek as far as I could see.

Linux includes the r8169 driver for Realtek network cards, but you need a recent-enough Linux version (6.13+) that includes commit “r8169: add support for RTL8125D”, accompanied by a recent-enough linux-firmware package. Even then, there is some concern around stability and ASPM support. See for example this ServerFault post by someone working on the r8169 driver.

Despite the Intel 1 GbE options being well-supported at this point, Intel’s 2.5 GbE options might not fare any better than the Realtek ones: I found reports of instability with Intel’s 2.5 GbE network cards.

Aside from the network cards, I decided to stick to the ASUS PRIME series of mainboards, as I have had good experiences with those in my past few builds. Here are a couple of thoughts on the ASUS PRIME Z890-P mainboard I went with:

I like the quick-release PCIe mechanism: ASUS understood that people had trouble unlocking large graphics cards from their PCIe slot, so they added a lever-like mechanism that is easily reachable. In my couple of usages, this worked pretty well!
I wrote about slow boot times with my 2022 PC build that were caused by time-consuming memory training. On this ASUS board, I noticed that they blink the Power LED to signal that memory training is in progress. Very nice! It hadn’t occurred to me previously that the various phases of the boot could be signaled by different Power LED blinking patterns :)
- The downside of this feature is: While the machine is in suspend-to-RAM, the Power LED also blinks! This is annoying, so I might just disconnect the Power LED entirely.
The UEFI firmware includes what they call a Q-Dashboard: An overview of what is installed/connected in which slot. Quite nice:

CPU fan

I am a long-time fan of Noctua’s products: This company makes silent fans with great cooling capacity that work reliably! For many years, I have swapped out every one of my PC’s fans with Noctua fans, and it was always an upgrade. Highly recommended.

Hence, it is no question that I picked the latest and greatest Noctua CPU cooler for this build: the Noctua NH-D15 G2. There are a couple of things to pay attention to with this cooler:

I decided to configure it with one fan instead of two fans: Using only one fan will be the quietest setup, yet still have plenty of cooling capacity for this setup.
There are 3 different versions that differ in how their base plate is shaped. Noctua recommends: “For LGA1851, we generally recommend the regular standard version with medium base convexity” (https://noctua.at/en/intel-lga1851-all-you-need-to-know)
The height of this cooler is 168 mm. This fits well into the Fractal Define 7 Compact Black.

CPU

Probably the point that raises most questions about this build is why I selected an Intel CPU over an AMD CPU. The primary reason is that Intel CPUs are so much better at power saving!

Intel’s CPUs offer great performance (like AMD’s), combined with excellent (that is, low) power consumption at idle.

Therefore, I can’t in good conscience buy an AMD CPU, but if you want a fast gaming-only PC or run an always-loaded HPC cluster with those CPUs, definitely go ahead :)

Graphics card

I don’t necessarily recommend any particular nVidia graphics card, but I have had to stick to nVidia cards because they are the only option that works with my picky Dell UP3218K monitor.

From time to time, I try out different graphics cards. Recently, I got myself an AMD Radeon RX 9070 because I read that it works well with open source drivers.

While the Radeon RX 9070 works with my monitor (great!), it seems to consume 45W in idle, which is much higher than my nVidia cards, which idle at ≈ 20W. This is unacceptable to me: Aside from high power costs and wasting precious resources, the high power draw also means that my room will be hotter in summer and the fans need to spin faster and therefore louder.

Maybe I’ll write a separate article about the Radeon RX 9070.

Installation

UEFI setup

On the internet, I read that there was some issue related to the Power Limits that mainboards come with by default. Therefore, I did a UEFI firmware update first thing after getting the mainboard. I upgraded to version 1404 (2025/01/10) using the provided ZIP file (PRIME-Z890-P-ASUS-1404.zip) on an MS-DOS FAT-formatted USB stick with the EZ Flash tool in the UEFI firmware interface. Tip: do not extract the ZIP file, otherwise the EZ Flash tool cannot update the Intel ME firmware. Just put the ZIP file onto the USB disk as-is.

I verified that with this UEFI version, the Power Limit 1 (PL1) is 250W, and ICCMAX=347A, which are exactly the values that Intel recommends. Great!

I also enabled XMP and verified that memtest86 reported no errors.

Software setup: early adopter pains

Unfortunately, while grml 2024.12 technically does boot up, there are two big problems:

There is no network connectivity because the kernel and linux-firmware versions are too old.
- r8169: add support for RTL8125D
I could not get Xorg to work at all. Not with the Intel integrated GPU, nor with the nVidia dedicated GPU. Not with nomodeset or any of the other options in the grml menu. This wasn’t merely a convenience problem: I needed to use gparted (the graphical version) for its partition moving/resizing support.

Ultimately, it was easier to upgrade my old PC to Linux 6.13 and linux-firmware 20250109, then put in the new disk and copy over the installation.

Stability issues

At this point (early February), I switched to this new machine as my main PC.

Unfortunately, I could never get it to run stably! This journal shows you some of the issues I faced and what I tried to troubleshoot them.

Xorg dying after resume-from-suspend

One of the first issues I encountered with this system was that after resuming from suspend-to-RAM, I was greeted with a login window instead of my X11 session. The logs say:

(EE) NVIDIA(GPU-0): Failed to acquire modesetting permission.
(EE) Fatal server error:
(EE) EnterVT failed for screen 0
(EE) 
(EE) 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) 
(WW) NVIDIA(0): Failed to set the display configuration
(WW) NVIDIA(0):  - Setting a mode on head 0 failed: Insufficient permissions
(WW) NVIDIA(0):  - Setting a mode on head 1 failed: Insufficient permissions
(WW) NVIDIA(0):  - Setting a mode on head 2 failed: Insufficient permissions
(WW) NVIDIA(0):  - Setting a mode on head 3 failed: Insufficient permissions
(EE) Server terminated with error (1). Closing log file.

I couldn’t find any good tips online for this error message, so I figured I’d wait and see how frequently this happens before investigating further.

Feb 18: xHCI host controller dying

On Feb 18th, after resume-from-suspend, none of my USB peripherals would work anymore! This affected all USB ports of the machine and could not be fixed, not even by a reboot, until I fully killed power to the machine! In the kernel log, I saw the following messages:

xhci_hcd 0000:80:14.0: xHCI host not responding to stop endpoint command
xhci_hcd 0000:80:14.0: xHCI host controller not responding, assume dead
xhci_hcd 0000:80:14.0: HC died; cleaning up

Feb 24: xHCI host controller dying

The HC dying issue happened again when I was writing an SD card in my USB card reader:

xhci_hcd 0000:80:14.0: HC died; cleaning up

Feb 24: → UEFI update, disable XMP

To try and fix the host controller dying issue, I updated the UEFI firmware to version 1601 and disabled the XMP RAM profile.

Feb 26: → switch back from GeForce 4070 Ti to 3060 Ti

To rule out any GPU-specific issues, I decided to switch back from the Inno3D GeForce RTX4070 Ti to my older MSI GeForce RTX 3060 Ti.

Feb 28: PC dying on suspend-to-RAM

On Feb 28th, my PC did not resume from suspend-to-RAM. It would not even react to a ping, so I had to hard-reset the machine. When I checked the syslog afterwards, there were no entries.

I checked my power monitoring and saw that the machine consumed 50W (well above idle power, and far above suspend-to-RAM power) throughout the entire night. Hence, I suspect that the suspend-to-RAM did not work correctly and the machine never actually suspended.

Mar 4th: PC dying when running Django tests

On March 4th, I was running the test suite for a medium-sized Django project (= 100% CPU usage) when I encountered a really hard crash: The machine stopped working entirely, meaning all peripherals like keyboard and mouse stopped responding, and the machine did not even respond to a network ping anymore.

At this point, I had enough and switched back to my 2022 PC.

Conclusion

What use is a computer that doesn’t work? My hierarchy of needs contains stability as the foundation, then speed and convenience. This machine exhausted my tolerance for frustration with its frequent crashes.

Manawyrm actually warned me about the ASUS board:

ASUS boards are a typical gamble as always – they fired their firmware engineers about 10 years ago, so you might get a nightmare of ACPI troubleshooting hell now (or it’ll just work). ASRock is worth a look as a replacement if that happens. Electronics are usually solid, though…

I didn’t expect that this PC would crash so hard, though. Like, if it couldn’t suspend/resume that would be one thing (a dealbreaker, but somewhat expected and understandable, probably fixable), but a machine that runs into a hard-lockup when compiling/testing software? No thanks.

I will buy a different mainboard to see if that helps, likely the ASRock Z890 Pro-A. If you have any recommendations for a Z890 mainboard that actually works reliably, please let me know!

Update 2025-04-17: I have received the ASRock Z890 Pro-A, but the machine shows exactly the same symptoms! I also swapped the power supply, which also did not help. Running Prime95 crashed almost immediately. At this point, I have to assume the CPU itself is defective and have started an RMA. I will post another update once (if?) I get a replaced CPU.

Update 2025-05-11: The CPU was faulty indeed! See My 2025 high-end Linux PC for a new article on this build, now with a working CPU.

I was helping someone get my gokrazy/rsync implementation set up to synchronize RPKI data (used for securing BGP routing infrastructure), when we discovered that with the right invocation, my rsync receiver would just hang indefinitely.

This was a quick problem to solve, but in the process, I realized that I should probably write down a few Go debugging tips I have come to appreciate over the years!

Scenario: hanging Go program

If you want to follow along, you can reproduce the issue by building an older version of gokrazy/rsync, just before the bug fix commit (you’ll need Go 1.22 or newer):

git clone https://github.com/gokrazy/rsync
cd rsync
git reset --hard 6c89d4dda3be055f19684c0ed56d623da458194e^
go install ./cmd/...

Now we can try to sync the repository:

% gokr-rsync \
  -rtO \
  --delete \
  rsync://rsync.paas.rpki.ripe.net/repo/ \
  /tmp/rpki-repo
[…]
2025/02/08 09:35:10 Opening TCP connection to rsync.paas.rpki.ripe.net:873
2025/02/08 09:35:10 rsync module "repo", path "repo/"
2025/02/08 09:35:10 (Client) Protocol versions: remote=31, negotiated=27
2025/02/08 09:35:10 Client checksum: md4
2025/02/08 09:35:10 sending daemon args: [--server --sender -tr . repo/]
2025/02/08 09:35:10 exclusion list sent
2025/02/08 09:35:10 receiving file list
2025/02/08 09:35:11 [Receiver] i=0 ? . mode=40755 len=4096 uid=0 gid=0 flags=?
[…]
2025/02/08 09:35:11 [Receiver] i=89 ? clonoth/1/3139332e33322e3130302e302f32342d3234203d3e203537313936.roa mode=100644 len=1747 uid=0 gid=0 flags=?

…and then the program just sits there.

Tip 1: Press Ctrl+\ (SIGQUIT) to print a stack trace

The easiest way to look at where a Go program is hanging is to press Ctrl+\ (backslash) to make the terminal send it a SIGQUIT signal. When the Go runtime receives SIGQUIT, it prints a stack trace to the terminal before exiting the process. This behavior is enabled by default and can be customized via the GOTRACEBACK environment variable, see the runtime package docs.

Here is what the output looks like in our case. I have made the font small so that you can recognize the shape of the output (the details are not important, continue reading below):

^\SIGQUIT: quit
PC=0x47664e m=0 sigcode=128

goroutine 0 gp=0x6e6020 m=0 mp=0x6e6ec0 [idle]:
internal/runtime/syscall.Syscall6()
	/home/michael/sdk/go1.23.0/src/internal/runtime/syscall/asm_linux_amd64.s:36 +0xe fp=0x7ffc58665090 sp=0x7ffc58665088 pc=0x47664e
internal/runtime/syscall.EpollWait(0x586651e0?, {0x7ffc5866511c?, 0x3000000018?, 0x7ffc586651f0?}, 0x58665110?, 0x7ffc?)
	/home/michael/sdk/go1.23.0/src/internal/runtime/syscall/syscall_linux.go:32 +0x45 fp=0x7ffc586650e0 sp=0x7ffc58665090 pc=0x4765e5
runtime.netpoll(0xc0000000c0?)
	/home/michael/sdk/go1.23.0/src/runtime/netpoll_epoll.go:116 +0xd2 fp=0x7ffc58665768 sp=0x7ffc586650e0 pc=0x432332
runtime.findRunnable()
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:3580 +0x8c5 fp=0x7ffc586658e0 sp=0x7ffc58665768 pc=0x43f045
runtime.schedule()
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:3995 +0xb1 fp=0x7ffc58665918 sp=0x7ffc586658e0 pc=0x4405b1
runtime.park_m(0xc0000061c0)
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:4102 +0x1eb fp=0x7ffc58665970 sp=0x7ffc58665918 pc=0x4409cb
runtime.mcall()
	/home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:459 +0x4e fp=0x7ffc58665988 sp=0x7ffc58665970 pc=0x470e2e

goroutine 1 gp=0xc0000061c0 m=nil [IO wait]:
runtime.gopark(0x452658?, 0x0?, 0x98?, 0xb3?, 0xb?)
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:424 +0xce fp=0xc0000eb358 sp=0xc0000eb338 pc=0x46bc0e
runtime.netpollblock(0x4a01b8?, 0x4058e6?, 0x0?)
	/home/michael/sdk/go1.23.0/src/runtime/netpoll.go:575 +0xf7 fp=0xc0000eb390 sp=0xc0000eb358 pc=0x4318f7
internal/poll.runtime_pollWait(0x7ef586628808, 0x72)
	/home/michael/sdk/go1.23.0/src/runtime/netpoll.go:351 +0x85 fp=0xc0000eb3b0 sp=0xc0000eb390 pc=0x46af05
internal/poll.(*pollDesc).wait(0xc0000ce180?, 0xc00020e99c?, 0x0)
	/home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0000eb3d8 sp=0xc0000eb3b0 pc=0x4b0ce7
internal/poll.(*pollDesc).waitRead(...)
	/home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0000ce180, {0xc00020e99c, 0x4, 0x4})
	/home/michael/sdk/go1.23.0/src/internal/poll/fd_unix.go:165 +0x27a fp=0xc0000eb470 sp=0xc0000eb3d8 pc=0x4b17da
net.(*netFD).Read(0xc0000ce180, {0xc00020e99c?, 0x6eeea0?, 0x1?})
	/home/michael/sdk/go1.23.0/src/net/fd_posix.go:55 +0x25 fp=0xc0000eb4b8 sp=0xc0000eb470 pc=0x4f7e85
net.(*conn).Read(0xc000206000, {0xc00020e99c?, 0xc000212000?, 0x6e6ec0?})
	/home/michael/sdk/go1.23.0/src/net/net.go:189 +0x45 fp=0xc0000eb500 sp=0xc0000eb4b8 pc=0x5001a5
net.(*TCPConn).Read(0x0?, {0xc00020e99c?, 0xc0000eb568?, 0x46d449?})
	<autogenerated>:1 +0x25 fp=0xc0000eb530 sp=0xc0000eb500 pc=0x50bb25
io.ReadAtLeast({0x5d9640, 0xc000206000}, {0xc00020e99c, 0x4, 0x4}, 0x4)
	/home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb578 sp=0xc0000eb530 pc=0x4957d0
io.ReadFull(...)
	/home/michael/sdk/go1.23.0/src/io/io.go:354
encoding/binary.Read({0x5d9640, 0xc000206000}, {0x5da8b0, 0x7059a0}, {0x55e7c0, 0xc0000eb6a0})
	/home/michael/sdk/go1.23.0/src/encoding/binary/binary.go:244 +0xa5 fp=0xc0000eb670 sp=0xc0000eb578 pc=0x5102a5
github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).ReadMsg(0xc00020a100)
	/home/michael/kr/rsync/internal/rsyncwire/wire.go:50 +0x48 fp=0xc0000eb6e8 sp=0xc0000eb670 pc=0x514428
github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).Read(0x7ef5869b9a68?, {0xc000280000, 0x40000, 0x4dd4fb?})
	/home/michael/kr/rsync/internal/rsyncwire/wire.go:72 +0x2f fp=0xc0000eb788 sp=0xc0000eb6e8 pc=0x5145af
bufio.(*Reader).Read(0xc0002020c0, {0xc00020e998, 0x4, 0x40ece5?})
	/home/michael/sdk/go1.23.0/src/bufio/bufio.go:241 +0x197 fp=0xc0000eb7c0 sp=0xc0000eb788 pc=0x4d5a57
io.ReadAtLeast({0x5d93e0, 0xc0002020c0}, {0xc00020e998, 0x4, 0x4}, 0x4)
	/home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb808 sp=0xc0000eb7c0 pc=0x4957d0
io.ReadFull(...)
	/home/michael/sdk/go1.23.0/src/io/io.go:354
github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32(0xc000208060)
	/home/michael/kr/rsync/internal/rsyncwire/wire.go:163 +0x4a fp=0xc0000eb850 sp=0xc0000eb808 pc=0x51490a
github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1(0xc000202120, 0x5a9b58)
	/home/michael/kr/rsync/internal/receiver/uidlist.go:16 +0x3d fp=0xc0000eb8c0 sp=0xc0000eb850 pc=0x51fc7d
github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList(0xc000202120)
	/home/michael/kr/rsync/internal/receiver/uidlist.go:52 +0x1dd fp=0xc0000eba08 sp=0xc0000eb8c0 pc=0x51ffbd
github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList(0xc000202120)
	/home/michael/kr/rsync/internal/receiver/flist.go:229 +0x378 fp=0xc0000ebb10 sp=0xc0000eba08 pc=0x51c5b8
github.com/gokrazy/rsync/internal/receivermaincmd.clientRun({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ef53d47efc8, 0xc000206000}, {0x7ffc5866600e, ...}, ...)
	/home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341 +0x5cd fp=0xc0000ebc10 sp=0xc0000ebb10 pc=0x550c2d
github.com/gokrazy/rsync/internal/receivermaincmd.socketClient({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ffc58665ff4?, 0x1?}, {0x7ffc5866600e, ...})
	/home/michael/kr/rsync/internal/receivermaincmd/clientserver.go:44 +0x425 fp=0xc0000ebcd0 sp=0xc0000ebc10 pc=0x54c205
github.com/gokrazy/rsync/internal/receivermaincmd.rsyncMain({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0xc00007e440, 0x1, 0x2}, ...)
	/home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:160 +0x5d7 fp=0xc0000ebdf0 sp=0xc0000ebcd0 pc=0x54f697
github.com/gokrazy/rsync/internal/receivermaincmd.Main({0xc0000160a0, 0x5, 0x5}, {0x5d9280?, 0xc000078058?}, {0x5d92a0?, 0xc000078060?}, {0x5d92a0?, 0xc000078068?})
	/home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:394 +0x272 fp=0xc0000ebee8 sp=0xc0000ebdf0 pc=0x5510d2
main.main()
	/home/michael/kr/rsync/cmd/gokr-rsync/rsync.go:12 +0x4e fp=0xc0000ebf50 sp=0xc0000ebee8 pc=0x5515ae
runtime.main()
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:272 +0x28b fp=0xc0000ebfe0 sp=0xc0000ebf50 pc=0x438d4b
runtime.goexit({})
	/home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ebfe8 sp=0xc0000ebfe0 pc=0x472e61

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:424 +0xce fp=0xc000074fa8 sp=0xc000074f88 pc=0x46bc0e
runtime.goparkunlock(...)
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:430
runtime.forcegchelper()
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:337 +0xb3 fp=0xc000074fe0 sp=0xc000074fa8 pc=0x439093
runtime.goexit({})
	/home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x472e61
created by runtime.init.7 in goroutine 1
	/home/michael/sdk/go1.23.0/src/runtime/proc.go:325 +0x1a

Phew! This output is pretty dense.

We can use the https://github.com/maruel/panicparse program to present this stack trace in a more colorful and much shorter version:

The functions helpfully highlighted in red are where the problem lies: My rsync receiver implementation was incorrectly expecting the server to send a uid/gid list, despite the PreserveUid and PreserveGid options not being enabled. Commit 6c89d4d fixes the issue.
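
SIGQUIT terminates the process after printing the stack trace. For long-running programs where exiting is undesirable, a similar goroutine dump can be produced on demand with runtime.Stack. The following is a minimal sketch, assuming a Unix-like system; the choice of SIGUSR1 and the helper name dumpStacks are illustrative and not taken from gokrazy/rsync. To keep the example self-terminating, the program sends the signal to itself:

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"runtime"
	"syscall"
)

// dumpStacks returns the stack traces of all goroutines.
// (Hypothetical helper name, chosen for this sketch.)
func dumpStacks() []byte {
	buf := make([]byte, 1<<20) // 1 MiB; larger dumps are truncated
	n := runtime.Stack(buf, true)
	return buf[:n]
}

func main() {
	// Register for SIGUSR1 before anything can send it.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)

	done := make(chan struct{})
	go func() {
		<-sigs
		// Unlike SIGQUIT, this prints the stacks and keeps running.
		os.Stderr.Write(dumpStacks())
		close(done)
	}()

	// For demonstration only: signal ourselves. In practice, one would
	// run "kill -USR1 <pid>" from another terminal.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR1); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	<-done
}
```

The output is less detailed than what the runtime prints on SIGQUIT, but it is usually enough to see which goroutine is blocked in which function.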

Tip 2: Attach the delve debugger to the process

If dumping the stack trace in the moment is not sufficient to diagnose the problem, you can go one step further and reach for an interactive debugger.

The most well-known Linux debugger is probably GDB, but when working with Go, I recommend using the delve debugger instead as it typically works better. Install delve if you haven’t already:

% go install github.com/go-delve/delve/cmd/dlv@latest

In this article, I am using delve v1.24.0.

Note: If you want to explore local variables, you should rebuild your program without optimizations and inlining (see the dlv exec docs):

% go install -gcflags=all="-N -l" ./cmd/...

While you can run a new child process in a debugger (use dlv exec) without any special permissions, attaching a debugger to an existing process is disabled by default in Linux for security reasons. We can allow this feature (remember to turn it off later!) using:

% sudo sysctl -w kernel.yama.ptrace_scope=0
kernel.yama.ptrace_scope = 0

…and then we can just dlv attach to the hanging gokr-rsync process:

% dlv attach $(pidof gokr-rsync)
Type 'help' for list of commands.
(dlv)

Great. But if we just print a stack trace, we only see functions from the runtime package:

(dlv) bt
0  0x000000000047bb83 in runtime.futex
   at /home/michael/sdk/go1.23.6/src/runtime/sys_linux_amd64.s:558
1  0x00000000004374d0 in runtime.futexsleep
   at /home/michael/sdk/go1.23.6/src/runtime/os_linux.go:69
2  0x000000000040d89d in runtime.notesleep
   at /home/michael/sdk/go1.23.6/src/runtime/lock_futex.go:170
3  0x000000000044123e in runtime.mPark
   at /home/michael/sdk/go1.23.6/src/runtime/proc.go:1866
4  0x000000000044290d in runtime.stopm
   at /home/michael/sdk/go1.23.6/src/runtime/proc.go:2886
5  0x00000000004433d0 in runtime.findRunnable
   at /home/michael/sdk/go1.23.6/src/runtime/proc.go:3623
6  0x0000000000444e1d in runtime.schedule
   at /home/michael/sdk/go1.23.6/src/runtime/proc.go:3996
7  0x00000000004451cb in runtime.park_m
   at /home/michael/sdk/go1.23.6/src/runtime/proc.go:4103
8  0x0000000000477eee in runtime.mcall
   at /home/michael/sdk/go1.23.6/src/runtime/asm_amd64.s:459

The reason is that no goroutine is running (the program is waiting indefinitely to receive data from the server), so we see one of the OS threads waiting in the Go scheduler.

We first need to switch to the goroutine we are interested in (grs prints all goroutines), and then the stack trace looks like what we expect:

(dlv) gr 1
Switched from 0 to 1 (thread 414327)
(dlv) bt
 0  0x0000000000474ebc in runtime.gopark
    at /home/michael/sdk/go1.23.6/src/runtime/proc.go:425
 1  0x000000000043819e in runtime.netpollblock
    at /home/michael/sdk/go1.23.6/src/runtime/netpoll.go:575
 2  0x000000000047435c in internal/poll.runtime_pollWait
    at /home/michael/sdk/go1.23.6/src/runtime/netpoll.go:351
 3  0x00000000004ed15a in internal/poll.(*pollDesc).wait
    at /home/michael/sdk/go1.23.6/src/internal/poll/fd_poll_runtime.go:84
 4  0x00000000004ed1f1 in internal/poll.(*pollDesc).waitRead
    at /home/michael/sdk/go1.23.6/src/internal/poll/fd_poll_runtime.go:89
 5  0x00000000004ee351 in internal/poll.(*FD).Read
    at /home/michael/sdk/go1.23.6/src/internal/poll/fd_unix.go:165
 6  0x0000000000569bb3 in net.(*netFD).Read
    at /home/michael/sdk/go1.23.6/src/net/fd_posix.go:55
 7  0x000000000057a025 in net.(*conn).Read
    at /home/michael/sdk/go1.23.6/src/net/net.go:189
 8  0x000000000058fcc5 in net.(*TCPConn).Read
    at <autogenerated>:1
 9  0x00000000004b72e8 in io.ReadAtLeast
    at /home/michael/sdk/go1.23.6/src/io/io.go:335
10  0x00000000004b74d3 in io.ReadFull
    at /home/michael/sdk/go1.23.6/src/io/io.go:354
11  0x0000000000598d5f in encoding/binary.Read
    at /home/michael/sdk/go1.23.6/src/encoding/binary/binary.go:244
12  0x00000000005a0b7a in github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).ReadMsg
    at /home/michael/kr/rsync/internal/rsyncwire/wire.go:50
13  0x00000000005a0f17 in github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).Read
    at /home/michael/kr/rsync/internal/rsyncwire/wire.go:72
14  0x0000000000528de8 in bufio.(*Reader).Read
    at /home/michael/sdk/go1.23.6/src/bufio/bufio.go:241
15  0x00000000004b72e8 in io.ReadAtLeast
    at /home/michael/sdk/go1.23.6/src/io/io.go:335
16  0x00000000004b74d3 in io.ReadFull
    at /home/michael/sdk/go1.23.6/src/io/io.go:354
17  0x00000000005a19ef in github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32
    at /home/michael/kr/rsync/internal/rsyncwire/wire.go:163
18  0x00000000005b77d2 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1
    at /home/michael/kr/rsync/internal/receiver/uidlist.go:16
19  0x00000000005b7ea8 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList
    at /home/michael/kr/rsync/internal/receiver/uidlist.go:52
20  0x00000000005b18db in github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList
    at /home/michael/kr/rsync/internal/receiver/flist.go:229
21  0x0000000000605390 in github.com/gokrazy/rsync/internal/receivermaincmd.clientRun
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341
22  0x00000000005fe572 in github.com/gokrazy/rsync/internal/receivermaincmd.socketClient
    at /home/michael/kr/rsync/internal/receivermaincmd/clientserver.go:44
23  0x0000000000602f10 in github.com/gokrazy/rsync/internal/receivermaincmd.rsyncMain
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:160
24  0x0000000000605e7e in github.com/gokrazy/rsync/internal/receivermaincmd.Main
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:394
25  0x0000000000606653 in main.main
    at /home/michael/kr/rsync/cmd/gokr-rsync/rsync.go:12
26  0x000000000043fa47 in runtime.main
    at /home/michael/sdk/go1.23.6/src/runtime/proc.go:272
27  0x000000000047bd01 in runtime.goexit
    at /home/michael/sdk/go1.23.6/src/runtime/asm_amd64.s:1700

Tip 3: Save a core dump for later

If you don’t have time to poke around in the debugger now, you can save a core dump for later.

In addition to printing the stack trace on SIGQUIT, we can make the Go runtime crash the program, which in turn makes the Linux kernel write a core dump, by running our program with the environment variable GOTRACEBACK=crash.

Modern Linux systems typically include systemd-coredump(8) (but you might need to explicitly install it, for example on Ubuntu) to collect core dumps (and remove old ones). You can use coredumpctl(1) to list and work with them. On macOS, collecting cores is more involved. I don’t know about Windows.

In case your Linux system does not use systemd-coredump, you can use ulimit -c unlimited and set the kernel’s kernel.core_pattern sysctl setting. You can find more details and options in the CoreDumpDebugging page of the Go wiki. For this article, we will stick to coredumpctl:

% GOTRACEBACK=crash gokr-rsync -rtO --delete rsync://rsync.paas.rpki.ripe.net/repo/ /tmp/rpki-repo
[…]
^\SIGQUIT: quit
[…]
zsh: IOT instruction (core dumped)  GOTRACEBACK=crash gokr-rsync -rtO […]

The last line is what we want to see: it should say “core dumped”.
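
If setting an environment variable is inconvenient (for example, for a program started by a service manager), the same traceback level can also be requested from within the program via debug.SetTraceback from the runtime/debug package, which accepts the same values as GOTRACEBACK. The following is a minimal sketch; the helper chooseTraceback and its policy (an explicitly set GOTRACEBACK wins, otherwise default to crash) are assumptions for illustration, not something gokr-rsync does:

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
)

// chooseTraceback returns the traceback level to apply: an explicitly
// set GOTRACEBACK value wins, otherwise we default to "crash".
// (Hypothetical helper, chosen for this sketch.)
func chooseTraceback(env string) string {
	if env != "" {
		return env
	}
	return "crash"
}

func main() {
	level := chooseTraceback(os.Getenv("GOTRACEBACK"))
	// With level "crash", the runtime should abort after printing the
	// traceback, which lets the kernel write a core dump.
	debug.SetTraceback(level)
	fmt.Printf("traceback level: %s\n", level)
}
```

Note that SetTraceback cannot lower the level below what the environment variable specifies, so the environment variable remains the more flexible way to control this from the outside.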

This core should now show up in coredumpctl(1):

% coredumpctl info
           PID: 414607 (gokr-rsync)
           UID: 1000 (michael)
           GID: 1000 (michael)
        Signal: 6 (ABRT)
     Timestamp: Sat 2025-02-08 10:18:27 CET (12s ago)
  Command Line: gokr-rsync -rtO --delete rsync://rsync.paas.rpki.ripe.net/repo/ /tmp/rpki-repo
    Executable: /bin/gokr-rsync
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (michael)
       Boot ID: 6158dd3b52af4b8384c103a8a336fc02
    Machine ID: ecb5a44f1a5846ad871566e113bf8937
      Hostname: midna
       Storage: /var/lib/systemd/coredump/core.gokr-rsync.1000.6158dd3b52af4b8384c103a8a336fc02.414607.1739006307000000.zst (present)
  Size on Disk: 158.3K
       Message: Process 414607 (gokr-rsync) of user 1000 dumped core.
                
    Module [dso] without build-id.
    Module [dso]
    Stack trace of thread 1604447:
    #0  0x0000000000475a41 runtime.raise.abi0 (/bin/gokr-rsync + 0x75a41)
    #1  0x0000000000451d85 runtime.dieFromSignal (/bin/gokr-rsync + 0x51d85)
    #2  0x00000000004522e6 runtime.sigfwdgo (/bin/gokr-rsync + 0x522e6)
    #3  0x0000000000450c45 runtime.sigtrampgo (/bin/gokr-rsync + 0x50c45)
    #4  0x0000000000475d26 runtime.sigtramp.abi0 (/bin/gokr-rsync + 0x75d26)
    #5  0x0000000000475e20 n/a (/bin/gokr-rsync + 0x75e20)
    ELF object binary architecture: AMD x86-64

If you see only hexadecimal addresses followed by n/a (n/a + 0x0), that means systemd-coredump could not symbolize (= resolve addresses to function names) your core dump. Here are a few possible reasons for missing symbolization:

Linux 6.12 and 6.13 produced core dumps that elfutils cannot symbolize. systemd-coredump uses elfutils for symbolization, so avoid 6.12/6.13 in favor of using 6.14 or newer.
With systemd v234-v256, systemd-coredump did not have permission to look into programs living in the /home directory (fixed with commit 4ac1755 in systemd v257+).
- Similarly, systemd-coredump runs with PrivateTmp=yes, meaning it won’t be able to access programs you place in /tmp.
Go builds with debug symbols by default, but maybe you are explicitly stripping debug symbols in your build, by building with -ldflags=-w?

We can now use coredumpctl(1) to launch delve for this program + core dump:

% coredumpctl debug --debugger=dlv --debugger-arguments=core
[…]
Type 'help' for list of commands.
(dlv) gr 1
Switched from 0 to 1 (thread 414607)
(dlv) bt
[…]
16  0x00000000004b74d3 in io.ReadFull
    at /home/michael/sdk/go1.23.6/src/io/io.go:354
17  0x00000000005a19ef in github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32
    at /home/michael/kr/rsync/internal/rsyncwire/wire.go:163
18  0x00000000005b77d2 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1
    at /home/michael/kr/rsync/internal/receiver/uidlist.go:16
19  0x00000000005b7ea8 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList
    at /home/michael/kr/rsync/internal/receiver/uidlist.go:52
20  0x00000000005b18db in github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList
    at /home/michael/kr/rsync/internal/receiver/flist.go:229
21  0x0000000000605390 in github.com/gokrazy/rsync/internal/receivermaincmd.clientRun
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341
22  0x00000000005fe572 in github.com/gokrazy/rsync/internal/receivermaincmd.socketClient
    at /home/michael/kr/rsync/internal/receivermaincmd/clientserver.go:44
23  0x0000000000602f10 in github.com/gokrazy/rsync/internal/receivermaincmd.rsyncMain
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:160
24  0x0000000000605e7e in github.com/gokrazy/rsync/internal/receivermaincmd.Main
    at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:394
25  0x0000000000606653 in main.main
    at /home/michael/kr/rsync/cmd/gokr-rsync/rsync.go:12
26  0x000000000043fa47 in runtime.main
    at /home/michael/sdk/go1.23.6/src/runtime/proc.go:272
27  0x000000000047bd01 in runtime.goexit
    at /home/michael/sdk/go1.23.6/src/runtime/asm_amd64.s:1700

Conclusion

In my experience, in the medium to long term, it always pays off to set up your environment such that you can debug your programs conveniently. I strongly encourage every programmer (and even users!) to invest time into their development and debugging setup.

Luckily, Go comes with stack printing functionality by default (just press Ctrl+\) and we can easily get a core dump out of our Go programs by running them with GOTRACEBACK=crash — provided the system is set up to collect core dumps.

Together with the delve debugger, this gives us all we need to effectively and efficiently diagnose problems in Go programs.

[Protocol Buffers (Protobuf) is Google’s language-neutral data interchange format. See protobuf.dev.]

Back in March 2020, we released a major overhaul of the Go Protobuf API. The google.golang.org/protobuf package introduced first-class support for reflection, a dynamicpb implementation and the protocmp package for easier testing.

That release introduced a new protobuf module with a new API. Today, we are releasing an additional API for generated code, meaning the Go code in the .pb.go files created by the protocol compiler (protoc). This blog post explains our motivation for creating a new API and shows you how to use it in your projects.

To be clear: We are not removing anything. We will continue to support the existing API for generated code, just like we still support the older protobuf module (by wrapping the google.golang.org/protobuf implementation). Go is committed to backwards compatibility and this applies to Go Protobuf, too!

Background: the (existing) Open Struct API

We now call the existing API the Open Struct API, because generated struct types are open to direct access. In the next section, we will see how it differs from the new Opaque API.

To work with protocol buffers, you first create a .proto definition file like this one:

edition = "2023";  // successor to proto2 and proto3

package log;

message LogEntry {
  string backend_server = 1;
  uint32 request_size = 2;
  string ip_address = 3;
}

Then, you run the protocol compiler (protoc) to generate code like the following (in a .pb.go file):

package logpb

type LogEntry struct {
  BackendServer *string
  RequestSize   *uint32
  IPAddress     *string
  // …internal fields elided…
}

func (l *LogEntry) GetBackendServer() string { … }
func (l *LogEntry) GetRequestSize() uint32   { … }
func (l *LogEntry) GetIPAddress() string     { … }

Now you can import the generated logpb package from your Go code and call functions like proto.Marshal to encode logpb.LogEntry messages into protobuf wire format.

You can find more details in the Generated Code API documentation.

(Existing) Open Struct API: Field Presence

An important aspect of this generated code is how field presence (whether a field is set or not) is modeled. For instance, the above example models presence using pointers, so you could set the BackendServer field to:

proto.String("zrh01.prod"): the field is set and contains “zrh01.prod”
proto.String(""): the field is set (non-nil pointer) but contains an empty value
nil pointer: the field is not set
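The three cases can be told apart in plain Go. This is a small sketch using a hand-written stand-in struct instead of the real generated logpb package; the names logEntry, strPtr and presence are assumptions made for this illustration (strPtr plays the role of proto.String).

```go
package main

import "fmt"

// logEntry is a hand-written stand-in for the generated logpb.LogEntry.
type logEntry struct {
	BackendServer *string
}

// strPtr returns a pointer to a copy of s, like proto.String does.
func strPtr(s string) *string { return &s }

// presence reports which of the three cases a pointer-typed field is in.
func presence(p *string) string {
	switch {
	case p == nil:
		return "not set"
	case *p == "":
		return "set, but empty"
	default:
		return "set to " + *p
	}
}

func main() {
	var e logEntry
	fmt.Println(presence(e.BackendServer)) // nil pointer

	e.BackendServer = strPtr("")
	fmt.Println(presence(e.BackendServer)) // non-nil pointer to empty string

	e.BackendServer = strPtr("zrh01.prod")
	fmt.Println(presence(e.BackendServer)) // non-nil pointer to a value
}
```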

If you are used to generated code not having pointers, you are probably using .proto files that start with syntax = "proto3". The field presence behavior changed over the years:

syntax = "proto2" uses explicit presence by default
syntax = "proto3" used implicit presence by default (where cases 2 and 3 cannot be distinguished and are both represented by an empty string), but was later extended to allow opting into explicit presence with the optional keyword
edition = "2023", the successor to both proto2 and proto3, uses explicit presence by default

The new Opaque API

We created the new Opaque API to uncouple the Generated Code API from the underlying in-memory representation. The (existing) Open Struct API has no such separation: it allows programs direct access to the protobuf message memory. For example, one could use the flag package to parse command-line flag values into protobuf message fields:

var req logpb.LogEntry
// BackendServer is a *string, so it can point directly at the flag variable.
req.BackendServer = flag.String("backend", os.Getenv("HOST"), "…")
flag.Parse() // fills the BackendServer field from -backend flag

The problem with such a tight coupling is that we can never change how we lay out protobuf messages in memory. Lifting this restriction enables many implementation improvements, which we’ll see below.

What changes with the new Opaque API? Here is how the generated code from the above example would change:

package logpb

type LogEntry struct {
  xxx_hidden_BackendServer *string // no longer exported
  xxx_hidden_RequestSize   uint32  // no longer exported
  xxx_hidden_IPAddress     *string // no longer exported
  // …internal fields elided…
}

func (l *LogEntry) GetBackendServer() string { … }
func (l *LogEntry) HasBackendServer() bool   { … }
func (l *LogEntry) SetBackendServer(string)  { … }
func (l *LogEntry) ClearBackendServer()      { … }
// …

With the Opaque API, the struct fields are hidden and can no longer be directly accessed. Instead, the new accessor methods allow for getting, setting, or clearing a field.
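To make the accessor semantics concrete, here is a hand-written approximation of what such accessors could look like for one field. This is only a sketch for illustration: the real generated code is produced by protoc and its internals differ.

```go
package main

import "fmt"

// LogEntry approximates an opaque message with a single hidden field.
type LogEntry struct {
	xxx_hidden_BackendServer *string // not accessible from other packages
}

// GetBackendServer returns the field value, or "" if the field is not set.
func (l *LogEntry) GetBackendServer() string {
	if l == nil || l.xxx_hidden_BackendServer == nil {
		return ""
	}
	return *l.xxx_hidden_BackendServer
}

// HasBackendServer reports whether the field is set (even if set to "").
func (l *LogEntry) HasBackendServer() bool {
	return l != nil && l.xxx_hidden_BackendServer != nil
}

// SetBackendServer stores a copy of v, so callers never share memory.
func (l *LogEntry) SetBackendServer(v string) { l.xxx_hidden_BackendServer = &v }

// ClearBackendServer resets the field to the "not set" state.
func (l *LogEntry) ClearBackendServer() { l.xxx_hidden_BackendServer = nil }

func main() {
	e := &LogEntry{}
	fmt.Println(e.HasBackendServer(), e.GetBackendServer())
	e.SetBackendServer("zrh01.prod")
	fmt.Println(e.HasBackendServer(), e.GetBackendServer())
	e.ClearBackendServer()
	fmt.Println(e.HasBackendServer(), e.GetBackendServer())
}
```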

Opaque structs use less memory

One change we made to the memory layout is to model field presence for elementary fields more efficiently:

The (existing) Open Struct API uses pointers, which adds a 64-bit word to the space cost of the field.
The Opaque API uses bit fields, which require one bit per field (ignoring padding overhead).

Using fewer variables and pointers also lowers load on the allocator and on the garbage collector.
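The idea of tracking presence in a bit field can be sketched in a few lines of Go. This is a simplified illustration, not the layout the generated code actually uses; the struct name entry and the second field retryCount are made up for this example.

```go
package main

import "fmt"

// One presence bit per elementary field.
const (
	requestSizeBit uint32 = 1 << iota
	retryCountBit
)

// entry stores its values inline and records presence in a bit field,
// instead of using one pointer per field.
type entry struct {
	present     uint32
	requestSize uint32
	retryCount  uint32 // hypothetical second field
}

func (e *entry) HasRequestSize() bool { return e.present&requestSizeBit != 0 }

func (e *entry) GetRequestSize() uint32 { return e.requestSize }

func (e *entry) SetRequestSize(v uint32) {
	e.requestSize = v
	e.present |= requestSizeBit
}

func (e *entry) ClearRequestSize() {
	e.requestSize = 0
	e.present &^= requestSizeBit
}

func main() {
	var e entry
	fmt.Println(e.HasRequestSize()) // not set yet
	e.SetRequestSize(0)
	fmt.Println(e.HasRequestSize()) // set to zero is still "set"
	e.ClearRequestSize()
	fmt.Println(e.HasRequestSize())
}
```

Explicit presence is preserved (a field set to zero is distinguishable from an unset field), but no separate heap variable is needed per field.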

The performance improvement depends heavily on the shapes of your protocol messages: The change only affects elementary fields like integers, bools, enums, and floats, but not strings, repeated fields, or submessages (because it is less profitable for those types).

Our benchmark results show that messages with few elementary fields exhibit performance that is as good as before, whereas messages with more elementary fields are decoded with significantly fewer allocations:

             │ Open Struct API │             Opaque API             │
             │    allocs/op    │  allocs/op   vs base               │
Prod#1          360.3k ± 0%       360.3k ± 0%  +0.00% (p=0.002 n=6)
Search#1       1413.7k ± 0%       762.3k ± 0%  -46.08% (p=0.002 n=6)
Search#2        314.8k ± 0%       132.4k ± 0%  -57.95% (p=0.002 n=6)

Reducing allocations also makes decoding protobuf messages more efficient:

             │ Open Struct API │             Opaque API            │
             │   user-sec/op   │ user-sec/op  vs base              │
Prod#1         55.55m ± 6%        55.28m ± 4%  ~ (p=0.180 n=6)
Search#1       324.3m ± 22%       292.0m ± 6%  -9.97% (p=0.015 n=6)
Search#2       67.53m ± 10%       45.04m ± 8%  -33.29% (p=0.002 n=6)

(All measurements done on an AMD Castle Peak Zen 2. Results on ARM and Intel CPUs are similar.)

Note: proto3 with implicit presence similarly does not use pointers, so you will not see a performance improvement if you are coming from proto3. If you were using implicit presence for performance reasons, forgoing the convenience of being able to distinguish empty fields from unset ones, then the Opaque API now makes it possible to use explicit presence without a performance penalty.

Motivation: Lazy Decoding

Lazy decoding is a performance optimization where the contents of a submessage are decoded when first accessed instead of during proto.Unmarshal. Lazy decoding can improve performance by avoiding unnecessarily decoding fields which are never accessed.

Lazy decoding can’t be supported safely by the (existing) Open Struct API. While the Open Struct API provides getters, leaving the (un-decoded) struct fields exposed would be extremely error-prone. To ensure that the decoding logic runs immediately before the field is first accessed, we must make the field private and mediate all accesses to it through getter and setter functions.

This approach made it possible to implement lazy decoding with the Opaque API. Of course, not every workload will benefit from this optimization, but for those that do benefit, the results can be spectacular: We have seen logs analysis pipelines that discard messages based on a top-level message condition (e.g. whether backend_server is one of the machines running a new Linux kernel version) and can skip decoding deeply nested subtrees of messages.

As an example, here are the results of the micro-benchmark we included, demonstrating how lazy decoding saves over 50% of the work and over 87% of allocations!

                  │   nolazy    │                lazy                │
                  │   sec/op    │   sec/op     vs base               │
Unmarshal/lazy-24   6.742µ ± 0%   2.816µ ± 0%  -58.23% (p=0.002 n=6)

                  │    nolazy    │                lazy                 │
                  │     B/op     │     B/op      vs base               │
Unmarshal/lazy-24   3.666Ki ± 0%   1.814Ki ± 0%  -50.51% (p=0.002 n=6)

                  │   nolazy    │               lazy                │
                  │  allocs/op  │ allocs/op   vs base               │
Unmarshal/lazy-24   64.000 ± 0%   8.000 ± 0%  -87.50% (p=0.002 n=6)
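To illustrate why lazy decoding needs hidden fields, here is a toy sketch of the mechanism. It does not use real protobuf wire format (the "encoding" is just a comma-separated string), and all names are assumptions made for this example. The point is only that the getter is the single place where decoding can be triggered.

```go
package main

import (
	"fmt"
	"strings"
)

// lazyDetails is a toy stand-in for a lazily decoded submessage.
type lazyDetails struct {
	raw     string   // undecoded contents, kept as received
	decoded bool     // whether tags has been populated
	tags    []string // only valid once decoded is true
	decodes int      // counts decoding runs, for demonstration only
}

// GetTags decodes the raw contents on first access and caches the result.
// This sketch is not safe for concurrent use.
func (d *lazyDetails) GetTags() []string {
	if !d.decoded {
		d.tags = strings.Split(d.raw, ",")
		d.decoded = true
		d.decodes++
	}
	return d.tags
}

func main() {
	d := &lazyDetails{raw: "linux,6.14,prod"}
	fmt.Println("decodes before access:", d.decodes)
	fmt.Println(d.GetTags())
	fmt.Println(d.GetTags())
	fmt.Println("decodes after two accesses:", d.decodes)
}
```

If the tags field were exported, code could read it before GetTags ever ran and silently see an empty slice, which is the kind of error the hidden fields rule out.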

Motivation: reduce pointer comparison mistakes

Modeling field presence with pointers invites pointer-related bugs.

Consider an enum, declared within the LogEntry message:

message LogEntry {
  enum DeviceType {
    DESKTOP = 0;
    MOBILE = 1;
    VR = 2;
  };
  DeviceType device_type = 1;
}

A simple mistake is to compare the device_type enum field like so:

if cv.DeviceType == logpb.LogEntry_DESKTOP.Enum() { // incorrect!

Did you spot the bug? The condition compares the memory address instead of the value. Because the Enum() accessor allocates a new variable on each call, the condition can never be true. The check should have read:

if cv.GetDeviceType() == logpb.LogEntry_DESKTOP {

The new Opaque API prevents this mistake: Because fields are hidden, all access must go through the getter.
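The difference between the two comparisons can be reproduced without protobuf. The following sketch uses hand-written stand-ins for the generated types (DeviceType, logEntry and the two compare functions are names chosen for this illustration):

```go
package main

import "fmt"

// DeviceType stands in for the generated enum type.
type DeviceType int32

const (
	Desktop DeviceType = 0
	Mobile  DeviceType = 1
)

// Enum returns a pointer to a fresh copy of d, like the generated Enum().
func (d DeviceType) Enum() *DeviceType {
	p := new(DeviceType)
	*p = d
	return p
}

// logEntry stands in for the generated message with a pointer-typed field.
type logEntry struct {
	DeviceType *DeviceType
}

// GetDeviceType returns the value, or the zero value if the field is unset.
func (l *logEntry) GetDeviceType() DeviceType {
	if l == nil || l.DeviceType == nil {
		return Desktop
	}
	return *l.DeviceType
}

// comparePointers reproduces the mistake: it compares memory addresses.
func comparePointers(cv *logEntry) bool { return cv.DeviceType == Desktop.Enum() }

// compareValues is the corrected check: it compares enum values.
func compareValues(cv *logEntry) bool { return cv.GetDeviceType() == Desktop }

func main() {
	cv := &logEntry{DeviceType: Desktop.Enum()}
	fmt.Println("pointer comparison:", comparePointers(cv))
	fmt.Println("value comparison:  ", compareValues(cv))
}
```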

Motivation: reduce accidental sharing mistakes

Let’s consider a slightly more involved pointer-related bug. Assume you are trying to stabilize an RPC service that fails under high load. The following part of the request middleware looks correct, but still the entire service goes down whenever just one customer sends a high volume of requests:

logEntry.IPAddress = req.IPAddress
logEntry.BackendServer = proto.String(hostname)
// The redactIP() function redacts IPAddress to 127.0.0.1,
// unexpectedly not just in logEntry *but also* in req!
go auditlog(redactIP(logEntry))
if quotaExceeded(req) {
	// BUG: All requests end up here, regardless of their source.
	return fmt.Errorf("server overloaded")
}

Did you spot the bug? The first line accidentally copied the pointer (thereby sharing the pointed-to variable between the logEntry and req messages) instead of its value. It should have read:

logEntry.IPAddress = proto.String(req.GetIPAddress())

The new Opaque API prevents this problem as the setter takes a value (string) instead of a pointer:

logEntry.SetIPAddress(req.GetIPAddress())
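Here is a self-contained sketch of the sharing problem and of the setter-based fix. The message type and this simplified redactIP are hand-written stand-ins (the real redactIP from the example is not shown in this article), so treat the details as assumptions.

```go
package main

import "fmt"

// message stands in for both the request and the log entry type.
type message struct {
	IPAddress *string
}

func (m *message) GetIPAddress() string {
	if m == nil || m.IPAddress == nil {
		return ""
	}
	return *m.IPAddress
}

// SetIPAddress stores a copy of v, as an Opaque API setter would.
func (m *message) SetIPAddress(v string) { m.IPAddress = &v }

// redactIP overwrites the address in place (simplified stand-in).
func redactIP(m *message) {
	if m.IPAddress != nil {
		*m.IPAddress = "127.0.0.1"
	}
}

func main() {
	ip := "192.0.2.7"
	req := &message{IPAddress: &ip}

	// Buggy: copies the pointer, so both messages share one variable.
	shared := &message{IPAddress: req.IPAddress}
	redactIP(shared)
	fmt.Println("after pointer copy, req sees:", req.GetIPAddress())

	// Fixed: the setter takes a value, so the request stays untouched.
	req.SetIPAddress("192.0.2.7")
	copied := &message{}
	copied.SetIPAddress(req.GetIPAddress())
	redactIP(copied)
	fmt.Println("after value copy, req sees:  ", req.GetIPAddress())
}
```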

Motivation: Fix Sharp Edges: reflection

To write code that works not only with a specific message type (e.g. logpb.LogEntry), but with any message type, one needs some kind of reflection. The previous example used a function to redact IP addresses. To work with any type of message, it could have been defined as func redactIP(proto.Message) proto.Message { … }.

Many years ago, your only option to implement a function like redactIP was to reach for Go’s reflect package, which resulted in very tight coupling: you had only the generator output and had to reverse-engineer what the input protobuf message definition might have looked like. The google.golang.org/protobuf module release (from March 2020) introduced Protobuf reflection, which should always be preferred: Go’s reflect package traverses the data structure’s representation, which should be an implementation detail. Protobuf reflection traverses the logical tree of protocol messages without regard to its representation.

Unfortunately, merely providing protobuf reflection is not sufficient and still leaves some sharp edges exposed: In some cases, users might accidentally use Go reflection instead of protobuf reflection.

For example, encoding a protobuf message with the encoding/json package (which uses Go reflection) was technically possible, but the result is not canonical Protobuf JSON encoding. Use the protojson package instead.

The new Opaque API prevents this problem because the message struct fields are hidden: accidental usage of Go reflection will see an empty message. This is clear enough to steer developers towards protobuf reflection.
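The effect of hidden fields on Go reflection is easy to observe with encoding/json from the standard library. In this sketch, openEntry and opaqueEntry are hand-written stand-ins for generated messages, and marshal is a small helper introduced for the example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openEntry mimics the Open Struct API: the field is exported.
type openEntry struct {
	BackendServer *string
}

// opaqueEntry mimics the Opaque API: the field is hidden.
type opaqueEntry struct {
	backendServer *string
}

// marshal encodes v with encoding/json, which is based on Go reflection.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	host := "zrh01.prod"
	// Go reflection sees the exported field, but the output is not
	// canonical Protobuf JSON.
	fmt.Println(marshal(openEntry{BackendServer: &host}))
	// Go reflection sees no exported fields at all.
	fmt.Println(marshal(opaqueEntry{backendServer: &host}))
}
```

The second call yields an empty JSON object, which is a far more obvious signal than output that merely looks plausible.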

Motivation: Making the ideal memory layout possible

The benchmark results from the “Opaque structs use less memory” section have already shown that protobuf performance heavily depends on the specific usage: How are the messages defined? Which fields are set?

To keep Go Protobuf as fast as possible for everyone, we cannot implement optimizations that help only one program, but hurt the performance of other programs.

The Go compiler used to be in a similar situation, up until Go 1.20 introduced Profile-Guided Optimization (PGO). By recording the production behavior (through profiling) and feeding that profile back to the compiler, we allow the compiler to make better trade-offs for a specific program or workload.

We think using profiles to optimize for specific workloads is a promising approach for further Go Protobuf optimizations. The Opaque API makes those possible: Program code uses accessors and does not need to be updated when the memory representation changes, so we could, for example, move rarely set fields into an overflow struct.

Migration

You can migrate on your own schedule, or even not at all—the (existing) Open Struct API will not be removed. But, if you’re not on the new Opaque API, you won’t benefit from its improved performance, or future optimizations that target it.

We recommend you select the Opaque API for new development. Protobuf Edition 2024 (see Protobuf Editions Overview if you are not yet familiar) will make the Opaque API the default.

The Hybrid API

Aside from the Open Struct API and Opaque API, there is also the Hybrid API, which keeps existing code working by keeping struct fields exported, but also enabling migration to the Opaque API by adding the new accessor methods.

With the Hybrid API, the protobuf compiler will generate code on two API levels: the .pb.go is on the Hybrid API, whereas the _protoopaque.pb.go version is on the Opaque API and can be selected by building with the protoopaque build tag.

Rewriting Code to the Opaque API

See the migration guide for detailed instructions. The high-level steps are:

Enable the Hybrid API.
Update existing code using the open2opaque migration tool.
Switch to the Opaque API.

Advice for published generated code: Use Hybrid API

Small usages of protobuf can live entirely within the same repository, but usually, .proto files are shared between different projects that are owned by different teams. An obvious example is when different companies are involved: To call Google APIs (with protobuf), use the Google Cloud Client Libraries for Go from your project. Switching the Cloud Client Libraries to the Opaque API is not an option, as that would be a breaking API change, but switching to the Hybrid API is safe.

Our advice for such packages that publish generated code (.pb.go files) is to switch to the Hybrid API and publish both the .pb.go and the _protoopaque.pb.go files. The protoopaque version allows your consumers to migrate on their own schedule.

Enabling Lazy Decoding

Lazy decoding is available (but not enabled) once you migrate to the Opaque API! 🎉

To enable: in your .proto file, annotate your message-typed fields with the [lazy = true] annotation.

To opt out of lazy decoding (despite .proto annotations), the protolazy package documentation describes the available opt-outs, which affect either an individual Unmarshal operation or the entire program.

Next Steps

By using the open2opaque tool in an automated fashion over the last few years, we have converted the vast majority of Google’s .proto files and Go code to the Opaque API. We continuously improved the Opaque API implementation as we moved more and more production workloads to it.

Therefore, we expect you should not encounter problems when trying the Opaque API. In case you do encounter any issues after all, please let us know on the Go Protobuf issue tracker.

Reference documentation for Go Protobuf can be found on protobuf.dev → Go Reference.

A year ago, I got a solar panel for my balcony — an easy way to vote with your wallet to convert more of the world’s energy usage to solar power. That was a great decision and I would recommend everyone get a solar panel (or two)!

It’s called plug-in solar panel because you can just plug it in

In my experience, many people are surprised about the basics of how power works: You do not need to connect devices to a battery in order to enjoy solar power. You can just plug the solar panel into your household electricity setup. Any of your consumers (like a TV, or electric cooktop) will now use the power that your solar panel produces before consuming power from the grid.

Here’s the panel I have (Weber barbecue for scale). As you can see, the panel is not yet mounted at an angle, just hung over the balcony. The black box at the back of the panel is the inverter (“Wechselrichter”). You connect the panel on one side and get electricity out the other side.

Which solar panel to buy?

There are two big questions to answer when choosing a solar panel: what peak capacity should your panel(s) have and which company / seller do you buy from?

Regarding panel capacity: When I look at my energy usage, I see about 100 watts of baseline load. This includes always-on servers and other home automation devices. During working hours, running a PC and (power-hungry) monitor adds another 100 watts or so. Around noon, there is quite a spike in usage when cooking with my induction cooktop.

Hence, I figured a plug & play solar panel with the smallest size of 385 Wp would be well equipped to cover baseline usage, compared to the next bigger unit with 780 Wp, which seems oversized for my usage. Note that a peak capacity of 385 Wp will not necessarily mean that you will measure 385W of output. I did repeatedly measure energy production exceeding 300W.

Regarding the company, the best offer I found in Switzerland was a small company called erneuer.bar, which means “renewable” in German. They ship the panels with barely any packaging in fully electric vehicles and their offer is eligible for the topten bonus program from EWZ, meaning you’ll get back 200 CHF if you fill in a form.

The specific model I ordered was called “385 Wp Plug & Play Solar (DE)”. Here’s the bill:

Product	Price
385 Wp Plug & Play Solar (DE)	CHF 520.00
Mounting kit: balcony, 1 panel	CHF 75.00
Pre-mount: mounting kit balcony	CHF 60.00
WiFi measurement myStrom	CHF 55.00
Shipping	CHF 68.00
Total	CHF 778.00

Of course, you can save some money in various ways. For example, the measurement device and pre-mount option are both not required, but convenient. Similarly, you can probably find solar panels for cheaper, but the offer that erneuer.bar has put together truly is very convenient and works well, and to me that’s worth some money.

One mistake I made when ordering is selecting a 5m cable. It turned out I needed a 10m cable, so I recommend you measure better than I did (or just select the longer cable). On the plus side, customer service was excellent: I quickly received an email response and could just send back my cable in exchange for a new one.

Amortization? Who cares!

Many people seem to consider only the financial aspect of buying a solar panel and calculate when the solar panel will have paid for itself. I don’t care. My goal is to convert more energy usage to green energy, not to save money.

Similarly, some people install batteries so that they can use “their” energy for themselves, in case the solar panel produces more than they use at that moment. I couldn’t care less who uses the energy I produce — as long as it’s green energy, anyone is welcome to consume it.

(Of course I understand these questions become more important the larger a solar installation gets. But we’re talking about one balcony and one solar panel (or two) covering someone’s baseline residential household electricity load. Don’t overthink it!)

Requirement: balcony power socket

Aside from having a balcony, there is only one hard requirement: you need a power socket.

This requirement is either trivially fulfilled if you already have an outdoor power socket on your balcony (lucky you!), or might turn out to be the most involved part of the project. Either way, because an electrician needs to install power sockets, all you can do is get permission from your landlord and make an appointment with your electrician of choice.

In terms of cost, you will probably spend a few hundred bucks, depending on your area’s cost of living. A good idea that did not occur to me back then: Ask around in your house if any neighbors would be interested in getting a balcony power socket, too, and do it all in one go (for cheaper).

Bureaucracy

One can easily find stories online about electricity providers and landlords not permitting the installation of solar panels for… rather questionable reasons. For example, some claimed that solar panels could overload the house electricity infrastructure! A drastic-sounding claim, but nonsense in practice. Luckily, lawmakers are recognizing this and are removing barriers.

Electricity provider and the law

In Switzerland 🇨🇭, you can connect panels producing up to 600W without an electrician, but you need to notify your electricity provider.

In Germany 🇩🇪, you can connect panels producing up to 800W (as of May 16th 2024) without an electrician, but you need to register with the Bundesnetzagentur.

Be sure to check your country’s laws and your electricity provider’s rules and processes.

Landlord and neighbors

In Switzerland 🇨🇭, you need to ask your landlord for permission because if your solar panel were to fall down from the balcony, the landlord would be liable. Usually, the landlord insists on proper mounting and the tenant taking over liability. In my case, the landlord also asked me to ensure the neighbors wouldn’t mind. I put up a letter, nobody complained, the landlord accepted.

In Germany 🇩🇪, you do need to ask your landlord for permission, but the landlord pretty much has to agree (as of October 17th 2024). The question is not “if”, but “how” the landlord wants you to install the solar panel.

Optimizing the installation angle

Earlier I wrote that you can just hang the solar panel onto your balcony and plug it in. While this is true, there is one factor that is worth optimizing (as time permits): the installation angle.

If you want more details about the physics background and various considerations that go into chasing the optimal angle, check out these (German) articles about optimizing the installation angle (at Golem) or sizing solar installations (at Heise). I’ll summarize: the angle is important and can result in twice as much energy production! Any angle is usually better than no angle.

In my case, I first “installed” the solar panel (no angle) on 2023-09-30.

Then, about a month later, I installed it at an angle on 2023-10-28.

I unfortunately don’t have a great before/after graph because after I installed the proper angle mount, there were almost no sunny days.

Instead, I will show you data from a comparable time range (early October) in 2023 (before mounting the panel at an angle) and in 2024 (with a properly mounted panel). As you can see, the difference is not that huge, but clearly visible: without an angle mount, I could never exceed 300 Wh per day. With a proper mount, a number of days exceed 300 Wh:

	1st	2nd	3rd	4th	5th	6th	7th	8th	9th	10th
2023 🌞 Wh	133	268	262	208	271	255	274	277	275	194
2024 🌞 Wh	529	119	246	205	160	324	265	335	73	444

How much electricity does my panel generate?

The exact electricity production numbers depend on how much sun ends up on the solar panel. This in turn depends on the weather and how obstructed the solar panel is (neighbors, trees, …).

I like measuring things, so I will share some measurements to give you a rough idea. But note that measuring your solar panel is strictly optional.

On the best recorded day, my panel produced about 1.680 kWh of energy:

The missing parts before 14:00 are caused by the neighbor’s house blocking the sun.

Now, compare this best case with the worst case, a January day with little sun (< 50 Wh):

Let’s zoom out a bit and consider an entire year instead.

In 2024, the panel produced over 177 kWh so far, or, averaged to the daily value, ≈0.5 kWh/day:

Or, in numeric form (all numbers in kWh):

Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec
2.9	6.6	10.9	18.4	29.1	27.7	37.6	22.1	12.0	6.5	3.3	n/a
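For anyone who wants to double-check the total and the daily average, here is the arithmetic as a small Go program. It assumes the table covers January through the end of November 2024 (the exact measurement cut-off date is my assumption here), which makes 335 days in a leap year.

```go
package main

import "fmt"

// Monthly production in kWh for January through November 2024, copied from
// the table above (December was not available yet).
var monthlyKWh = []float64{2.9, 6.6, 10.9, 18.4, 29.1, 27.7, 37.6, 22.1, 12.0, 6.5, 3.3}

// Days per month for January through November; 2024 is a leap year.
var daysPerMonth = []int{31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30}

// yearToDate returns the total production in kWh and the days covered.
func yearToDate() (totalKWh float64, days int) {
	for i, kwh := range monthlyKWh {
		totalKWh += kwh
		days += daysPerMonth[i]
	}
	return totalKWh, days
}

func main() {
	total, days := yearToDate()
	fmt.Printf("total: %.1f kWh over %d days, average: %.2f kWh/day\n",
		total, days, total/float64(days))
}
```

The sum comes out at roughly 177.1 kWh over 335 days, or about 0.53 kWh per day, which matches the figures quoted above.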

Conclusion

A solar panel is a great project to make incremental progress on. It’s just 3 to 4 simple steps, each of which is valuable on its own:

Check with your landlord that installing an outdoor power socket and solar panel is okay.
- Even if you personally do not go any further with your project, you can share the result with your neighbors, who might…
Order an outdoor power socket from your (or your landlord’s) preferred electrician.
- Power will come in handy for lighting when spending summer evenings on the balcony.
Order a solar panel and plug it in.
Optional, but recommended: Optimize the mounting angle later.

That’s it! Come on, get started right away 🌞

Let’s say you created a Go program that stores data in PostgreSQL — you installed PostgreSQL, wrote the Go code, and everything works; great!

But after writing a test for your code, you wonder: how do you best provide PostgreSQL to your automated tests? Do you start a separate PostgreSQL in a Docker container, for example, or do you maybe reuse your development PostgreSQL instance?

I have come to like using ephemeral PostgreSQL instances for their many benefits:

Easier development setup: no need to configure a database, installation is enough.
I recommend installing PostgreSQL from your package manager, e.g. apt install postgresql (Debian) or brew install postgresql (macOS). No need for Docker :)
No risk of “works on my machine” (but nowhere else) problems: every test run starts with an empty database instance, so your test must set up the database correctly.
The same approach works locally and on CI systems like GitHub Actions.

In this article, I want to show how to integrate ephemeral PostgreSQL instances into your test setup. The examples are all specific to Go, but I expect that users of other programming languages and environments can benefit from some of these techniques as well.

Single-package tests

When you are in the very early stages of your project, you might start out with just a single test file (say, app_test.go), containing one or more test functions (say, TestSignupForm).

In this scenario, all tests will run in the same process. While it’s easy enough to write a few lines of code to start and stop PostgreSQL, I recommend reaching for an existing test helper package.

Throughout this article, I will be using the github.com/stapelberg/postgrestest package, which is based on Roxy Light’s postgrestest package but was extended to work well in the scenarios this article explains.

To start an ephemeral PostgreSQL instance before your test functions run, you would declare a custom TestMain function:

var pgt *postgrestest.Server

func TestMain(m *testing.M) {
	var err error
	pgt, err = postgrestest.Start(context.Background())
	if err != nil {
		panic(err)
	}
	defer pgt.Cleanup()

	m.Run()
}

Starting a PostgreSQL instance takes about:

300ms on my Intel Core i9 12900K CPU (from 2022)
800ms on my MacBook Air M1 (from 2020)

Then, you can create a separate database for each test on this ephemeral Postgres instance:

func TestSignupForm(t *testing.T) {
	pgurl, err := pgt.CreateDatabase(context.Background())
	if err != nil {
		t.Fatal(err)
	}
	// test goes here…
}

Each CreateDatabase call takes about:

5-10ms on my Intel Core i9 12900K CPU (from 2022)
20ms on my MacBook Air M1 (from 2020)

Usually, most projects quickly grow beyond just a single _test.go file.

In one project of mine, I eventually reached over 50 test functions in 25 Go packages. I stuck to the above approach of adding a custom TestMain to each package in which my tests needed PostgreSQL, and my test runtimes eventually looked like this:

# Intel Core i9 12900K
CGO_ENABLED=0 GOGC=off go test -count=1 -fullpath ./...
14,24s user 4,11s system 709% cpu 2,586 total

# MacBook Air M1
CGO_ENABLED=0 GOGC=off go test -count=1 -fullpath ./...
20,23s user 8,67s system 350% cpu 8,257 total

That’s not terrible, but not great either.

If you happen to open a process monitor while running tests, you might have noticed that there are quite a number of PostgreSQL instances running. This seems like something to optimize! Shouldn’t one PostgreSQL instance be enough for all tests of a test run?

Let’s review the process model of go test before we can talk about how to integrate with it.

go test process model

The usual command to run all tests of a Go project is go test ./... (see go help packages for details on the /... pattern syntax), which matches the Go package in the current directory and all Go packages in its subdirectories.

Each Go package (≈ directory), including _test.go files, is compiled into a separate test binary:

% go help test
[…]
'Go test' recompiles each package along with any files with names matching
the file pattern "*_test.go".
[…]
Each listed package causes the execution of a separate test binary.
[…]

These test binaries are then run in parallel. In fact, there are two levels of parallelism at play here:

All test functions (within a single test binary) that call t.Parallel() will be run in parallel (in batches of size -parallel).
go test will run different test binaries in parallel.

The documentation explains that the -parallel test flag defaults to GOMAXPROCS and references the go test parallelism:

% go help testflag
[…]
-parallel n
    Allow parallel execution of test functions that call t.Parallel, and
    fuzz targets that call t.Parallel when running the seed corpus.
    The value of this flag is the maximum number of tests to run
    simultaneously.
[…]
    By default, -parallel is set to the value of GOMAXPROCS.
    Setting -parallel to values higher than GOMAXPROCS may cause degraded
    performance due to CPU contention, especially when fuzzing.
    Note that -parallel only applies within a single test binary.
    The 'go test' command may run tests for different packages
    in parallel as well, according to the setting of the -p flag
    (see 'go help build').

The go test parallelism is controlled by the -p flag, which also defaults to GOMAXPROCS:

% go help build
[…]
-p n
	the number of programs, such as build commands or
	test binaries, that can be run in parallel.
	The default is GOMAXPROCS, normally the number of CPUs available.
[…]

To print GOMAXPROCS on a given machine, we can run a test program like this gomaxprocs.go:

package main

import "runtime"

func main() {
	print(runtime.GOMAXPROCS(0))
}

For me, GOMAXPROCS defaults to the 24 threads of my Intel Core i9 12900K CPU, which has 16 cores (8 Performance, 8 Efficiency; only the Performance cores have Hyper Threading):

% go run gomaxprocs.go
24
% grep 'model name' /proc/cpuinfo | wc -l
24

So with a single go test ./... command, we can expect 24 parallel processes each running 24 tests in parallel. With our current approach, we would start up to 24 concurrent ephemeral PostgreSQL instances (if we have that many packages), which seems wasteful to me.
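To make the arithmetic above concrete, here is a minimal sketch that computes these upper bounds. The function names are made up for illustration, and the 25 packages are just the package count of my project from above (assuming, for simplicity, that every package needs PostgreSQL):

```go
package main

import (
	"fmt"
	"runtime"
)

// maxPostgresInstances returns an upper bound on the number of ephemeral
// PostgreSQL instances running at the same time when every test binary
// starts its own instance: go test runs at most p test binaries
// concurrently, and there cannot be more test binaries than packages.
func maxPostgresInstances(packages, p int) int {
	if packages < p {
		return packages
	}
	return p
}

// maxConcurrentTests returns an upper bound on the number of test functions
// running at the same time: p test binaries, each running up to parallel
// test functions that called t.Parallel().
func maxConcurrentTests(p, parallel int) int {
	return p * parallel
}

func main() {
	n := runtime.GOMAXPROCS(0) // the default for both -p and -parallel
	fmt.Println("test binaries at once:", n)
	fmt.Println("Postgres instances for 25 packages:", maxPostgresInstances(25, n))
	fmt.Println("test functions at once (upper bound):", maxConcurrentTests(n, n))
}
```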

Starting one ephemeral PostgreSQL instance per go test run seems better.

How can we go from starting 24 Postgres instances to starting just one?

First, we need to update our test setup code to work with a passed-in database URL. For that, we switch from calling CreateDatabase to using a DBCreator for a database identified by a URL. The old code still needs to remain so that you can run a single test without bothering with PGURL:

var dbc *postgrestest.DBCreator

func TestMain(m *testing.M) {
	// It is best to specify the PGURL environment variable so that only
	// one PostgreSQL instance is used for all tests.
	pgurl := os.Getenv("PGURL")
	if pgurl == "" {
		// 'go test' was started directly, start one Postgres per process:
		pgt, err := postgrestest.Start(context.Background())
		if err != nil {
			panic(err)
		}
		defer pgt.Cleanup()
		pgurl = pgt.DefaultDatabase()
	}

	var err error
	dbc, err = postgrestest.NewDBCreator(pgurl)
	if err != nil {
		panic(err)
	}

	m.Run()
}
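If you want to check the fallback logic itself without a PostgreSQL installation, the decision can be pulled out into a small function. The following is only a sketch: resolvePGURL and its two injected parameters are hypothetical names of mine and not part of the postgrestest package.

```go
package main

import (
	"fmt"
	"os"
)

// resolvePGURL returns the database URL the tests should use and whether
// it refers to a shared instance. getenv and startLocal are injected so
// that the decision can be exercised without a running PostgreSQL.
func resolvePGURL(getenv func(string) string, startLocal func() (string, error)) (pgurl string, shared bool, err error) {
	if v := getenv("PGURL"); v != "" {
		return v, true, nil // a wrapper program provided a shared instance
	}
	// 'go test' was started directly: start one instance for this process.
	pgurl, err = startLocal()
	return pgurl, false, err
}

func main() {
	pgurl, shared, err := resolvePGURL(os.Getenv, func() (string, error) {
		// Placeholder standing in for starting an ephemeral instance.
		return "postgres://localhost/placeholder", nil
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pgurl=%s shared=%v\n", pgurl, shared)
}
```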

Inside the test function(s), we only need to update the CreateDatabase receiver name:

func TestSignupForm(t *testing.T) {
	pgurl, err := dbc.CreateDatabase(context.Background())
	if err != nil {
		t.Fatal(err)
	}
	// test goes here…
}

Then, we create a new wrapper program (e.g. internal/cmd/initpg/initpg.go) which calls postgrestest.Start and passes the PGURL environment variable to the process(es) it starts:

// initpg is a small test helper command which starts a Postgres
// instance and makes it available to the wrapped 'go test' command.
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/exec"

	"github.com/stapelberg/postgrestest"

	// Use the same database driver as in the rest of your project.
	_ "github.com/lib/pq"
)

func runWrappedCommand(pgurl string) error {
	// os.Args[0] is initpg
	// os.Args[1] is --
	// os.Args[2] is go
	// os.Args[3] is test
	// etc.
	wrapped := exec.Command(os.Args[2], os.Args[3:]...)
	wrapped.Stdin = os.Stdin
	wrapped.Stdout = os.Stdout
	wrapped.Stderr = os.Stderr
	wrapped.Env = append(os.Environ(), "PGURL="+pgurl)
	if err := wrapped.Run(); err != nil {
		return fmt.Errorf("%v: %v", wrapped.Args, err)
	}
	return nil
}

func initpg() error {
	pgt, err := postgrestest.Start(context.Background())
	// NOTE: keep reading the article, do not submit as-is
	if err != nil {
		return err
	}
	defer pgt.Cleanup()

	// Run the wrapped command ('go test', typically)
	return runWrappedCommand(pgt.DefaultDatabase())
}

func main() {
	if err := initpg(); err != nil {
		log.Fatal(err)
	}
}
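Note that runWrappedCommand indexes into os.Args without checking its length, so running initpg without arguments would panic. A possible way to validate the arguments first could look like the following sketch (wrappedCommand is a hypothetical helper name, not something the program above contains):

```go
package main

import (
	"errors"
	"fmt"
)

// wrappedCommand extracts the command to run from the arguments of the
// wrapper program, which is expected to be invoked like this:
//
//	initpg -- go test ./...
func wrappedCommand(args []string) (name string, rest []string, err error) {
	if len(args) < 3 || args[1] != "--" {
		return "", nil, errors.New("syntax: initpg -- <command> [args...]")
	}
	return args[2], args[3:], nil
}

func main() {
	// In the real program, this would be os.Args.
	args := []string{"initpg", "--", "go", "test", "./..."}
	name, rest, err := wrappedCommand(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would run:", name, rest)
}
```

The returned name and rest can then be passed to exec.Command in place of os.Args[2] and os.Args[3:].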

Running the initpg wrapper program

While we could use go run ./internal/cmd/initpg to compile and run this wrapper program, it is a bit wasteful to recompile this program over and over when it rarely changes.

One alternative is to use go install instead of go run. I have two minor concerns with that:

go install installs into the bin directory, which is ~/go/bin by default.
- This means we need to rely on the PATH environment variable containing the bin directory to run the installed program. Unfortunately, influencing or determining the go install destination path is tricky.
- It would be nice to not litter the user’s bin directory. I think the bin directory should contain programs which the user explicitly requested to install, not helper programs that are only necessary to run tests.
On my machine, go install takes about 100ms, even when nothing has changed.

I like to define a Makefile in each of my projects with a set of targets that are consistently named, e.g. make test, make push, etc. Given that I already use make, I like to set up my Makefile to build initpg in the _bin directory:

.PHONY: test

_bin/initpg: internal/cmd/initpg/initpg.go
	mkdir -p _bin
	go build -o _bin/initpg ./internal/cmd/initpg

test: _bin/initpg
	./_bin/initpg -- go test ./...

Because initpg.go rarely changes, the program will typically not need to be recompiled.

Note that this Makefile is only approximately correct: initpg’s dependency on postgrestest is not modeled, so you need to delete _bin/initpg to pick up changes to postgrestest.

Performance

Let’s compare the before and after test runtimes on the Intel Core i9 12900K:

# Intel Core i9 12900K: one Postgres for each test
CGO_ENABLED=0 GOGC=off go test -count=1 -fullpath ./...
14,24s user 4,11s system 709% cpu 2,586 total

# Intel Core i9 12900K: one Postgres shared among all tests
CGO_ENABLED=0 GOGC=off ./_bin/initpg -- go test -count=1 -fullpath ./...
11,40s user 3,10s system 659% cpu 2,199 total

For comparison, the effect is more pronounced on the MacBook Air M1:

# MacBook Air M1: one Postgres for each test
CGO_ENABLED=0 GOGC=off go test -count=1 -fullpath ./...
20,23s user 8,67s system 350% cpu 8,257 total

# MacBook Air M1: one Postgres shared among all tests
CGO_ENABLED=0 GOGC=off ./_bin/initpg -- go test -count=1 -fullpath ./...
14,25s user 4,36s system 275% cpu 6,752 total

Sharing one PostgreSQL instance has reduced the total test runtime for a full run by about 15% on the Intel machine and about 18% on the MacBook Air!

Why is it sometimes slower?

We have measurably reduced the runtime of a full test run, but if you pay close attention during development you will notice that now every test run is a full test run, even when you only change a single package!

Why can Go no longer cache any of the test results? The problem is that the PGURL environment variable has a different value on each run: the name of the temporary directory that the postgrestest package uses for its ephemeral database instance changes on each run.

The documentation on the go test caching behavior explains this in the last paragraph:

% go help test
[…]
In package list mode only, go test caches successful package test
results to avoid unnecessary repeated running of tests. When the
result of a test can be recovered from the cache, go test will
redisplay the previous output instead of running the test binary
again. When this happens, go test prints '(cached)' in place of the
elapsed time in the summary line.

The rule for a match in the cache is that the run involves the same
test binary and the flags on the command line come entirely from a
restricted set of 'cacheable' test flags, defined as -benchtime, -cpu,
-list, -parallel, -run, -short, -timeout, -failfast, -fullpath and -v.
If a run of go test has any test or non-test flags outside this set,
the result is not cached. To disable test caching, use any test flag
or argument other than the cacheable flags. The idiomatic way to disable
test caching explicitly is to use -count=1.

Tests that open files within the package's source root (usually $GOPATH)
or that consult environment variables only match future runs in which
the files and environment variables are unchanged.
[…]

(See also Go issue #22593 for more details.)

Fixing Go test caching (env vars)

For the Go test caching to work, all environment variables our tests access (including PGURL) need to contain the same value between runs. For us, this means we cannot use a randomly generated name for the Postgres data directory, but instead need to use a fixed name.

My postgrestest package offers convenient support for specifying the desired directory:

func initpg() error {
	cacheDir, err := os.UserCacheDir()
	if err != nil {
		return err
	}
	pgt, err := postgrestest.Start(context.Background(),
		postgrestest.WithDir(filepath.Join(cacheDir, "initpg.gus")))
	if err != nil {
		return err
	}
	defer pgt.Cleanup()

	// Run the wrapped command ('go test', typically)
	return runWrappedCommand(pgt.DefaultDatabase())
}

When running the tests now, starting with the second run (without any changes), you should see a “ (cached)” suffix printed behind tests that were successfully cached, and the test runtime should be much shorter — under a second in my project:

% time ./_bin/initpg -- go test -fullpath ./...
ok  	example/internal/handlers/adminhandler	(cached)
[…]
./_bin/initpg -- go test -fullpath ./...
1,30s user 0,88s system 288% cpu 0,756 total

Conclusion

In this article, I have shown how to integrate PostgreSQL into your test environment in a way that is convenient for developers, light on system resources and measurably reduces total test time.

Adopting postgrestest seems easy enough to me. If you want to see a complete example, see how I converted the gokrazy/gus repository to use postgrestest.

Further optimization potential

Now that we have a detailed understanding of the go test process model and PostgreSQL startup, we can consider further optimizations. I won’t actually implement them in this article, which is already long enough, but maybe you want to go further in your project…

Hide Postgres startup

My journey into ephemeral PostgreSQL instances started with Eric Radman’s pg_tmp shell script. Ultimately, I ended up with the postgrestest Go solution that I much prefer: I don’t need to ship (or require) the pg_tmp shell script with my projects. The fewer languages, the better.

Also, pg_tmp is not a wrapper program, which resulted in problems regarding cleanup: A wrapper program can reliably trigger cleanup when tests are done, whereas pg_tmp has to poll for activity. Polling is prone to running too quickly (cleaning up a database before tests were even started) or too slowly, requiring constant tuning.

But, pg_tmp does have quite a clever concept of preparing PostgreSQL instances in the background and thereby amortizing startup costs between test runs.

There might be an even simpler approach that could amount to the same startup latency hiding behavior: Turning the sequential startup (initpg needs to wait for PostgreSQL to start and only then can begin running go test) into parallel startup using Socket Activation.

Note that PostgreSQL does not seem to support Socket Activation natively, so probably one would need to implement a program-agnostic solution into initpg as described in this Unix Stack Exchange question or Andreas Rammhold’s blog post.

De-duplicate schema creation cost

For isolation, we use a different PostgreSQL database for every test. This means we need to initialize the database schema for each of these per-test databases.

We can eliminate this duplicative work by sharing the same database across all tests, provided we have another way of isolating the tests from each other.

The txdb package provides a standard database/sql.Driver which runs all queries of an entire test in a single transaction. Using txdb means we can now safely share the same database between tests without running into conflicts, failing tests, or needing extra locking.

Be sure to initialize the database schema before using txdb to share the database: long-running transactions need to lock the PostgreSQL catalog as soon as you change the database schema (i.e. create or modify tables), meaning only one test can run at a time. (Using go tool trace is a great way to understand such performance issues.)

I am aware that some people don’t like the transaction isolation approach. For example, Gajus Kuizinas’s blog post “Setting up PostgreSQL for running integration tests” finds that transactions don’t work in their (JavaScript) setup. I don’t share this experience at all: In Go, the txdb package works well, even with nested transactions. I have used txdb for months without problems.

In my tests, eliminating this duplicative schema initialization work saves about:

0.5s on my Intel Core i9 12900K
1s on the MacBook Air M1

Not all bugs can easily be reproduced — sometimes, all you have is a core dump from a crashing program, but no idea about the triggering conditions of the bug yet.

When using Go, we can use the delve debugger for core dump debugging, but I had trouble figuring out how to save byte slice contents (for example: the incoming request causing the crash) from memory into a file for further analysis, so this article walks you through how to do it.

Simple Example

Let’s imagine the following scenario: You are working on a performance optimization in Go Protobuf and have accidentally badly broken the proto.Marshal function. The function is now returning an error, so let’s run one of the failing tests with delve:

~/protobuf/proto master % dlv test
(dlv) b ExampleMarshal
(dlv) c
> [Breakpoint 1] google.golang.org/protobuf/proto_test.ExampleMarshal() ./encode_test.go:293 (hits goroutine(1):1 total:1) (PC: 0x9d6c96)
(dlv) next 4
> google.golang.org/protobuf/proto_test.ExampleMarshal() ./encode_test.go:297 (PC: 0xb54495)
   292: // [google.golang.org/protobuf/types/known/durationpb.New].
   293: func ExampleMarshal() {
   294: b, err := proto.Marshal(&durationpb.Duration{
   295: Nanos: 125,
   296: })
=> 297: if err != nil {
   298: panic(err)
   299: }
   300:
   301: fmt.Printf("125ns encoded into %d bytes of Protobuf wire format:\n% x\n", len(b), b)
   302:

Go Protobuf happens to return the already encoded bytes even when returning an error, so we can inspect the b byte slice to see how far the encoding got before the error happened:

(dlv) print b
[]uint8 len: 2, cap: 2, [16,125]

In this case, we can see that the entire (trivial) message was encoded, so our error must happen at a later stage — this allows us to rule out a large chunk of code in our search for the bug.

But what would we do if a longer part of the message was displayed and we wanted to load it into a different tool for further analysis, e.g. the excellent protoscope?

The low-tech approach is to print the contents and copy&paste from the delve output into an editor or similar. This stops working as soon as your data contains non-printable characters.

We have multiple options to export the byte slice to a file:

We could add os.WriteFile("/tmp/b.raw", b, 0644) to the source code and re-run the test. This is definitely the simplest option, as it works with or without a debugger.
As long as delve is connected to a running program, we can use delve’s call command to just execute the same code without having to add it to our source:
```
(dlv) call os.WriteFile("/tmp/b.raw", b, 0644)
(dlv)
```

Notably, both options only work when you can debug interactively. For the first option, you need to be able to change the source. The second option requires that delve is attached to a running process that you can afford to pause and interactively control.

These are trivial requirements when running unit tests on your local machine, but get much harder when debugging an RPC service that crashes with specific requests, as you need to only run your changed debugging code for the troublesome requests, skipping the unproblematic requests that should still be handled normally.

Core dump debugging with Go

So let’s switch examples: we are no longer working on Go Protobuf. Instead, we now need to debug an RPC service where certain requests crash the service. We’ll use core dump debugging!

In case you’re wondering: The name “core dump” comes from magnetic-core memory. These days we should probably say “memory dump” instead. The picture above shows an exhibit from the MIT Museum (Core Memory Unit, Bank C (from Project Whirlwind, 1953-1959)), a core memory unit with 4 KB of capacity.

To make Go write a core dump when panicking, run your program with the environment variable GOTRACEBACK=crash set (all possible values are documented in the runtime package).
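If you cannot easily influence the environment of your program, the runtime/debug package offers debug.SetTraceback, which accepts the same values as GOTRACEBACK (it can raise the level, but not lower it below what the environment variable specifies). A minimal sketch, with enableCrashTraceback being a name I made up:

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
)

// enableCrashTraceback raises the traceback level to "crash" from within
// the program, like starting the program with GOTRACEBACK=crash would.
func enableCrashTraceback() {
	debug.SetTraceback("crash")
}

func main() {
	enableCrashTraceback()
	fmt.Printf("GOTRACEBACK from environment: %q\n", os.Getenv("GOTRACEBACK"))
	// From here on, an unrecovered panic makes the runtime abort the
	// process, so that the operating system can collect a core dump.
}
```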

You also need to ensure your system is set up to collect core dumps, as they are typically discarded by default:

On Linux, the easiest way is to install systemd-coredump(8), after which core dumps will automatically be collected. You can use coredumpctl(1) to list and work with them.
On macOS, you can enable core dump collection, but delve cannot open macOS core dumps. Luckily, macOS is rarely used for production servers.
I don’t know about Windows and other systems.

You can find more details and options in the CoreDumpDebugging page of the Go wiki. For this article, we will stick to the coredumpctl route:

We’ll use the gRPC Go Quick start example, a greeter client/server program, and add a panic() call to the server SayHello handler:

% cd greeter_server
% go build -gcflags=all="-N -l"  # disable optimizations
% GOTRACEBACK=crash ./greeter_server
2024/10/19 21:48:01 server listening at [::]:50051
2024/10/19 21:48:03 Received: world
panic: oh no!

goroutine 5 gp=0xc000007c00 m=5 mp=0xc000100008 [running]:
panic({0x83ca60?, 0x9a3710?})
	/home/michael/sdk/go1.23.0/src/runtime/panic.go:804 +0x168 fp=0xc000169850 sp=0xc0001697a0 pc=0x46fe88
main.(*server).SayHello(0xcbb840?, {0x877200?, 0xc000094900?}, 0x4a6f25?)
	/home/michael/go/src/github.com/grpc/grpc-go/examples/helloworld/greeter_server/main.go:45 +0xbf fp=0xc0001698c0 sp=0xc000169850 pc=0x8037ff
[…]
signal: aborted (core dumped)

The last line is what we want to see: it should say “core dumped”.

We can now use coredumpctl(1) to launch delve for this program + core dump:

% coredumpctl debug --debugger=dlv --debugger-arguments=core
           PID: 1729467 (greeter_server)
           UID: 1000 (michael)
           GID: 1000 (michael)
        Signal: 6 (ABRT)
     Timestamp: Sat 2024-10-19 21:50:12 CEST (1min 49s ago)
  Command Line: ./greeter_server
    Executable: /home/michael/go/src/github.com/grpc/grpc-go/examples/helloworld/greeter_server/greeter_server
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (michael)
       Storage: /var/lib/systemd/coredump/core.greeter_server.1000.zst (present)
  Size on Disk: 204.7K
       Message: Process 1729467 (greeter_server) of user 1000 dumped core.
                
                Module /home/michael/go/src/github.com/grpc/grpc-go/examples/helloworld/greeter_server/greeter_server without build-id.
                Stack trace of thread 1729470:
                #0  0x0000000000479461 n/a (greeter_server + 0x79461)
[…]
                ELF object binary architecture: AMD x86-64

Type 'help' for list of commands.
(dlv) bt
 0  0x0000000000479461 in runtime.raise
    at /home/michael/sdk/go1.23.0/src/runtime/sys_linux_amd64.s:154
 1  0x0000000000451a85 in runtime.dieFromSignal
    at /home/michael/sdk/go1.23.0/src/runtime/signal_unix.go:942
 2  0x00000000004520e6 in runtime.sigfwdgo
    at /home/michael/sdk/go1.23.0/src/runtime/signal_unix.go:1154
 3  0x0000000000450a85 in runtime.sigtrampgo
    at /home/michael/sdk/go1.23.0/src/runtime/signal_unix.go:432
 4  0x0000000000479461 in runtime.raise
    at /home/michael/sdk/go1.23.0/src/runtime/sys_linux_amd64.s:153
 5  0x0000000000451a85 in runtime.dieFromSignal
    at /home/michael/sdk/go1.23.0/src/runtime/signal_unix.go:942
 6  0x0000000000439551 in runtime.crash
    at /home/michael/sdk/go1.23.0/src/runtime/signal_unix.go:1031
 7  0x0000000000439551 in runtime.fatalpanic
    at /home/michael/sdk/go1.23.0/src/runtime/panic.go:1290
 8  0x000000000046fe88 in runtime.gopanic
    at /home/michael/sdk/go1.23.0/src/runtime/panic.go:804
 9  0x00000000008037ff in main.(*server).SayHello
    at ./main.go:45
10  0x00000000008033a6 in google.golang.org/grpc/examples/helloworld/helloworld._Greeter_SayHello_Handler
    at /home/michael/go/src/github.com/grpc/grpc-go/examples/helloworld/helloworld/helloworld_grpc.pb.go:115
11  0x00000000007edeeb in google.golang.org/grpc.(*Server).processUnaryRPC
    at /home/michael/go/src/github.com/grpc/grpc-go/server.go:1394
12  0x00000000007f2eab in google.golang.org/grpc.(*Server).handleStream
    at /home/michael/go/src/github.com/grpc/grpc-go/server.go:1805
13  0x00000000007ebbff in google.golang.org/grpc.(*Server).serveStreams.func2.1
    at /home/michael/go/src/github.com/grpc/grpc-go/server.go:1029
14  0x0000000000477c21 in runtime.goexit
    at /home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:1700
(dlv)

Alright! Now let’s switch to frame 9 (our server’s SayHello handler) and inspect the Name field of the incoming RPC request:

(dlv) frame 9
> runtime.raise() /home/michael/sdk/go1.23.0/src/runtime/sys_linux_amd64.s:154 (PC: 0x482681)
Warning: debugging optimized function
Frame 9: ./main.go:45 (PC: aaabf8)
    40:	}
    41:	
    42:	// SayHello implements helloworld.GreeterServer
    43:	func (s *server) SayHello(_ context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
    44:		log.Printf("Received: %v", in.GetName())
=>  45:		panic("oh no!")
    46:		return &pb.HelloReply{Message: "Hello " + in.GetName()}, nil
    47:	}
    48:	
    49:	func main() {
    50:		flag.Parse()
(dlv) p in
("*google.golang.org/grpc/examples/helloworld/helloworld.HelloRequest")(0xc000120100)
*google.golang.org/grpc/examples/helloworld/helloworld.HelloRequest {
[…]
	unknownFields: []uint8 len: 0, cap: 0, nil,
	Name: "world",}

In this case, it’s easy to see that the Name field was set to world in the incoming request, but let’s assume the request contained lots of binary data that was not as easy to read or copy.

How do we write the byte slice contents to a file? In this scenario, we cannot modify the source code and delve’s call command does not work on core dumps (only when delve is attached to a running process):

(dlv) call os.WriteFile("/tmp/name.raw", in.Name, 0644)
> runtime.raise() /home/michael/sdk/go1.23.0/src/runtime/sys_linux_amd64.s:154 (PC: 0x482681)
Warning: debugging optimized function
Command failed: can not continue execution of core process

Luckily, we can extend delve with a custom Starlark function to write byte slice contents to a file.

Exporting byte slices with writebytestofile

You need a version of dlv that contains commit 52405ba. Until the commit is part of a released version, you can install the latest dlv directly from git:

% go install github.com/go-delve/delve/cmd/dlv@master

Save the following Starlark code to a file, for example ~/dlv_writebytestofile.star:

# Syntax: writebytestofile <byte slice var> <output file path>
def command_writebytestofile(args):
	var_name, filename = args.split(" ")
	s = eval(None, var_name).Variable
	mem = examine_memory(s.Base, s.Len).Mem
	write_file(filename, mem)

Then, in delve, load the Starlark code and run the function to export the byte slice contents of in.Name to /tmp/name.raw:

% coredumpctl debug --debugger=dlv --debugger-arguments=core
(dlv) frame 9
(dlv) source ~/dlv_writebytestofile.star
(dlv) writebytestofile in.Name /tmp/name.raw

Let’s verify that we got the right contents:

% hexdump -C /tmp/name.raw
00000000  77 6f 72 6c 64                                    |world|
00000005

Core dump debugging with `net/http` servers

When you want to apply the core dump debugging technique on a net/http server (instead of a gRPC server, as above), you will notice that panics in your HTTP handlers do not actually result in a core dump! This code in go/src/net/http/server.go recovers panics and logs a stack trace:

defer func() {
    if err := recover(); err != nil && err != ErrAbortHandler {
        const size = 64 << 10
        buf := make([]byte, size)
        buf = buf[:runtime.Stack(buf, false)]
        c.server.logf("http: panic serving %v: %v\n%s", c.remoteAddr, err, buf)
    }
}()

Or, in other words: the GOTRACEBACK=crash environment variable configures what happens when the program dies from an unrecovered panic, but this panic is recovered by the recover() call, so no core is dumped.
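The effect of recover() can be reproduced in isolation. The following sketch is not the net/http code, just a stripped-down stand-in (serve is a hypothetical name) that shows how a recovered panic turns into an ordinary value while the process keeps running:

```go
package main

import "fmt"

// serve calls handler the way a server that recovers panics would: the
// panic is intercepted and returned as a value instead of terminating
// the process.
func serve(handler func()) (recovered any) {
	defer func() {
		recovered = recover()
	}()
	handler()
	return nil
}

func main() {
	r := serve(func() { panic("oh no!") })
	fmt.Printf("recovered: %v (process still running, no core dump)\n", r)
}
```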

This default behavior of net/http servers is now considered regrettable but cannot be changed for compatibility. (We probably can add a struct field to optionally not recover panics, though. I’ll update this paragraph once there is a proposal.)

So, what options do we have in the meantime?

We could recover panics in our own code (before net/http’s panic handler is called), but then how do we produce a core dump from our own handler?

A closer look reveals that the Go runtime’s crash function is defined in signal_unix.go and sends signal SIGABRT with the dieFromSignal function to the current thread:

//go:nosplit
func crash() {
        dieFromSignal(_SIGABRT)
}

The default action for SIGABRT is to “terminate the process and dump core”, see signal(7).

We can follow the same strategy and send SIGABRT to our process:

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if err := recover(); err != nil {
				proc, err := os.FindProcess(syscall.Getpid())
				if err != nil {
					panic(fmt.Sprintf("could not find own process (pid %d): %v", syscall.Getpid(), err))
				}
				proc.Signal(syscall.SIGABRT)
				// Ensure the stack triggering the core dump sticks around
				proc.Wait()
			}
		}()
		// …buggy handler code goes here; for illustration we panic
		panic("this should result in a core dump")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}

There is one caveat: If you have any non-Go threads running in your program, e.g. by using cgo, they might pick up the signal, so ensure they do not install a SIGABRT handler (see also: cgo-related documentation in os/signal).

If this is a concern, you can make the above code more platform-specific and use the tgkill(2) syscall to direct the signal to the current thread, as the Go runtime does.

Conclusion

Core dump debugging can be a very useful technique to quickly make progress on otherwise hard-to-debug problems. In small environments (single to few Linux servers), core dumps are easy enough to turn on and work with, but in larger environments you might need to invest into central core dump collection.

I hope the technique shown above comes in handy when you need to work with core dumps.

I have a couple of people who are best reachable on the Signal messaging app, but not that many. This exposes me to an awkward edge case of Signal’s design decisions: Whenever I get a message (on my phone), I want to reply to it (on my laptop) only to discover that Signal has un-linked my laptop because of inactivity and won’t sync the message history from my phone to my laptop, making it impossible to quote-reply to messages.

After complaining about this on Social Media for the n-th time, I figured I’d write a quick program to run Signal once a day, so that it won’t un-link my devices because of too long a period of inactivity. (Of course, the easiest solution would be to just run Signal in the background all the time. But I don’t use Signal often enough to justify letting it drain my battery and mobile data.)

In this article, I want to share the program in case it’s useful to anyone else, and also explain how to install it on a Mac, as this kind of “do a task once a day” automation is a useful pattern.

High-level sketch

Run Signal for, say, 5 minutes.
Ensure at-most-once semantics regardless of the task scheduler. For example, if I wanted to start this program from an @reboot hook and restart my computer a few times, I don’t want the program to do anything after the first run of the day. (Similarly, an on-online hook of NetworkManager or similar software might fire once per network interface, or something like that.)
Depending on the specifics of the activation mechanism, the computer might be online or not. The program should wait for a little while, say, 10 minutes, until internet connectivity is established.
I would like to log the program’s output (and Signal’s output) for debugging.

Checking connectivity

The easiest option is to just… not do a connectivity check at all, and hope for the best. This would probably work well enough in practice, but I would like the debug logs to have a high signal-to-noise ratio: If I have to debug why Signal was unlinked despite my automation attempts, I don’t want to comb through tons of spurious log messages that were a result of being offline. So, I want to check that I’m online before even starting Signal.

The most thorough option would be to somehow ask Signal programmatically whether it can connect to its servers and then wait until it can. I don’t think Signal has such an interface, so we’ll choose a middle-ground solution and work with a stand-in.

Using HTTP for connectivity checks is an easy way in today’s world. We just need a target website that doesn’t go offline unless I want it to. So let’s just use this website! Go’s net/http package that is included in Go’s standard library makes this super easy:

func checkConnectivity() error {
	resp, err := http.Get("https://michael.stapelberg.ch/")
	if err != nil {
		return err
	}
	return resp.Body.Close() // release the connection
}

Now we just need to loop around this single connectivity check:

func waitForConnectivity(timeout time.Duration) error {
	const freq = 1 * time.Second
	for start := time.Now(); time.Since(start) < timeout; time.Sleep(freq) {
		if err := checkConnectivity(); err != nil {
			log.Printf("connectivity check failed: %v", err)
			continue
		}
		return nil // connectivity check succeeded
	}
	return fmt.Errorf("no connectivity established within %v", timeout)
}

We could improve this code to be more generally applicable by adding Exponential Backoff, but for this particular connectivity check, we should be fine even without Exponential Backoff.
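For reference, here is a minimal sketch of what the Exponential Backoff variant could look like. This is not part of my program: the function name waitWithBackoff, its parameters and the stand-in check in demo are made up for illustration. The pause between attempts doubles after each failure until it reaches a cap.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitWithBackoff calls check until it succeeds or timeout has elapsed.
// The pause between attempts starts at initial and doubles after every
// failed attempt, capped at maxDelay.
func waitWithBackoff(check func() error, timeout, initial, maxDelay time.Duration) error {
	delay := initial
	for start := time.Now(); time.Since(start) < timeout; {
		if err := check(); err == nil {
			return nil // check succeeded
		}
		time.Sleep(delay)
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
	return fmt.Errorf("no connectivity established within %v", timeout)
}

// demo uses a hypothetical stand-in check that fails twice before
// succeeding, and reports how many attempts were needed.
func demo() (int, error) {
	attempts := 0
	check := func() error {
		attempts++
		if attempts < 3 {
			return errors.New("offline")
		}
		return nil
	}
	err := waitWithBackoff(check, time.Second, time.Millisecond, 8*time.Millisecond)
	return attempts, err
}

func main() {
	fmt.Println(demo())
}
```

In the real program, you would pass checkConnectivity as the check function and use delays in the range of seconds to minutes.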

Ensuring at-most-once semantics

An easy way to implement at-most-once semantics is to delegate to the file system: we can specify the O_EXCL flag when creating our program’s log file to make the first creation attempt proceed, but any further creation attempt fail because the file already exists. We’ll then redirect the standard library’s log package output to the log file:

logFn := filepath.Join(home, "signal-keepalive", "_logs", time.Now().Format("2006-01-02")+".txt")
f, err := os.OpenFile(logFn, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0666)
if err != nil {
	if os.IsExist(err) {
		return nil // nothing to do, already ran today
	}
	return err
}
// Intentionally not closing this file so that even the log.Fatal()
// call in the calling function will end up in the log file.

log.SetOutput(f) // redirect standard library logging into this file
log.Printf("signal-keepalive, waiting for internet connectivity")

Not closing the file might seem weird at first, but remember that this is a short-lived program and the operating system closes all file handles of a process when it exits.
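If you want to convince yourself that O_EXCL behaves this way, here is a small self-contained sketch that tries to create the same file twice in a temporary directory. The helper names claim and demo are made up for this example, and unlike the program, the sketch closes the file right away because it does not log into it.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// claim tries to create path exclusively. It reports true for the first
// caller and false when the file already exists.
func claim(path string) (bool, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0o666)
	if err != nil {
		if errors.Is(err, fs.ErrExist) {
			return false, nil // somebody else was first
		}
		return false, err
	}
	return true, f.Close()
}

// demo claims the same marker file twice in a throwaway directory.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "once-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	marker := filepath.Join(dir, "2006-01-02.txt")
	first, err := claim(marker)
	if err != nil {
		return false, false, err
	}
	second, err := claim(marker)
	return first, second, err
}

func main() {
	fmt.Println(demo())
}
```

Note that os.OpenFile does not create missing parent directories, so the _logs directory needs to exist before the first run (or the program needs to call os.MkdirAll first).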

Full program code

For your convenience, here is the full program code. It contains a bunch of file system paths that you might want or need to adjust.

keepalive.go:

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"os/exec"
	"path/filepath"
	"time"
)

func checkConnectivity() error {
	resp, err := http.Get("https://michael.stapelberg.ch/")
	if err != nil {
		return err
	}
	return resp.Body.Close() // release the connection
}

func waitForConnectivity(timeout time.Duration) error {
	for start := time.Now(); time.Since(start) < timeout; time.Sleep(1 * time.Second) {
		if err := checkConnectivity(); err != nil {
			log.Printf("connectivity check failed: %v", err)
			continue
		}
		return nil // connectivity check succeeded
	}
	return fmt.Errorf("no connectivity established within %v", timeout)
}

func keepalive() error {
	// Limit to one attempt per day by exclusively creating a logfile.
	home := os.Getenv("HOME")
	if home == "" {
		home = "/Users/michael"
	}
	logFn := filepath.Join(home, "signal-keepalive", "_logs", time.Now().Format("2006-01-02")+".txt")
	f, err := os.OpenFile(logFn, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0666)
	if err != nil {
		if os.IsExist(err) {
			return nil // nothing to do, already ran today
		}
		return err
	}
	// Intentionally not closing this file so that even the log.Fatal()
	// call in the calling function will end up in the log file.

	log.SetOutput(f) // redirect standard library logging into this file
	log.Printf("signal-keepalive, waiting for internet connectivity")

	// Wait for network connectivity
	if err := waitForConnectivity(10 * time.Minute); err != nil {
		return err
	}

	// Start signal
	log.Printf("connectivity verified, starting signal")
	signal := exec.Command("/Applications/Signal.app/Contents/MacOS/Signal", "--start-in-tray")
	signal.Stdout = f
	signal.Stderr = f
	if err := signal.Start(); err != nil {
		return err
	}

	// Wait for some time to give Signal a chance to synchronize messages.
	const signalWaitTime = 5 * time.Minute
	log.Printf("giving signal %v to sync messages", signalWaitTime)
	time.Sleep(signalWaitTime)

	// Stop signal
	log.Printf("killing signal")
	if err := signal.Process.Kill(); err != nil {
		return err
	}
	log.Printf("waiting for signal")
	log.Printf("signal returned: %v", signal.Wait())
	log.Printf("all done")

	return f.Sync()
}

func main() {
	if err := keepalive(); err != nil {
		log.Fatal(err)
	}
}

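(If you’re unfamiliar with Go: use go build -o signalkeepalive keepalive.go to compile. The -o flag names the binary signalkeepalive, which is the file name the configuration below refers to.)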

macOS installation: launchd

The corresponding piece of infrastructure to systemd on Linux is called launchd on macOS. Aside from managing daemon processes, launchd also supports time-triggered program execution, specifically via the StartCalendarInterval configuration option.

I followed Alvin Alexander’s blog post about launchd StartCalendarInterval examples and decided to configure my program to run at 08:03 each day:

net.zekjur.signalkeepalive.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>StartCalendarInterval</key>
    <dict>
      <key>Hour</key>
      <integer>8</integer>
      <key>Minute</key>
      <integer>3</integer>
    </dict>
    <key>Label</key>
    <string>net.zekjur.signalkeepalive</string>
    <key>Program</key>
    <string>/Users/michael/signal-keepalive/signalkeepalive</string>
</dict>
</plist>

What happens when my computer isn’t running at 08:03, for example because the lid is closed? Apple documents the behavior in the launchd.plist(5) man page:

Unlike cron which skips job invocations when the computer is asleep, launchd will start the job the next time the computer wakes up.

To install and test this configuration:

Copy the plist file to ~/Library/LaunchAgents
Run launchctl load ~/Library/LaunchAgents/net.zekjur.signalkeepalive.plist
Run launchctl start net.zekjur.signalkeepalive

In practice

It’s interesting to see this behavior in practice. Take note of the time stamps in the following log. The computer was not running at 08:03. At 08:18, it woke up to update background information (Apple calls this “Power Nap”), and then it suspended again (while Signal was running) until it woke up at 08:47 again:

2024/09/07 08:18:10 signal-keepalive, waiting for internet connectivity
2024/09/07 08:18:11 connectivity verified, starting signal
2024/09/07 08:18:11 giving signal 5m0s to sync messages
Set Windows Application User Model ID (AUMID) { AUMID: 'org.whispersystems.signal-desktop' }
NODE_ENV production
NODE_CONFIG_DIR /Applications/Signal.app/Contents/Resources/app.asar/config
NODE_CONFIG {}
ALLOW_CONFIG_MUTATIONS undefined
HOSTNAME m1a.fritz.box
NODE_APP_INSTANCE undefined
SUPPRESS_NO_CONFIG_WARNING undefined
SIGNAL_ENABLE_HTTP undefined
userData: /Users/michael/Library/Application Support/Signal
config/get: Successfully read user config file
config/get: Successfully read ephemeral config file
2024/09/07 08:47:31 killing signal
2024/09/07 08:47:31 waiting for signal
2024/09/07 08:47:31 signal returned: signal: killed
2024/09/07 08:47:31 all done

Linux installation: systemd

With systemd, we need two units. First, a signal-keepalive.service unit to declare which program should be run:

cat > ~/.config/systemd/user/signal-keepalive.service <<'EOT'
[Unit]
Description=signal keepalive
After=network.target

[Service]
Type=oneshot
ExecStart=/home/michael/signal-keepalive/signalkeepalive

[Install]
WantedBy=default.target
EOT

And secondly, a signal-keepalive.timer unit which automatically starts the signal-keepalive.service every day:

cat > ~/.config/systemd/user/signal-keepalive.timer <<'EOT'
[Unit]
Description=signal keepalive

[Timer]
Persistent=true
OnCalendar=daily

[Install]
WantedBy=timers.target
EOT

The Persistent=true line is important so that the program is still run later if the computer was asleep when the timer would have fired.

Let’s enable the timer:

systemctl --user enable --now signal-keepalive.timer

For an initial test run, we can start the .service directly:

systemctl --user restart signal-keepalive.service

Conclusion

It’s silly that I need to go through so much trouble just because I don’t use Signal enough.

I also don’t understand why Signal can’t just sync message history from my phone to my computer when linking. WhatsApp and Telegram have no trouble doing it.

Either way, I thought this was a fun little refresher on automating periodic jobs.

When I saw the first reviews of the ASRock DeskMini X600 barebone, I was immediately interested in building a home-lab hypervisor (VM host) with it. Apparently, the DeskMini X600 uses less than 10W of power but supports latest-generation AMD CPUs like the Ryzen 7 8700G!

Sounds like the perfect base for a power-efficient, always-on VM host that still provides enough compute power (and fast disks!) to be competitive with commercial VM offerings. In this article, I’ll show how I built and set up my DIY self-hosting VM host.

Component List

The term “barebone” means that the machine comes without CPU, RAM and disk. You only get a case with a mainboard and power supply, the rest is up to you. I chose the following parts:

Price	Type	Article
215 EUR	barebone	ASRock DeskMini X600
293 CHF	CPU	AMD Ryzen 7 8700G (AM5, 4.20 GHz, 8 Core)
48 CHF	CPU fan	Noctua NH-L9a-AM5 (37 mm)
195 CHF	RAM	Kingston FURY Impact (2 x 32GB, DDR5-5600 SO-DIMM)
218 CHF	SSD	2 x Samsung 980 Pro (1000 GB, M.2 2280) (for RAID-1)

Total cost: 969 CHF

The CPU fan is not strictly required (the DeskMini X600 already comes with a fan), but I wanted the best cooling performance at lowest noise levels, so Noctua it is.

~~I read that the machine should support ECC RAM, too~~. Update: The Ryzen 8700G does not support ECC-RAM after all. Only the Ryzen 7 PRO 8700G supports ECC-RAM.

It took me about an hour to assemble the parts. Note that the M.2 SSD screws might seem a little hard to screw in, but don’t be deterred by that. When first powering on the system, be patient as the memory training will take a minute or so, during which the screen will stay black.

UEFI Setup

The UEFI on the DeskMini X600 comes with reasonable defaults.

The CPU fan setting already defaults to “Silent Mode”, for example.

I changed the following option, which is typical for server usage:

Advanced → ACPI Configuration → Restore on AC/Power Loss: Power On

And I disabled the onboard devices I know I won’t need, just in case it saves power:

Advanced → Onboard Devices Configuration → Onboard HD Audio: Disabled
SATA3 Controller: Disabled

Operating System Setup

I want to run this machine as a VM hypervisor. The easiest way that I know to set up such a hypervisor is to install Proxmox, an open source virtualization appliance based on Debian.

I booted the machine with the Proxmox installer copied to a USB memory stick, then selected ZFS in a RAID-1 configuration. The setup worked smoothly and was done in a few minutes.

Then, I set up Tailscale as recommended and used tailscale serve so that I can access the Proxmox web interface on its Tailscale hostname via HTTPS, instead of having to deal with certificates and custom ports:

pve# curl -fsSL https://tailscale.com/install.sh | sh
pve# tailscale up
[…]
  follow instructions and disable key expiration
[…]
pve# tailscale serve --bg https+insecure://localhost:8006

(Of course I’ll also install Tailscale on each VM running on the host.)

Now I can log into the Proxmox web interface from anywhere without certificate warnings:

In this screenshot, I have already created 2 VMs (“batch” and “web”) using the “Create VM” button at the top right. Proxmox allows controlling the installer via its “Console” tab and once set up, the VM shows up in the same network that the hypervisor is connected to with a MAC address from the “Proxmox Server Solutions GmbH” range. That’s pretty much all there is to it.

I don’t have enough nodes for advanced features like clustering, but I might investigate whether I want to set up backups on the Proxmox layer or keep doing them on the OS layer.

Fan speed monitoring

Sven Geggus shared how to make the fan speed sensors work in current versions of Debian:

pve# echo "options nct6683 force=1" >> /etc/modprobe.d/sensors.conf
pve# echo nct6683 >> /etc/modules-load.d/sensors.conf
pve# modprobe nct6683
pve# systemctl restart prometheus-node-exporter

Power Usage

The power usage values I measure are indeed excellent: The DeskMini X600 with Ryzen 7 8700G consumes less than 10W (idle)! When the machine has something to do, it spikes up to 50W:

Noise

ASRock explicitly lists the Noctua NH-L9a-AM5 as compatible with the DeskMini X600, which was one of the factors that made me select this barebone. Installing the fan was easy.

Fan noise is very low, as expected with Noctua. I can’t hear the device even when it is standing in front of me on my desk. Of course, under heavy load, the fan will be audible. This is an issue with all small form-factor PCs, as they just don’t have enough case space to swallow more noise.

Aside from the fan noise, if you hold your ear directly next to the X600, you can hear the usual electrical component noise (not coil whine per se, but that sort of thing).

I recommend positioning this device under a desk, or on a shelf, or similar.

Performance comparison

You can find synthetic benchmark results for the Ryzen 8700G elsewhere, so as usual, I will write about the specific angle I care about: How fast can this machine handle Go workloads?

Compiling Go 1.22.4

On the Ryzen 8700G, we can compile Go 1.22.4 in a little under 40 seconds:

% time ./make.bash
[…]
./make.bash  208,55s user 36,96s system 631% cpu 38,896 total

For comparison, my 2022 high-end Linux PC with Core i9-12900K is only a few seconds faster:

% time ./make.bash
[…]
./make.bash  207,33s user 29,55s system 685% cpu 34,550 total

Go HTTP and JSON benchmarks

I also ran the HTTP and JSON benchmarks from Go’s x/benchmarks repository.

Compared to the Virtual Server I’m currently renting, the Ryzen 8700G is more than twice as fast:

% benchstat rentedvirtual ryzen8700g 
name    old time/op                  new time/op                  delta
HTTP-2  28.5µs ± 2%                  10.2µs ± 1%  -64.17%  (p=0.008 n=5+5)
JSON-2  24.1ms ±29%                   9.4ms ± 1%  -61.06%  (p=0.008 n=5+5)
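To relate the delta column to “more than twice as fast”, here is the arithmetic as a small sketch. The helper names speedup and delta are made up for this example, and the inputs are the rounded means from the table above, so the computed deltas come out close to, but not exactly at, the values benchstat printed from the raw measurements.

```go
package main

import "fmt"

// speedup returns how many times faster the new measurement is.
func speedup(oldTime, newTime float64) float64 {
	return oldTime / newTime
}

// delta returns the relative change from old to new in percent.
func delta(oldTime, newTime float64) float64 {
	return (newTime - oldTime) / oldTime * 100
}

func main() {
	// Rounded means from the table above: HTTP in µs, JSON in ms.
	fmt.Printf("HTTP: %.2fx faster, delta %.1f%%\n", speedup(28.5, 10.2), delta(28.5, 10.2))
	fmt.Printf("JSON: %.2fx faster, delta %.1f%%\n", speedup(24.1, 9.4), delta(24.1, 9.4))
}
```

A delta of -50% corresponds to exactly twice as fast, so anything beyond -50% means more than twice as fast.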

Of course, the Intel i9 12900K is still a bit faster — how much depends on the specific workload:

% benchstat ryzen8700g i9_12900k 
name    old time/op                  new time/op                  delta
HTTP-2  10.2µs ± 1%                   7.6µs ± 1%  -25.13%  (p=0.008 n=5+5)
JSON-2  9.40ms ± 1%                  9.23ms ± 1%   -1.82%  (p=0.008 n=5+5)

Conclusion

What a delightful little Mini-PC! It’s modern enough to house the current generation of CPUs, compact enough to fit in well anywhere, yet just large enough to fit a Noctua CPU cooler for super-quiet operation. The low power draw makes it acceptable to run this machine 24/7.

Paired with 64 GB of RAM and large, fast NVMe disks, this machine packs a punch and will easily power your home automation, home lab, hobby project, small office server, etc.

If a Raspberry Pi isn’t enough for your needs, check out the DeskMini X600, or perhaps its larger variant, the DeskMeet X600 which is largely identical, but comes with a PCIe slot.

If this one doesn’t fit your needs, keep looking: there are many more mini PCs on the market. Check out ServeTheHome’s “Project TinyMiniMicro” for a lot more reviews.

Update: Apparently ASRock is releasing their X600 mainboard as a standalone product, too, if you like the electronics but not the form factor.

Sometimes, you need to be able to constrain a type-parameter with a method, but that method should be defined on the pointer type. For example, say you want to parse some bytes using JSON and pass the result to a handler. You might try to write this as

func Handle[M json.Unmarshaler](b []byte, handler func(M) error) error {
	var m M
	if err := m.UnmarshalJSON(b); err != nil {
		return err
	}
	return handler(m)
}

However, this code does not work. Say you have a type Message, which implements json.Unmarshaler with a pointer receiver (it needs to use a pointer receiver, as it needs to be able to modify data):

If you try to call Handle[Message], you get a compiler error (playground). That is because Message does not implement json.Unmarshaler, only *Message does.
If you try to call Handle[*Message], the code panics (playground), because var m M creates a *Message and initializes that to nil. You then call UnmarshalJSON with a nil receiver.

Neither of these options works. You really want to rewrite Handle, so that it says that the pointer to its type parameter implements json.Unmarshaler. And this is how to do that (playground):

type Unmarshaler[M any] interface {
	*M
	json.Unmarshaler
}

func Handle[M any, PM Unmarshaler[M]](b []byte, handler func(M) error) error {
	var m M
	// note: you need PM(&m), as the compiler can not infer (yet) that you can
	// call the method of PM on a pointer to M.
	if err := PM(&m).UnmarshalJSON(b); err != nil {
		return err
	}
	return handler(m)
}
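To see the pattern end to end, here is a self-contained sketch that can be run as-is. The Message type with its single Text field and the demo function are made up for this example. Only M is spelled out at the call site, PM is inferred as *Message from its constraint.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical example type.
type Message struct {
	Text string
}

// UnmarshalJSON uses a pointer receiver because it modifies m.
func (m *Message) UnmarshalJSON(b []byte) error {
	return json.Unmarshal(b, &m.Text)
}

type Unmarshaler[M any] interface {
	*M
	json.Unmarshaler
}

func Handle[M any, PM Unmarshaler[M]](b []byte, handler func(M) error) error {
	var m M
	if err := PM(&m).UnmarshalJSON(b); err != nil {
		return err
	}
	return handler(m)
}

// demo parses a JSON string into a Message and returns its text.
func demo() (string, error) {
	var got string
	err := Handle[Message]([]byte(`"hello"`), func(m Message) error {
		got = m.Text
		return nil
	})
	return got, err
}

func main() {
	fmt.Println(demo())
}
```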

I maintain two builds of the Linux kernel, a linux/arm64 build for gokrazy, my Go appliance platform, which started out on the Raspberry Pi, and then a linux/amd64 one for router7, which runs on PCs.

The update process for both of these builds is entirely automated, meaning new Linux kernel releases are automatically tested and merged, but recently the continuous integration testing failed to automatically merge Linux 6.7. This article is about tracking down the root cause of that failure.

Background info on the bootloader

gokrazy started out targeting only the Raspberry Pi, where you configure the bootloader with a plain text file on a FAT partition, so we did not need to include our own UEFI/MBR bootloader.

When I ported gokrazy to work on PCs in BIOS mode, I decided against complicated solutions like GRUB — I really wasn’t looking to maintain a GRUB package. Just keeping GRUB installations working on my machines is enough work. The fact that GRUB consists of many different files (modules) that can go out of sync really does not appeal to me.

Instead, I went with Sebastian Plotz’s Minimal Linux Bootloader because it fits entirely into the Master Boot Record (MBR) and does not require any files. In bootloader lingo, this is a stage1-only bootloader. You don’t even need a C compiler to compile its (Assembly) code. It seemed simple enough to integrate: just write the bootloader code into the first sector of the gokrazy disk image; done. The bootloader had its last release in 2012, so no need for updates or maintenance.

You can’t really implement booting a kernel and parsing text configuration files in 446 bytes of 16-bit 8086 assembly instructions, so to tell the bootloader where on disk to load the kernel code and kernel command line from, gokrazy writes the disk offset (LBA) of vmlinuz and cmdline.txt to the last bytes of the bootloader code. Because gokrazy generates the FAT partition, we know there is never any fragmentation, so the bootloader does not need to understand the FAT file system.
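To illustrate the idea, here is a rough sketch of patching two sector numbers into a bootloader image. This is not the actual gokrazy code: the layout (two little-endian 32-bit values in the last 8 bytes of the 446-byte code area), the 512-byte sector size, the byte offsets in demo and all function names are assumptions for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	sectorSize = 512 // assumed sector size in bytes
	codeSize   = 446 // size of the MBR bootloader code area
)

// lbaOf converts a byte offset in the disk image to a sector number (LBA).
func lbaOf(byteOffset int64) uint32 {
	return uint32(byteOffset / sectorSize)
}

// patchLBAs stores both sector numbers as little-endian 32-bit values in
// the last 8 bytes of the code area. This layout is hypothetical.
func patchLBAs(mbr []byte, vmlinuzLBA, cmdlineLBA uint32) error {
	if len(mbr) < codeSize {
		return errors.New("mbr: buffer shorter than bootloader code area")
	}
	binary.LittleEndian.PutUint32(mbr[codeSize-8:], vmlinuzLBA)
	binary.LittleEndian.PutUint32(mbr[codeSize-4:], cmdlineLBA)
	return nil
}

// demo patches an empty sector and reads the values back.
func demo() (uint32, uint32, error) {
	mbr := make([]byte, sectorSize)
	// Made-up byte offsets of the two files within the disk image.
	if err := patchLBAs(mbr, lbaOf(8*1024*1024), lbaOf(4*1024*1024)); err != nil {
		return 0, 0, err
	}
	vmlinuz := binary.LittleEndian.Uint32(mbr[codeSize-8:])
	cmdline := binary.LittleEndian.Uint32(mbr[codeSize-4:])
	return vmlinuz, cmdline, nil
}

func main() {
	fmt.Println(demo())
}
```

The bootloader then only needs to read these fixed locations and issue BIOS disk reads starting at the given sectors, without knowing anything about FAT.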

Symptom

The symptom was that the rtr7/kernel pull request #434 for updating to Linux 6.7 failed.

My continuous integration tests run in two environments: a physical embedded PC from PC Engines (apu2c4) in my living room, and a virtual QEMU PC. Only the QEMU test failed.

On the physical PC Engines apu2c4, the pull request actually passed the boot test. It would be wrong to draw conclusions like “the issue only affects QEMU” from this, though, as later attempts to power on the apu2c4 showed the device boot-looping. I made a mental note that something is different about how the problem affects the two environments, but both are affected, and decided to address the failure in QEMU first, then think about the PC Engines failure some more.

In QEMU, the output I see is:

SeaBIOS (version Arch Linux 1.16.3-1-1)

iPXE (http://ipxe.org) 00:03.0 C900 PCI2.10 PnP PMM+06FD3360+06F33360 C900

Booting from Hard Disk...

Notably, the kernel doesn’t even seem to start — no “Decompressing linux” message is printed, the boot just hangs. I tried enabling debug output in SeaBIOS and eventually succeeded, but only with an older QEMU version:

Booting from Hard Disk...
Booting from 0000:7c00
In resume (status=0)
In 32bit resume
Attempting a hard reboot

This doesn’t tell me anything unfortunately.

Okay, so something about introducing Linux 6.7 into my setup breaks MBR boot.

I figured using Git Bisection should identify the problematic change within a few iterations, so I cloned the currently working Linux 6.6 source code, applied the router7 config and compiled it.

To my surprise, even my self-built Linux 6.6 kernel would not boot! 😲

Why does the router7 build work when built inside the Docker container, but not when built on my Linux installation? I decided to rebase the Docker container from Debian 10 (buster, from 2019) to Debian 12 (bookworm, from 2023) and that resulted in a non-booting kernel, too!

We have two triggers: building Linux 6.7 or building older Linux, but in newer environments.

Meta: Following Along

(Contains spoilers) Instructions for following along

First, check out the rtr7/kernel repository and undo the mitigation:

% mkdir -p go/src/github.com/rtr7/
% cd go/src/github.com/rtr7/
% git clone --depth=1 https://github.com/rtr7/kernel
% cd kernel
% sed -i 's,CONFIG_KERNEL_ZSTD,#CONFIG_KERNEL_ZSTD,g' cmd/rtr7-build-kernel/config.addendum.txt
% go run ./cmd/rtr7-rebuild-kernel
# takes a few minutes to compile Linux
% ls -l vmlinuz
-rw-r--r-- 1 michael michael 15885312 2024-01-28 16:18 vmlinuz

Now, you can either create a new gokrazy instance, replace the kernel and configure the gokrazy instance to use rtr7/kernel:

% gok -i mbr new
% gok -i mbr add .
% gok -i mbr edit
# Adjust to contain:
    "KernelPackage": "github.com/rtr7/kernel",
    "FirmwarePackage": "github.com/rtr7/kernel",
    "EEPROMPackage": "",

…or you skip these steps and extract my already prepared config to ~/gokrazy/mbr.

Then, build the gokrazy disk image and start it with QEMU:

% GOARCH=amd64 gok -i mbr overwrite \
  --full /tmp/gokr-boot.img \
  --target_storage_bytes=1258299392
% qemu-system-i386 \
  -nographic \
  -drive file=/tmp/gokr-boot.img,format=raw

Up/Downgrade Versions

Unlike application programs, the Linux kernel doesn’t depend on shared libraries at runtime, so the dependency footprint is a little smaller than usual. The most significant dependencies are the components of the build environment, like the C compiler or the linker.

So let’s look at the software versions of the known-working (Debian 10) environment and the smallest change we can make to that (upgrading to Debian 11):

Debian 10 (buster) contains gcc-8 (8.3.0-6) and binutils 2.31.1-16.
Debian 11 (bullseye) contains gcc-10 (10.2.1-6) and binutils 2.35.2-2.

To figure out if the problem is triggered by GCC, binutils, or something else entirely, I checked:

Debian 10 (buster) with its gcc-8, but with binutils 2.35 from bullseye still works. (Checked by updating /etc/apt/sources.list, then upgrading only the binutils package.)

Debian 10 (buster), but with gcc-10 and binutils 2.35 results in a non-booting kernel.

So it seems like upgrading from GCC 8 to GCC 10 triggers the issue.

Instead of working with a Docker container and Debian’s packages, you could also use Nix. The instructions aren’t easy, but I used nix-shell to quickly try out GCC 8 (works), GCC 9 (works) and GCC 10 (kernel doesn’t boot) on my machine.

New Hypothesis

To recap, we have two triggers: building Linux 6.7 or building older Linux, but with GCC 10.

Two theories seemed most plausible to me at this point: Either a change in GCC 10 (possibly enabled by another change in Linux 6.7) is the problem, or the size of the kernel is the problem.

To verify the file size hypothesis, I padded a known-working vmlinuz file to the size of a known-broken vmlinuz:

% ls -l vmlinuz
% dd if=/dev/zero bs=108352 count=1 >> vmlinuz

But, even though it had the same file size as the known-broken kernel, the padded kernel booted!
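If you prefer to script the padding step instead of calculating the dd block size by hand, it could look like the following sketch. The helper name padTo and the sizes used in demo are made up for this example, and demo works on a throwaway file rather than a real kernel.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// padTo appends zero bytes to the file at path until it is size bytes
// long. Files that are already large enough are left untouched.
func padTo(path string, size int64) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_APPEND, 0)
	if err != nil {
		return err
	}
	st, err := f.Stat()
	if err != nil {
		f.Close()
		return err
	}
	if missing := size - st.Size(); missing > 0 {
		if _, err := f.Write(make([]byte, missing)); err != nil {
			f.Close()
			return err
		}
	}
	return f.Close()
}

// demo pads a small stand-in file to 4096 bytes and returns the new size.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "pad-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "vmlinuz")
	if err := os.WriteFile(path, []byte("kernel"), 0o644); err != nil {
		return 0, err
	}
	if err := padTo(path, 4096); err != nil {
		return 0, err
	}
	st, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return st.Size(), nil
}

func main() {
	fmt.Println(demo())
}
```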

So I ruled out kernel size as a problem and started researching significant changes in GCC 10.

I read that GCC 10 changed behavior with regards to stack protection.

Indeed, building the kernel with Debian 11 (bullseye), but with CONFIG_STACKPROTECTOR=n makes it boot. So, I suspected that our bootloader does not set up the stack correctly, or similar.

I sent an email to Sebastian Plotz, the author of the Minimal Linux Bootloader, to ask if he knew about any issues with his bootloader, or if stack protection seemed like a plausible culprit to him.

To my surprise (it has been over 10 years since he published the bootloader!) he actually replied: He hadn’t received any problem reports regarding his bootloader, but didn’t really understand how stack protection would be related.

Debugging with QEMU

At this point, we have isolated at least one trigger for the problem, and exhausted the easy techniques of upgrading/downgrading surrounding software versions and asking upstream.

It’s time for a Tooling Level Up! Without a debugger you can only poke into the dark, which takes time and doesn’t result in thorough explanations. Particularly in this case, I think it is very likely that any source modifications could have introduced subtle issues. So let’s reach for a debugger!

Luckily, QEMU comes with built-in support for the GDB debugger. Just add the -s -S flags to your QEMU command to make QEMU set up a GDB stub listening on localhost:1234 (-s) and stop execution at startup until GDB tells it to continue (-S).

If you wanted to debug the Linux kernel, you could connect GDB to QEMU right away, but for debugging a boot loader we need an extra step, because the boot loader runs in Real Mode, but QEMU’s GDB integration rightfully defaults to the more modern Protected Mode.

When GDB is not configured correctly, it decodes addresses and registers with the wrong size, which throws off the entire disassembly — compare GDB’s output with our assembly source:

(gdb) b *0x7c00
(gdb) c
(gdb) x/20i $pc                         ; [expected (bootloader.asm)]
=> 0x7c00: cli                          ; => 0x7c00: cli
   0x7c01: xor    %eax,%eax             ;    0x7c01: xor %ax,%ax
   0x7c03: mov    %eax,%ds              ;    0x7c03: mov %ax,%ds
   0x7c05: mov    %eax,%ss              ;    0x7c05: mov %ax,%ss
   0x7c07: mov    $0xb87c00,%esp        ;    0x7c07: mov $0x7c00,%sp
   0x7c0c: adc    %cl,-0x47990440(%esi) ;    0x7c0a: mov $0x1000,%ax
   0x7c12: add    %eax,(%eax)           ;    0x7c0d: mov %ax,%es
   0x7c14: add    %al,(%eax)            ;    0x7c0f: sti
   0x7c16: xor    %ebx,%ebx

So we need to ensure we use qemu-system-i386 (qemu-system-x86_64 prints Remote 'g' packet reply is too long) and configure the GDB target architecture to 16-bit 8086:

(gdb) set architecture i8086
(gdb) target remote localhost:1234

Unfortunately, the above doesn’t actually work in QEMU 2.9 and newer: https://gitlab.com/qemu-project/qemu/-/issues/141.

On the web, people are working around this bug by using a modified target.xml file. I tried this, but must have made a mistake — I thought modifying target.xml didn’t help, but when I wrote this article, I found that it does actually seem to work. Maybe I didn’t use qemu-system-i386 but the x86_64 variant or something like that.

Using an older QEMU

It is typically an exercise in frustration to get older software to compile in newer environments.

It’s much easier to use an older environment to run old software.

By querying packages.debian.org, we can see the QEMU versions included in current and previous Debian versions.

Unfortunately, the oldest listed version (QEMU 3.1 in Debian 10 (buster)) isn’t old enough. By querying snapshot.debian.org, we can see that Debian 9 (stretch) contained QEMU 2.8.

So let’s run Debian 9 — the easiest way I know is to use Docker:

% docker run --net=host -v /tmp:/tmp -ti debian:stretch

Unfortunately, the debian:stretch Docker container does not work out of the box anymore, because its /etc/apt/sources.list points to the deb.debian.org CDN, which only serves current versions and no longer serves stretch.

So we need to update the sources.list file to point to archive.debian.org. To correctly install QEMU you need both entries, the debian line and the debian-security line, because the Docker container has packages from debian-security installed and gets confused when these are missing from the package list:

root@650a2157f663:/# cat > /etc/apt/sources.list <<'EOT'
deb http://archive.debian.org/debian/ stretch contrib main non-free
deb http://archive.debian.org/debian-security/ stretch/updates main
EOT
root@650a2157f663:/# apt update

Now we can just install QEMU as usual and start it to debug our boot process:

root@650a2157f663:/# apt install qemu-system-x86
root@650a2157f663:/# qemu-system-i386 \
  -nographic \
  -drive file=/tmp/gokr-boot.img,format=raw \
  -s -S

Now let’s start GDB and set a breakpoint on address 0x7c00, which is the address to which the BIOS loads the MBR code and starts execution:

% gdb
(gdb) set architecture i8086
The target architecture is set to "i8086".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb) break *0x7c00
Breakpoint 1 at 0x7c00
(gdb) continue
Continuing.

Breakpoint 1, 0x00007c00 in ?? ()
(gdb)

Debug symbols

Okay, so we have GDB attached to QEMU and can step through assembly instructions. Let’s start debugging!?

Not so fast. There is another Tooling Level Up we need first: debug symbols. Yes, even for a Minimal Linux Bootloader, which doesn’t use any libraries or local variables. Having proper names for functions, as well as line numbers, will be hugely helpful in just a second.

Before debug symbols, I would directly build the bootloader using nasm bootloader.asm, but to end up with a symbol file for GDB, we need to instruct nasm to generate an ELF file with debug symbols, then use ld to link it and finally use objcopy to copy the code out of the ELF file again.

After commit d29c615 in gokrazy/internal/mbr, I have bootloader.elf.

Back in GDB, we can load the symbols using the symbol-file command:

(gdb) set architecture i8086
The target architecture is set to "i8086".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb) symbol-file bootloader.elf
Reading symbols from bootloader.elf...
(gdb) break *0x7c00
Breakpoint 1 at 0x7c00: file bootloader.asm, line 48.
(gdb) continue
Continuing.

Breakpoint 1, ?? () at bootloader.asm:48
48		cli
(gdb)

Automation with .gdbinit

At this point, we need 4 commands each time we start GDB. We can automate these by writing them to a .gdbinit file:

% cat > .gdbinit <<'EOT'
set architecture i8086
target remote localhost:1234
symbol-file bootloader.elf
break *0x7c00
EOT

% gdb
The target architecture is set to "i8086".
0x0000fff0 in ?? ()
Breakpoint 1 at 0x7c00: file bootloader.asm, line 48.
(gdb)

Understanding program flow

The easiest way to understand program flow seems to be to step through the program.

But Minimal Linux Bootloader (MLB) contains loops that run through thousands of iterations. You can’t use gdb’s stepi command with that.

Because MLB only contains a few functions, I eventually realized that placing a breakpoint on each function would be the quickest way to understand the high-level program flow:

(gdb) b read_kernel_setup
Breakpoint 2 at 0x7c38: file bootloader.asm, line 75.
(gdb) b check_version
Breakpoint 3 at 0x7c56: file bootloader.asm, line 88.
(gdb) b read_protected_mode_kernel
Breakpoint 4 at 0x7c8f: file bootloader.asm, line 105.
(gdb) b read_protected_mode_kernel_2
Breakpoint 5 at 0x7cd6: file bootloader.asm, line 126.
(gdb) b run_kernel
Breakpoint 6 at 0x7cff: file bootloader.asm, line 142.
(gdb) b error
Breakpoint 7 at 0x7d51: file bootloader.asm, line 190.
(gdb) b reboot
Breakpoint 8 at 0x7d62: file bootloader.asm, line 204.

With the working kernel, we get the following transcript:

(gdb)
Continuing.

Breakpoint 2, read_kernel_setup () at bootloader.asm:75
75		xor	eax, eax
(gdb)
Continuing.

Breakpoint 3, check_version () at bootloader.asm:88
88		cmp	word [es:0x206], 0x204		; we need protocol version >= 2.04
(gdb)
Continuing.

Breakpoint 4, read_protected_mode_kernel () at bootloader.asm:105
105		mov	edx, [es:0x1f4]			; edx stores the number of bytes to load
(gdb)
Continuing.

Breakpoint 5, read_protected_mode_kernel_2 () at bootloader.asm:126
126		mov	eax, edx
(gdb)
Continuing.

Breakpoint 6, run_kernel () at bootloader.asm:142
142		cli
(gdb)

With the non-booting kernel, we get:

(gdb) c
Continuing.

Breakpoint 1, ?? () at bootloader.asm:48
48		cli
(gdb)
Continuing.

Breakpoint 2, read_kernel_setup () at bootloader.asm:75
75		xor	eax, eax
(gdb)
Continuing.

Breakpoint 3, check_version () at bootloader.asm:88
88		cmp	word [es:0x206], 0x204		; we need protocol version >= 2.04
(gdb)
Continuing.

Breakpoint 4, read_protected_mode_kernel () at bootloader.asm:105
105		mov	edx, [es:0x1f4]			; edx stores the number of bytes to load
(gdb)
Continuing.

Breakpoint 1, ?? () at bootloader.asm:48
48		cli
(gdb)

Okay! Now we see that the bootloader starts loading the kernel from disk into RAM, but doesn’t actually get far enough to call run_kernel, meaning the problem isn’t with stack protection, with loading a working command line or with anything inside the Linux kernel.

This lets us rule out a large part of the problem space. We now know that we can focus entirely on the bootloader and why it cannot load the Linux kernel into memory.

Let’s take a closer look…

Wait, this isn’t GDB!

In the example above, using breakpoints was sufficient to narrow down the problem.

You might think we used GDB, and it looked like this:

But that’s not GDB! It’s an easy mistake to make. After all, GDB starts up with just a text prompt, and as you can see from the example above, we can just enter text and achieve a good result.

To see the real GDB, you need to start it up fully, meaning including its user interface.

You can either use GDB’s text user interface (TUI), or a graphical user interface for gdb, such as the one available in Emacs.

The GDB text-mode user interface (TUI)

You’re already familiar with the architecture, target and breakpoint commands from above. To also set up the text-mode user interface, we run a few layout commands:

(gdb) set architecture i8086
(gdb) target remote localhost:1234
(gdb) symbol-file bootloader.elf
(gdb) layout split
(gdb) layout src
(gdb) layout regs
(gdb) break *0x7c00
(gdb) continue

The layout split command loads the text-mode user interface and splits the screen into a register window, disassembly window and command window.

With layout src we disregard the disassembly window in favor of a source listing window. Both are in assembly language in our case, but the source listing contains comments as well.

The layout src command also got rid of the register window, which we’ll get back using layout regs. I’m not sure if there’s an easier way.

The result looks like this:

The source window will highlight the next line of code that will be executed. On the left, the B+ marker indicates an enabled breakpoint, which will become helpful with multiple breakpoints. Whenever a register value changes, the register and its new value will be highlighted.

The up and down arrow keys scroll the source window.

Use C-x o to switch between the windows.

If you’re familiar with Emacs, you’ll recognize the keyboard shortcut. But as an Emacs user, you might prefer the GDB Emacs user interface:

The GDB Emacs user interface (M-x gdb)

This is M-x gdb with gdb-many-windows enabled:

Debugging the failing loop

Let’s take a look at the loop that we know the bootloader is entering, but not leaving (neither read_protected_mode_kernel_2 nor run_kernel is ever called):

read_protected_mode_kernel:
    mov  edx, [es:0x1f4]              ; edx stores the number of bytes to load
    shl  edx, 4

.loop:
    cmp  edx, 0
    je   run_kernel

    cmp  edx, 0xfe00                  ; less than 127*512 bytes remaining?
    jb   read_protected_mode_kernel_2

    mov  eax, 0x7f                    ; load 127 sectors (maximum)
    xor  bx, bx                       ; no offset
    mov  cx, 0x2000                   ; load temporary to 0x20000
    mov  esi, current_lba
    call read_from_hdd

    mov  cx, 0x7f00                   ; move 65024 bytes (127*512 byte)
    call do_move

    sub  edx, 0xfe00                  ; update the number of bytes to load
    add  word [gdt.dest], 0xfe00
    adc  byte [gdt.dest+2], 0
    jmp  short read_protected_mode_kernel.loop

The comments explain that the code loads chunks of FE00h == 65024 (127*512) bytes at a time.

Loading means calling read_from_hdd, then do_move. Let’s take a look at do_move:

do_move:
    push edx
    push es
    xor  ax, ax
    mov  es, ax
    mov  ah, 0x87
    mov  si, gdt
    int  0x15     ; line 182
    jc   error
    pop  es
    pop  edx
    ret

int 0x15 is a call to the BIOS Service Interrupt, which will dispatch the call based on AH == 87H to the Move Memory Block (techhelpmanual.com) function.

This function moves the specified amount of memory (65024 bytes in our case) from source/destination addresses specified in a Global Descriptor Table (GDT) record.

We can use GDB to show the addresses of each of do_move’s memory move calls by telling it to stop at line 182 (the int 0x15 instruction) and print the GDT record’s destination descriptor:

(gdb) break 182
Breakpoint 2 at 0x7d49: file bootloader.asm, line 182.

(gdb) command 2
Type commands for breakpoint(s) 2, one per line.
End with a line saying just "end".
>x/8bx gdt+24
>end

(gdb) continue
Continuing.

Breakpoint 1, ?? () at bootloader.asm:48
48		cli

(gdb)
Continuing.

Breakpoint 2, do_move () at bootloader.asm:182
182		int	0x15
0x7d85:	0xff	0xff	0x00	0x00	0x10	0x93	0x00	0x00

(gdb)
Continuing.

Breakpoint 2, do_move () at bootloader.asm:182
182		int	0x15
0x7d85:	0xff	0xff	0x00	0xfe	0x10	0x93	0x00	0x00

(gdb)

The destination address is stored in bytes 2..4. Remember to read these little endian entries “back to front”.

Address #1 is 0x100000.
Address #2 is 0x10fe00.
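To make the “back to front” reading concrete, here is a minimal Go sketch (not part of the bootloader) that decodes the two descriptor dumps from the transcript above. The helper name base24 is made up for this illustration, and it only assumes what the text states: the address lives in bytes 2..4 in little endian order.

```go
package main

import "fmt"

// base24 extracts the 24-bit address from an 8-byte descriptor record.
// Bytes 2..4 hold the address, least significant byte first.
func base24(desc [8]byte) uint32 {
	return uint32(desc[2]) | uint32(desc[3])<<8 | uint32(desc[4])<<16
}

func main() {
	// The two descriptor dumps printed by GDB above.
	first := [8]byte{0xff, 0xff, 0x00, 0x00, 0x10, 0x93, 0x00, 0x00}
	second := [8]byte{0xff, 0xff, 0x00, 0xfe, 0x10, 0x93, 0x00, 0x00}

	fmt.Printf("0x%06x\n", base24(first))  // prints 0x100000
	fmt.Printf("0x%06x\n", base24(second)) // prints 0x10fe00
}
```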

If we press Return long enough, we eventually end up here:

Breakpoint 2, do_move () at bootloader.asm:182
182		int	0x15
0x7d85:	0xff	0xff	0x00	0x1e	0xff	0x93	0x00	0x00
(gdb)
Continuing.

Breakpoint 2, do_move () at bootloader.asm:182
182		int	0x15
0x7d85:	0xff	0xff	0x00	0x1c	0x00	0x93	0x00	0x00

(gdb)
Continuing.

Breakpoint 1, ?? () at bootloader.asm:48
48		cli
(gdb)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x000079b0 in ?? ()
(gdb)

Now that execution left the bootloader, let’s take a look at the last do_move call parameters: We notice that the destination address overflowed its 24 bit data type:

Address #y is 0xff1e00
Address #z is 0x001c00
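The wrap-around can be replayed outside of GDB. The following Go sketch (an illustration, not bootloader code) starts at 0x100000, adds 0xfe00 per move and keeps only 24 bits, as the two addresses above suggest. The names firstWrap, chunkSize, startAddr and mask24 are made up for this example.

```go
package main

import "fmt"

const (
	chunkSize = 0xfe00   // 127 sectors * 512 bytes, as in the loop above
	startAddr = 0x100000 // first destination address seen in GDB
	mask24    = 0xffffff // only 24 address bits are kept
)

// firstWrap returns how many moves use a non-wrapped destination, the last
// such destination, and the wrapped destination that follows it.
func firstWrap() (moves int, last, wrapped uint32) {
	dest := uint32(startAddr)
	for {
		next := (dest + chunkSize) & mask24
		moves++
		if next < dest {
			return moves, dest, next
		}
		dest = next
	}
}

func main() {
	moves, last, wrapped := firstWrap()
	// prints: after 242 moves: 0xff1e00 -> 0x001c00
	fmt.Printf("after %d moves: 0x%06x -> 0x%06x\n", moves, last, wrapped)
}
```

Under these assumptions the wrap shows up after 242 moves, that is once about 15 MiB have been copied, which matches the two addresses above.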

Root cause

At this point I reached out to Sebastian again to ask him if there was an (undocumented) fundamental architectural limit to his Minimal Linux Bootloader — with 24 bit addresses, you can address at most 16 MB of memory.

He replied explaining that he didn’t know of this limit either! He then linked to Move Memory Block (techhelpmanual.com) as proof for the 24 bit limit.

Speculation

So, is it impossible to load larger kernels into memory from Real Mode? I’m not sure.

The current bootloader code prepares a GDT in which addresses are 24 bits long at most. But note that the techhelpmanual.com documentation that Sebastian referenced is apparently for the Intel 286 (a 16 bit CPU), and some of the GDT bytes are declared reserved.

Today’s CPUs are Intel 386-compatible (a 32 bit CPU), which seems to use one of the formerly reserved bytes to represent bit 24..31 of the address, meaning we might be able to pass 32 bit addresses to BIOS functions in a GDT after all!

I wasn’t able to find clear authoritative documentation on the Move Memory Block API on 386+, or whether BIOS functions in general are just expected to work with 32 bit addresses.

But Microsoft’s 1989 HIMEM.SYS source contains a struct that documents this 32-bit descriptor usage. A more modern reference is this Operating Systems Class from FAU 2023 (page 71/72).

Hence I’m thinking that most BIOS implementations should actually support 32 bit addresses for their Move Memory Block implementation — provided you fill the descriptor accordingly.

If that doesn’t work out, there’s also “Unreal Mode”, which allows using up to 4 GB in Real Mode, but is a change that is a lot more complicated. See also Julio Merino’s “Beyond the 1 MB barrier in DOS” post to get an idea of the amount of code needed.

Update: a fix!

Lobsters reader abbeyj pointed out that the following code change should fix the truncation and result in a GDT with all address bits in the right place:

--- i/mbr/bootloader.asm
+++ w/mbr/bootloader.asm
@@ -119,6 +119,7 @@ read_protected_mode_kernel:
 	sub	edx, 0xfe00			; update the number of bytes to load
 	add	word [gdt.dest], 0xfe00
 	adc	byte [gdt.dest+2], 0
+	adc	byte [gdt.dest+5], 0
 	jmp	short read_protected_mode_kernel.loop

 read_protected_mode_kernel_2:

…and indeed, in my first test this seems to fix the problem! It’ll take me a little while to clean this up and submit it. You can follow gokrazy issue #248 if you’re interested.
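To see what the extra adc changes, here is a small Go simulation of the add/adc chain on the descriptor bytes. It is a sketch under one assumption: gdt.dest points at byte 2 of the descriptor, so gdt.dest+2 is byte 4 (address bits 16..23) and gdt.dest+5 is byte 7 (address bits 24..31 on a 386). The function names advance and base32 are hypothetical.

```go
package main

import "fmt"

// advance adds 0xfe00 to the address stored in an 8-byte descriptor,
// mimicking the add/adc chain. carryIntoHighByte models the added
// "adc byte [gdt.dest+5], 0" instruction.
func advance(desc *[8]byte, carryIntoHighByte bool) {
	low := uint32(desc[2]) | uint32(desc[3])<<8
	low += 0xfe00
	desc[2], desc[3] = byte(low), byte(low>>8)
	carry := low >> 16 // carry out of the 16-bit add

	mid := uint32(desc[4]) + carry
	desc[4] = byte(mid)
	carry = mid >> 8 // carry out of the first adc

	if carryIntoHighByte {
		desc[7] += byte(carry)
	}
}

// base32 reads bits 0..23 from bytes 2..4 and bits 24..31 from byte 7.
func base32(desc [8]byte) uint32 {
	return uint32(desc[2]) | uint32(desc[3])<<8 |
		uint32(desc[4])<<16 | uint32(desc[7])<<24
}

func main() {
	for _, fixed := range []bool{false, true} {
		// Last good descriptor from the GDB transcript (0xff1e00).
		desc := [8]byte{0xff, 0xff, 0x00, 0x1e, 0xff, 0x93, 0x00, 0x00}
		advance(&desc, fixed)
		fmt.Printf("fixed=%v base=0x%08x\n", fixed, base32(desc))
	}
	// prints:
	// fixed=false base=0x00001c00
	// fixed=true base=0x01001c00
}
```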

Bonus: reading BIOS source

There are actually a couple of BIOS implementations that we can look into to get a better understanding of how Move Memory Block works.

We can look at DOSBox, an open source DOS emulator. Its Move Memory Block implementation does seem to support 32 bit addresses:

PhysPt dest	= (mem_readd(data+0x1A) & 0x00FFFFFF) +
              (mem_readb(data+0x1E)<<24);

Another implementation is SeaBIOS. Contrary to DOSBox, SeaBIOS is not just used in emulation: The PC Engines apu uses coreboot with SeaBIOS. QEMU also uses SeaBIOS.

The SeaBIOS handle_1587 source code is a little harder to follow, because it requires knowledge of Real Mode assembly. The way I read it, SeaBIOS doesn’t truncate or otherwise modify the descriptors and just passes them to the CPU. On 386 or newer, 32 bit addresses should work.

Mitigation

While it’s great to understand the limitation we’re running into, I wanted to unblock the pull request as quickly as possible, so I needed a quick mitigation instead of investigating if my speculation can be developed into a proper fix.

When I started router7, we didn’t support loadable kernel modules, so everything had to be compiled into the kernel. We now do support loadable kernel modules, so I could have moved functionality into modules.

Instead, I found an even easier quick fix: switching from gzip to zstd compression. This saved about 1.8 MB and will buy us some time to implement a proper fix while unblocking automated new Linux kernel version merges.

Conclusion

I wanted to share this debugging story because it shows a couple of interesting lessons:

Being able to run older versions of various parts of your software stack is a very valuable debugging tool. It helped us isolate a trigger for the bug (using an older GCC) and it helped us set up a debugging environment (using an older QEMU).
Setting up a debugger can be annoying (symbol files, learning the UI) but it’s so worth it.
Be on the lookout for wrong turns during debugging. Write down every conclusion and challenge it.
The BIOS can seem mysterious and “too low level” but there are many blog posts, lectures and tutorials. You can also just read open-source BIOS code to understand it much better.

Enjoy poking at your BIOS!

Appendix: Resources

I found the following resources helpful:

When a service fails to start up enough times in a row, systemd gives up on it.

On servers, this isn’t what I want — in general it’s helpful for automated recovery if daemons are restarted indefinitely. As long as you don’t have circular dependencies between services, all your services will eventually come up after transient failures, without having to specify dependencies.

This is particularly useful because specifying dependencies on the systemd level introduces footguns: when interactively stopping individual services, systemd also stops the dependents. And then you need to remember to restart the dependent services later, which is easy to forget.

Enabling indefinite restarts for a service

To make systemd restart a service indefinitely, I first like to create a drop-in config file like so:

cat > /etc/systemd/system/restart-drop-in.conf <<'EOT'
[Unit]
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=1s
EOT

Then, I can enable the restart behavior for individual services like prometheus-node-exporter, without having to modify their .service files (which needs manual effort when updating):

cd /etc/systemd/system
mkdir prometheus-node-exporter.service.d
cd prometheus-node-exporter.service.d
ln -s ../restart-drop-in.conf
systemctl daemon-reload

Changing the defaults for all services

If most of your services set Restart=always or Restart=on-failure, you can change the system-wide defaults for RestartSec and StartLimitIntervalSec like so:

mkdir /etc/systemd/system.conf.d
cat > /etc/systemd/system.conf.d/restartdefaults.conf <<'EOT'
[Manager]
DefaultRestartSec=1s
DefaultStartLimitIntervalSec=0
EOT
systemctl daemon-reload

What do the default settings do?

So why do we need to change these settings to begin with?

The default systemd settings (as of systemd 255) are:

DefaultRestartSec=100ms
DefaultStartLimitIntervalSec=10s
DefaultStartLimitBurst=5

This means that services which specify Restart=always are restarted 100ms after they crash, and if the service crashes more than 5 times in 10 seconds, systemd does not attempt to restart the service anymore.

It’s easy to see that for a service which takes, say, 100ms to crash, for example because it can’t bind on its listening IP address, this means:

time	event
T+0	first start
T+100ms	first crash
T+200ms	second start
T+300ms	second crash
T+400ms	third start
T+500ms	third crash
T+600ms	fourth start
T+700ms	fourth crash
T+800ms	fifth start
T+900ms	fifth crash within 10s
T+1s	systemd gives up

Why does systemd give up by default?

I’m not sure. If I had to speculate, I would guess the developers wanted to prevent laptops running out of battery too quickly because one CPU core is permanently busy just restarting some service that’s crashing in a tight loop.

That same goal could be achieved with a more relaxed DefaultRestartSec= value, though: With DefaultRestartSec=5s, for example, we would sufficiently space out these crashes over time.

There is some recent discussion upstream regarding changing the default. Let’s see where the discussion goes.

[2024-01-13: I added a section with an option I forgot to put into my talk and thus elided from the initial post as well.]

I gave a talk at GopherConAU 2023 about a particular problem we encountered when designing generics for Go and what we might do about it.

This blog post is meant as a supplement to that talk. It mostly reproduces its content, while giving some supplementary information and more detailed explanations where necessary.

So if you prefer to ingest your information from text, then this blog post should serve you well. If you prefer a talk, you can watch the recording and use it to get some additional details in the relevant sections.

The talk (and hence this post) is also a follow-up to a previous blog post of mine. But I believe the particular explanation I give here should be a bit more approachable and is also more general. If you have read that post and are just interested in the differences, feel free to skip to the Type Parameter Problem.

With all that out of the way, let us get into it.

The Problem

If you are using Go generics, you are probably aware that it’s possible to constrain type parameters. This makes sure that a type argument has all the operations that your generic function expects available to it.

One particular way to constrain a type parameter is using union elements, which allow you to say that a type has to be from some list of types. The most common use of this is to allow you to use Go’s operators on a generic parameter:

// Allows any type argument that has underlying type int, uint or string.
type Ordered interface {
    ~int | ~uint | ~string
}

func Max[T Ordered](a, b T) T {
    // As all int, uint and string types support the > operator, our generic
    // function can use it:
    if a > b {
        return a
    }
    return b
}

Another case where this would be very useful is calling a method as a fallback:

type Stringish interface {
    fmt.Stringer | ~string
}

func Stringify[T Stringish](v T) string {
    if s, ok := any(v).(fmt.Stringer); ok {
        return s.String()
    }
    return reflect.ValueOf(v).String()
}

However, if we try this, the compiler will complain:

cannot use fmt.Stringer in union (fmt.Stringer contains methods)

And if we check the spec, we find a specific exception for this:

Implementation restriction: A union (with more than one term) cannot contain the predeclared identifier comparable or interfaces that specify methods, or embed comparable or interfaces that specify methods.

To explain why this restriction is in place, we will dive into a bit of theory.

Some Theory

You have probably heard about the P versus NP problem. It concerns two particular classes of computational problems:

P is the class of problems that can be solved efficiently¹. An example of this is multiplication of integers: If I give you two integers, you can write an algorithm that quickly multiplies them.
NP is the class of problems that can be verified efficiently: If you have a candidate for a solution, you can write an efficient algorithm that verifies it. An example is factorization: If I give you an integer $N$ and a prime $p$, you can efficiently check whether or not it is a factor of $N$. You just divide $N$ by $p$ and check whether there is any remainder.

Every problem in P is also in NP: If you can efficiently solve a problem, you can also easily verify a solution, by just doing it yourself and comparing the answers.

However, the opposite is not necessarily true. For example, if I give you an integer $N$ and tell you to give me a non-trivial factor of it, the best you could probably do is try out all possible candidates until you find one. This is exponential in the size of the input (an integer with $k$ digits has on the order of $10^k$ candidate factors).
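The asymmetry between verifying and finding can be sketched in Go. This is only an illustration of the naive approach described above, not a statement about the best known factoring algorithms; the function names isFactor and findFactor are made up.

```go
package main

import "fmt"

// isFactor verifies a candidate with a single division.
func isFactor(n, p uint64) bool {
	return p != 0 && n%p == 0
}

// findFactor searches for a non-trivial factor by trying candidates one by
// one. It returns 0 if n has none. The loop condition c <= n/c stops at the
// square root of n without risking overflow.
func findFactor(n uint64) uint64 {
	for c := uint64(2); c <= n/c; c++ {
		if isFactor(n, c) {
			return c
		}
	}
	return 0
}

func main() {
	fmt.Println(isFactor(8051, 83)) // true: one division is enough to verify
	fmt.Println(findFactor(8051))   // 83, found only after trying 2, 3, 4, ...
	fmt.Println(findFactor(97))     // 0, because 97 is prime
}
```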

We generally assume that there are in fact problems which are in NP but not in P - but we have not actually proven so. Doing that is the P versus NP problem.

While we have not proven that there are such “hard” problems, we did prove that there are some problems which are “at least as hard as any other problem in NP”. This means that if you can solve them efficiently, you can solve any problem in NP efficiently. These are called “NP-hard” or “NP-complete”².

One such problem is the Boolean Satisfiability Problem. It asks you to take in a boolean formula - a composition of some boolean variables, connected with “and”, “or” and “not” operators - and determine an assignment to the variables that makes the formula true.

So, for example, I could ask you to find me a satisfying assignment for this function:

func F(x, y, z bool) bool {
    return (!x || z) && (y || z) && (x || !z)
}

For example, F(true, true, false) is false, so it is not a satisfying assignment. But F(false, true, false) is true, so that is a satisfying assignment.

It is easy to verify whether any given assignment satisfies your formula - you just substitute all the variables and evaluate it. But to find one, you probably have to try out all possible inputs. And for $n$ variables, you have $2^n$ different options, so this takes exponential time.
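For the three-variable example, “trying out all possible inputs” is still cheap. A minimal brute-force sketch (the helper name satisfying is made up) enumerates all $2^3$ assignments:

```go
package main

import "fmt"

// F is the example formula from above.
func F(x, y, z bool) bool {
	return (!x || z) && (y || z) && (x || !z)
}

// satisfying enumerates all 2^3 assignments and collects those that make F
// true, as {x, y, z} triples.
func satisfying() [][3]bool {
	var out [][3]bool
	for bits := 0; bits < 1<<3; bits++ {
		x, y, z := bits&4 != 0, bits&2 != 0, bits&1 != 0
		if F(x, y, z) {
			out = append(out, [3]bool{x, y, z})
		}
	}
	return out
}

func main() {
	// prints: [[false true false] [true false true] [true true true]]
	fmt.Println(satisfying())
}
```

Each additional variable doubles the number of loop iterations, which is where the exponential running time comes from.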

In practice, this means that if you can show that solving a particular problem would allow you to solve SAT, your problem is itself NP-hard: It would be at least as hard as solving SAT, which is at least as hard as solving any other NP problem. And as we assume that NP≠P, this means your problem can probably not be solved efficiently.

The last thing we need to mention is co-NP, the class of complements of problems in NP. The complement of a (decision) problem is simply the same problem, with the answer inverted: You have to answer “yes” instead of “no” and vice versa. And where with NP, a “yes” answer should have an efficiently verifiable proof, with co-NP, a “no” answer should have an efficiently verifiable proof.

Notably, the actual difficulty of solving the problem does not change. To decide between “yes” and “no” is just as hard, you just turn around the answer. So, in a way, this is a technicality.

A co-NP complete problem is simply a problem that is the complement of an NP complete problem and as you would expect, it is just as hard and it is at least as hard as any other problem in co-NP.

Now, with the theory out of the way, let’s look at Go again.

The Type Parameter Problem

When building a Go program, the compiler has to solve a couple of computational problems as well. For example, it has to be able to answer “does a given type argument satisfy a given constraint”. This happens if you instantiate a generic function with a concrete type:

func F[T C]() {} // where C is some constraint
func G() {
    F[int]() // Allowed if and only if int satisfies C.
}

This problem is in P: The compiler can just evaluate the constraint as if it was a logical formula, with | being an “or” operator, multiple lines being an “and” operator and checking if the type argument has the right methods or underlying types on the way.

Another problem it has to be able to solve is whether a given constraint C1 implies another constraint C2: Does every type satisfying C1 also satisfy C2? This comes up if you instantiate a generic function with a type parameter:

func F[T C1]() {
    G[T]() // Allowed if and only if C1 implies C2
}
func G[T C2]() {}

My claim now is that this problem (which I will call the “Type Parameter Problem” for the purposes of this post) is co-NP complete³.

To prove this claim, we reduce SAT to the (complement of the) Type Parameter Problem. We show that if we had a Go compiler which solves this problem, we can use it to solve the SAT problem as well. And we do that by translating an arbitrary boolean formula into a Go program and then checking whether it compiles.

On a technical note, we are going to assume that the formula is in Conjunctive Normal Form (CNF): A list of terms connected with “and” operators, where each term is a list of (possibly negated) variables connected with “or” operators. The example I used above is in CNF and we use it as an example to demonstrate the translation:

func F(x, y, z bool) bool {
    return (!x || z) && (y || z) && (x || !z)
}

This assumption may seem like a cheat, but importantly, SAT is still NP-complete with it.

The first step in our reduction is to model our boolean variables. Every variable can be either true or false and it can appear negated or not negated. We encode that by defining two interfaces per variable³:

type X interface { X() }     // X is assigned "true"
type NotX interface{ NotX()} // X is assigned "false"

This allows us to translate our formula directly, using union elements for “or” and interface-embedding for “and”:

// Represents (!x || z) && (y || z) && (x || !z)
type Formula interface {
    NotX | Z
    Y | Z
    X | NotZ
}

There are, however, two issues with this:

A type could have neither of X() and NotX().
A type could have both of X() and NotX().

This breaks our representation, because a boolean variable always has to be exactly true or false - it can’t be neither and it can’t be both.

To address the first point, we define another interface:

type AtLeastOne interface {
    X | NotX
    Y | NotY
    Z | NotZ
}

Any type satisfying AtLeastOne has to assign at least one of true and false to each variable.

Similarly, we define an interface to address the second problem:

type Both_X interface { X; NotX }
type Both_Y interface { Y; NotY }
type Both_Z interface { Z; NotZ }
type Both interface {
    Both_X | Both_Y | Both_Z
}

Any type satisfying Both now assigns both true and false to at least one variable.

To represent a valid, satisfying assignment, a type thus has to

satisfy Formula
satisfy AtLeastOne
not satisfy Both

Now, we ask our compiler to type-check this Go program⁴:

func G[T Both]() {}
func F[T interface{ Formula; AtLeastOne }]() {
    G[T]() // Allowed if and only if (Formula && AtLeastOne) => Both
}

This program should compile, if and only if any type satisfying Formula and AtLeastOne also satisfies Both. Because we are looking at the complement of SAT, we invert this, to get our final answer:

    !( (Formula && AtLeastOne) =>  Both )
<=> !(!(Formula && AtLeastOne) ||  Both ) // "A => B" is equivalent to "!A || B"
<=> !(!(Formula && AtLeastOne  && !Both)) // De Morgan's law
<=>     Formula && AtLeastOne  && !Both   // Double negation

This finishes our reduction: The compiler should reject the program, if and only if the formula has a satisfying assignment. The Type Parameter Problem is at least as hard as the complement of SAT.

Going forward

So the restriction on methods in union elements is in place because we are concerned that type checking Go would become a very hard problem if we allowed them. But that is, of course, a deeply dissatisfying situation.

Our Stringish example would clearly be a very useful constraint - so useful, in fact, that it was used as an example in the original design doc. More generally, this restriction prevents us from having a good way to express operator constraints for generic functions and types. We currently end up writing multiple versions of the same functions, one that uses operators and one that takes functions to do the operations. This leads to boilerplate and extra API surface⁵.

The slices package contains a bunch of examples like that (look for the Func suffix to the name):

// Uses the == operator. Useful for predeclared types (int, string,…) and
// structs/arrays of those.
func Contains[S ~[]E, E comparable](s S, v E) bool
// Uses f. Needed for slices, maps, comparing by pointer-value or other notions
// of equality.
func ContainsFunc[S ~[]E, E any](s S, f func(E) bool) bool
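A short usage sketch shows why both variants exist. It needs Go 1.21 or later for the slices package; the wrapper names containsExact and containsFold are made up for this example.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// containsExact relies on the == operator via slices.Contains.
func containsExact(names []string, name string) bool {
	return slices.Contains(names, name)
}

// containsFold needs a custom notion of equality, so it has to use the
// Func variant and pass the comparison as a function.
func containsFold(names []string, name string) bool {
	return slices.ContainsFunc(names, func(s string) bool {
		return strings.EqualFold(s, name)
	})
}

func main() {
	names := []string{"Alice", "Bob"}
	fmt.Println(containsExact(names, "bob")) // false: == is case-sensitive
	fmt.Println(containsFold(names, "bob"))  // true
}
```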

So we should consider compromises, allowing us to get some of the power of removing this restriction at least.

Option 1: Ignore the problem

This might be a surprising option to consider after spending all these words on demonstrating that this problem is hard to solve, but we can at least consider it: We simply say that a Go compiler has to include some form of (possibly limited) SAT solver and is allowed to just give up after some time, if it can not find a proof that a program is safe.

C++ concepts do this. A C++ compiler has to determine if one constraint implies another one, when it has to decide which of multiple overloaded generic functions to invoke. And it does so using a simple SAT solver. In particular, if it wants to prove $P ⇒ Q$, it first converts $P$ into Disjunctive Normal Form (DNF) and then converts $Q$ into Conjunctive Normal Form (CNF).

With $P$ in DNF and $Q$ in CNF, $P ⇒ Q$ is easy to prove (and disprove). But this normalization into DNF or CNF itself requires exponential time in general. And you can indeed create C++ programs that crash C++ compilers.
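Why the check is easy once both sides are normalized can be sketched in Go. This is a generic illustration with made-up names (literal, implies and so on), not how any C++ compiler implements it: $P ⇒ Q$ holds if and only if every term of $P$ implies every clause of $Q$, and a single term implies a single clause exactly when the term is contradictory, the clause is a tautology, or they share a literal.

```go
package main

import "fmt"

// A literal is a variable name, optionally negated.
type literal struct {
	name string
	neg  bool
}

// contradictory reports whether lits contains both v and !v.
func contradictory(lits []literal) bool {
	seen := map[literal]bool{}
	for _, l := range lits {
		if seen[literal{l.name, !l.neg}] {
			return true
		}
		seen[l] = true
	}
	return false
}

// shares reports whether a and b have a literal in common.
func shares(a, b []literal) bool {
	for _, x := range a {
		for _, y := range b {
			if x == y {
				return true
			}
		}
	}
	return false
}

// implies reports whether p (DNF: OR of AND-terms) implies q (CNF: AND of
// OR-clauses) by comparing every term with every clause.
func implies(p, q [][]literal) bool {
	for _, term := range p {
		for _, clause := range q {
			if !contradictory(term) && !contradictory(clause) && !shares(term, clause) {
				return false
			}
		}
	}
	return true
}

func main() {
	a, b, c := literal{"a", false}, literal{"b", false}, literal{"c", false}
	p := [][]literal{{a, b}, {a, c}}                // (a && b) || (a && c)
	fmt.Println(implies(p, [][]literal{{a}, {b, c}})) // true:  a && (b || c)
	fmt.Println(implies(p, [][]literal{{b}}))         // false: a=c=true, b=false
}
```

The pairwise comparison is polynomial in the size of the normalized formulas; the expensive part is producing those normal forms.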

Personally, I find all versions of this option very dissatisfying:

Leaving the heuristic up to the implementation feels like too much wiggle-room for what makes a valid Go program.
Describing an explicit heuristic in the spec takes up a lot of the complexity budget of the spec.
Allowing the compiler to try and give up after some time feels antithetical to the pride Go takes in fast compilation.

Option 2: Limit the expressiveness of interfaces

For the interfaces as they exist today, we actually can solve the SAT problem: Any interface can ultimately be represented in the form (with some elements perhaps being empty):

interface {
    A | … | C | ~X | … | ~Z // for some concrete types
    comparable
    M1(…) (…)
    // …
    Mn(…) (…)
}

And it is straight-forward to use this representation to do the kind of inference we need.

This tells us that there are some restrictions we can put on the kinds of interfaces we can write down, while still not running into the kinds of problems discussed in this post. That’s because every such kind of interface gives us a restricted sub problem of SAT, which only looks at formulas conforming to some extra restrictions.

One example of such a sub problem we actually used above, where we assumed that our formula is in Conjunctive Normal Form. Another important such sub problem is the one where the formulas are in Disjunctive Normal Form instead: Where we have a list of terms linked with “or” operators and each term is a list of (possibly negated) variables linked with “and” operators. For DNF, the SAT problem is efficiently solvable.
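A minimal sketch of why DNF is the easy case, with made-up names (literal, satisfiableDNF, consistent): a DNF formula is satisfiable as soon as one of its terms does not contain a variable together with its negation, so a single pass over the terms is enough.

```go
package main

import "fmt"

// A literal is a variable name, optionally negated.
type literal struct {
	name string
	neg  bool
}

// consistent reports whether a term (AND of literals) never requires a
// variable to be both true and false.
func consistent(term []literal) bool {
	want := map[string]bool{} // variable -> value the term requires
	for _, l := range term {
		if v, ok := want[l.name]; ok && v == l.neg {
			return false
		}
		want[l.name] = !l.neg
	}
	return true
}

// satisfiableDNF reports whether an OR of AND-terms has a satisfying
// assignment: one consistent term suffices.
func satisfiableDNF(terms [][]literal) bool {
	for _, term := range terms {
		if consistent(term) {
			return true
		}
	}
	return false
}

func main() {
	x, notX := literal{"x", false}, literal{"x", true}
	y, notZ := literal{"y", false}, literal{"z", true}
	fmt.Println(satisfiableDNF([][]literal{{x, notX}}))            // false
	fmt.Println(satisfiableDNF([][]literal{{x, notX}, {y, notZ}})) // true
}
```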

We could take advantage of that by allowing union elements to contain methods - but only if

There is exactly one union in the top-level interface.
The interfaces embedded in that union are “easy” interfaces, i.e. ones we allow today.

So, for example

type Stringish interface {
    // Allowed: fmt.Stringer and ~string are both allowed today
    fmt.Stringer | ~string
}
type A interface {
    // Not Allowed: Stringish is not allowed today, so we have more than one level
    Stringish | ~int
}
type B interface {
    // Allowed: Same as A, but we "flattened" it, so each element is an
    // "easy" interface.
    fmt.Stringer | ~string | ~int
}
type C interface {
    // Not Allowed: Can only have a single union (or must be an "easy" interface)
    fmt.Stringer | ~string
    comparable
}

This restriction makes our interfaces be in DNF, in a sense. It’s just that every “variable” of our DNF is itself an “easy” interface. If we need to solve SAT for one of these, we first solve it on the SAT formula to determine which “easy” interfaces need to be satisfied and then use our current algorithms to check which of those can be satisfied.

Of course, this restriction is somewhat hard to explain. But it would allow us to write at least some of the useful programs we want to use this feature for. And we might find another set of restrictions that are easier to explain but still allow that.

We should probably try to collect some useful programs that we would want to write with this feature and then see, for some restricted interface languages, whether they allow us to write them.

Option 3: Make the type-checker conservative

For our reduction, we assumed that the compiler should allow the program if and only if it can prove that every type satisfying C1 also satisfies C2.

We could allow it to reject some programs that would be valid, though. We could describe an algorithm for determining if C1 implies C2 that can have false negatives: Rejecting a theoretically safe program, just because it cannot prove that it is safe with that algorithm, requiring you to re-write your program into something it can handle more easily.

Ultimately, this is kind of what a type system does: It gives you a somewhat limited language to write a proof to the compiler that your program is “safe”, in the sense that it satisfies certain invariants. And if you accidentally pass a variable of the wrong type - even if your program would still be perfectly valid - you might have to add a conversion or call some function that verifies its invariants, before being allowed to do so.

For this route, we still have to decide which false negatives we are willing to accept though: What is the algorithm the compiler should use?

For some cases, this is trivial. For example, this should obviously compile:

func StringifyAll[T Stringish](vals ...T) []string {
    out := make([]string, len(vals))
    for i, v := range vals {
        // Stringify as above. Should be allowed, as T uses the same constraint
        // as Stringify.
        out[i] = Stringify(v)
    }
    return out
}

But other cases are not as straightforward and require some elaboration:

func Marshal[T Stringish | ~bool | constraints.Integer](v T) string { /* … */ }

// Stringish appears in the union of the target constraint.
func F[T Stringish](v T) string { return Marshal[T](v) }

// string has underlying type string and fmt.Stringer is in the Stringish union.
func G[T string|fmt.Stringer](v T) string { return Marshal[T](v) }

// The method name is just a different representation of fmt.Stringer
func H[T interface{ String() string }](v T) string { return Marshal[T](v) }

These examples are still simple, but they are useful, so should probably be allowed. But they already show that there is somewhat complex inference needed: Some terms on the left might satisfy some terms on the right, but we can not simply compare them as a subset relation, we actually have to take into account the different cases.
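
To make the idea of a conservative check concrete, here is a minimal sketch in Go. It is not an algorithm the compiler uses or is proposed to use: the Term and Constraint types, the covers and Implies functions and the encoding of method signatures as strings are all made up for illustration, and constraints.Integer is simplified to ~int. The sketch accepts "C1 implies C2" only if every term of C1 is syntactically covered by a single term of C2. That takes polynomial time, at the cost of false negatives:

```go
package main

import "fmt"

// Term is one alternative of a constraint union: either a type term such as
// ~string (Type set) or a method set such as interface{ String() string }
// (Type empty). This representation is hypothetical.
type Term struct {
	Tilde   bool     // true for ~T, false for exactly T
	Type    string   // type name for type terms, "" for method terms
	Methods []string // method signatures for method terms
}

// Constraint is a union of terms.
type Constraint []Term

// covers reports whether every type matching l provably matches r.
// It only compares terms syntactically, so it may return false for pairs
// that are in fact safe, but it never returns true for unsafe ones.
func covers(l, r Term) bool {
	if r.Type != "" {
		// r is a type term: l must name the same type and may only be an
		// approximation (~T) if r is one as well.
		return l.Type == r.Type && (r.Tilde || !l.Tilde)
	}
	if l.Type != "" {
		// Conservative: we never look up the methods of a named type.
		return false
	}
	have := make(map[string]bool, len(l.Methods))
	for _, m := range l.Methods {
		have[m] = true
	}
	for _, m := range r.Methods {
		if !have[m] {
			return false
		}
	}
	return true
}

// Implies reports whether c1 provably implies c2. Every term of c1 must be
// covered by one term of c2, so it needs len(c1) * len(c2) term comparisons.
func Implies(c1, c2 Constraint) bool {
	for _, l := range c1 {
		covered := false
		for _, r := range c2 {
			if covers(l, r) {
				covered = true
				break
			}
		}
		if !covered {
			return false
		}
	}
	return true
}

func main() {
	stringer := Term{Methods: []string{"String() string"}}
	stringish := Constraint{stringer, {Tilde: true, Type: "string"}}
	// Stands in for Stringish | ~bool | constraints.Integer.
	marshal := Constraint{stringer, {Tilde: true, Type: "string"}, {Tilde: true, Type: "bool"}, {Tilde: true, Type: "int"}}

	fmt.Println("F:", Implies(stringish, marshal))
	fmt.Println("G:", Implies(Constraint{{Type: "string"}, stringer}, marshal))
	fmt.Println("H:", Implies(Constraint{stringer}, marshal))
	fmt.Println("any:", Implies(Constraint{{}}, marshal))
}
```

Under this rule F, G and H are accepted and a type parameter constrained by any is rejected. A constraint consisting of the single type time.Duration is also rejected against fmt.Stringer, even though that call would be safe, because the sketch never inspects the methods of a named type. That is exactly the kind of false negative we would have to decide whether to accept.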

And remember that converting to DNF or CNF takes exponential time, so the simple answer of “convert the left side into DNF and the right side into CNF, then check each term individually” does not solve our problem.

In practice, this option has a large intersection with the previous one: The algorithm would probably reject programs that use interfaces with too complex a structure on either side, to guarantee that it terminates quickly. But it would allow us, in principle, to use different restrictions for the left and the right hand side: Allow you to write any interface and only check the structure if you actually use them in a way that would make inference impossible.

We have to decide whether we would find that acceptable though, or whether it seems too confusing in practice. Describing the algorithm would also take quite a lot of space and complexity budget in the spec.

Option 4: Delay constraint checking until instantiation

One option I forgot to bring up in my talk is essentially the opposite of the previous one: We could have the compiler skip checking the constraints of generic function calls in generic functions altogether. So, for example, this code would be valid:

func G[T fmt.Stringer](v T) string {
    return v.String()
}

func F[T any](v T) string {
    // T constrained on any does not satisfy fmt.Stringer.
    // But we allow the call anyways, for now.
    return G(v)
}

To retain type-safety, we would instead check the constraints only when F is instantiated with a concrete type:

func main() {
    F(time.Second) // Valid: time.Duration implements fmt.Stringer
    F(42)          // Invalid: int does not implement fmt.Stringer
}

The upside is that this seems very easy to implement. It means we completely ignore any questions that require us to do inference on “sets of all types”. We only ever need to answer whether a specific type satisfies a specific constraint. Which we know we can do efficiently.
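
To illustrate why the per-type question is cheap, here is a small run-time analogue using reflect. It is only a sketch: the compiler would answer this at compile time, and the function SatisfiesStringish as well as the example types celsius and label are hypothetical. For one concrete type, checking the constraint fmt.Stringer | ~string boils down to one method-set lookup and one comparison of the underlying kind:

```go
package main

import (
	"fmt"
	"reflect"
)

// stringerType is the reflect.Type of the fmt.Stringer interface.
var stringerType = reflect.TypeOf((*fmt.Stringer)(nil)).Elem()

// SatisfiesStringish reports whether the dynamic type of v would satisfy the
// hypothetical constraint fmt.Stringer | ~string.
func SatisfiesStringish(v any) bool {
	t := reflect.TypeOf(v)
	if t == nil {
		// v is a nil interface value, so there is no type to check.
		return false
	}
	return t.Implements(stringerType) || t.Kind() == reflect.String
}

// celsius is a hypothetical type with a String method.
type celsius float64

func (c celsius) String() string { return fmt.Sprintf("%.1f°C", float64(c)) }

// label is a hypothetical type whose underlying type is string.
type label string

func main() {
	fmt.Println(SatisfiesStringish(celsius(21.5))) // has a String method
	fmt.Println(SatisfiesStringish(label("nas")))  // underlying type is string
	fmt.Println(SatisfiesStringish(42))            // neither
}
```

No reasoning about sets of types is involved here, which is what makes this option attractive from an implementation point of view.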

The downside is that this effectively introduces new constraints on the type-parameter of F implicitly. The signature says that F can be instantiated with any type, but it actually requires a fmt.Stringer.

One consequence of that is that it becomes harder to figure out what type arguments are allowed for a generic function or type. An instantiation might fail and the only way to understand why is to look into the code of the function you are calling. Potentially multiple layers of dependency deep.

Another consequence is that it means your program might break because of a seemingly innocuous change in a dependency. A library author might add a generic call to one of their functions. Because it only changes the implementation and not the API, they assume that this is a backwards compatible change. Their tests pass, because none of the types they use in their tests triggers this change in behavior. So they release a new minor version of their library. Then you upgrade (perhaps by way of upgrading another library that also depends on it) and your code no longer compiles, because you use a different type - conforming to the actual constraints from the signature, but not the implicit ones several layers of dependency down.

Because of this breakage in encapsulation, Go generics have so far eschewed this idea of delayed constraint checking. But it is possible that we could find a compromise here: Check the most common and easy to handle cases statically, while delaying some of the more complex and uncommon ones until instantiation. Where to draw that line would then be open for discussion.

Personally, just like with Option 1, I dislike this idea. But we should keep it in mind.

Future-proofing

Lastly, when we talk about this we should keep in mind possible future extensions to the generics design.

For example, there is a proposal by Rog Peppe to add a type-switch on type parameters. The proposal is to add a new type switch syntax for type parameters, where every case has a new constraint and in that branch, you could use the type parameter as if it was further constrained by that. So, for example, it would allow us to rewrite Stringify without reflect:

func Stringify[T Stringish](v T) string {
    switch type T {
    case fmt.Stringer:
        // T is constrained by Stringish *and* fmt.Stringer. So just fmt.Stringer
        // Calling String on a fmt.Stringer is allowed.
        return v.String()
    case ~string:
        // T is constrained by Stringish *and* ~string. So just ~string
        // Converting a ~string to string is allowed.
        return string(v)
    }
}

The crux here is, that this proposal allows us to create new, implicit interfaces out of old ones.

If we restrict the structure of our interfaces, these implicit interfaces might violate this structure. And if we make the type checker more conservative, a valid piece of code might no longer be valid if copied into a type parameter switch, if the implicit constraints would lead to a generic call the compiler can’t prove to be safe.

Of course it is impossible to know what extension we really want to add in the future. But we should at least consider some likely candidates during the discussion.

Summary

I hope I convinced you that

Simply allowing methods in unions would make type-checking Go code co-NP hard.
But we might be able to find some compromise that still allows us to do some of the things we want to use this for.
The devil is in the details and we still have to think hard and carefully about those.

“efficient”, in this context, means “in polynomial time in the size of the input”.

In general, as the input to an algorithm gets larger, the time it needs to run grows. We can look at how fast the running time grows with the size of the input. If that growth is at most polynomial, we consider the algorithm “efficient”, in this context.

In practice, even many polynomial growth functions are too slow for our taste. But we still make this qualitative distinction in complexity theory. ↩︎
The difference between these two terms is that “NP-hard” means “at least as difficult as any problem in NP”. While “NP-complete” means “NP-hard and also itself in NP”.

So an NP-hard problem might indeed be even harder than other problems in NP, while an NP-complete problem is not.

For us, the difference does not really matter. All problems we talk about are in NP. ↩︎
If you have read my previous post on the topic, you might notice a difference here. Previously, I defined NotX as interface{ X() int } and relied on this being mutually exclusive with X: You can’t have two methods with the same name but different signatures.

This is one reason I think this proof is nicer than my previous one. It does not require “magical” knowledge like that, instead only requiring you to be able to define interfaces with arbitrary method names. Which is extremely open. ↩︎
The other reason I like this proof better than my previous one is that it no longer relies on the abstract problem of “proving that a type set is empty”. While the principle of explosion is familiar to Mathematicians, it is hard to take its implications seriously if you are not.

Needing to type-check a generic function call is far more obvious as a problem that needs solving and it is easier to find understandable examples. ↩︎
And inefficiencies, as calling a method on a type parameter can often be devirtualized and/or inlined. A func value sometimes can’t. For example if it is stored in a field of a generic type, the compiler is usually unable to prove that it doesn’t change at runtime. ↩︎

For over 10 years now, I have been running two self-built NAS (Network Storage) devices which serve media (currently via Jellyfin) and run daily backups of all my PCs and servers.

In this article, I describe my goals, which hardware I picked for my new build (and why) and how I set it up.

Design Goals

I use my network storage devices primarily for archival (daily backups), and secondarily as a media server.

There are days when I don’t consume any media (TV series and movies) from my NAS, because I have my music collection mirrored to another server that’s running 24/7 anyway. In total, my NAS runs for a few hours in some evenings, and for about an hour (daily backups) in the mornings.

This usage pattern is distinctly different from, for example, running a NAS as a file server for collaborative video editing that needs to be available 24/7.

The goals of my NAS setup are:

Save power: each NAS build only runs when needed.
- They must support Wake-on-LAN or similar (ESP32 remote power button).
- Scheduling of backups is done separately, on a Raspberry Pi with gokrazy.
- Convenient power off (tied to our all-lights-out button) and power on (with webwake).
Use Off-the-shelf hardware and software.
- When hardware breaks, I can get replacements from the local PC store the same day.
- Even when only the data disk(s) survive, I should be able to access my data when booting a standard live Linux system.
- Minimal application software risk: I want to minimize risk for manual screw-ups or software bugs, meaning I use the venerable rsync for my backup needs (not Borg, restic, or similar).
- Minimal system software risk: I use reliable file systems with the minimal feature set — no LVM or btrfs snapshots, no ZFS replication, etc. To achieve redundancy, I don’t use a cluster file system with replication, instead I synchronize my two NAS builds using rsync, without the --delete flag.
Minimal failure domains: when one NAS fails, the other one keeps working.
- Having N+1 redundancy here takes the stress out of repairing your NAS.
- I run each NAS in a separate room, so that accidents like fires or spilled drinks only affect one machine.

File System: ZFS

In this specific build, I am trying out ZFS. Because I have two NAS builds running, it is easy to change one variable of the system (which file system to use) in one build, without affecting the other build.

My main motivation for using ZFS instead of ext4 is that ZFS does data checksumming, whereas ext4 only checksums metadata and the journal, but not data at rest. With large enough datasets, the chance of bit flips increases significantly, and I would prefer to know about them so that I can restore the affected files from another copy.

Hardware

Each of the two storage builds has (almost) the same components. This makes it easy to diagnose one with the help of the other. When needed, I can swap out components of the second build to temporarily repair the first one, or vice versa.

photo of the Network Storage PC from the side, showing the Noctua case fan and CPU cooler, data disks, PSU and cables

Base Components

Price	Type	Article	Remark
114 CHF	mainboard	AsRock B450 Gaming ITX/ac	Mini ITX
80 CHF	cpu	AMD Athlon 3000G	35W TDP, GPU
65 CHF	cpu cooler	Noctua NH-L12S	silent!
58 CHF	power supply	Silverstone ST30SF 300W SFX	SFX form factor
51 CHF	case	Silverstone SST-SG05BB-Lite	Mini ITX
48 CHF	system disk	WD Red SN700 250GB	M.2 NVMe
32 CHF	case fan	Noctua NF-S12A ULN	silent 120mm
28 CHF	ram	8 GB DDR4 Value RAM (F4-2400C15-8GNT)

The total price of 476 CHF makes this not a cheap build.

But, I think each component is well worth its price. Here’s my thinking regarding the components:

Why not a cheaper system disk? I wanted to use an M.2 NVMe disk so that I could mount it on the bottom of the mainboard instead of having to mount another SATA disk in the already-crowded case. Instead of choosing the cheapest M.2 disk I could find, I went with WD Red as a brand I recognize. While it’s not a lot of effort to re-install the system disk, it’s still annoying and something I want to avoid if possible. If spending 20 bucks saves me one disk swap + re-install, that’s well worth it for me!
Why not skip the system disk entirely and install on the data disks? That makes the system harder to (re-)install, and easier to make manual errors when recovering the system. I like to physically disconnect the data disks while re-installing a NAS, for example. (I’m a fan of simple precautions that prevent drastic mistakes!)
Why not a cheaper CPU cooler? In one of my earlier NAS builds, I used a (cheaper) passive CPU cooler, which was directly in the air stream of the Noctua 120mm case fan. This setup was spec’ed for the CPU I used, and yet that CPU died, the only CPU to die on me in many, many years. I want a reliable CPU cooler, but also an absolutely silent build, so I went with the Noctua CPU cooler.
Why not skip the case fan, or go with the Silverstone-supplied one? You might argue that the airflow of the CPU cooler is sufficient for this entire build. Maybe that’s true, but I don’t want to risk it. Also, there are 3 disks (two data disks and one system disk) that can benefit from additional airflow.
Regarding the CPU, I chose the cheapest AMD CPU for Socket AM4, with a 35W TDP and built-in graphics. The built-in graphics means I can connect an HDMI monitor for setup and troubleshooting, without having to use the mainboard’s valuable one and only PCIe slot.

Unfortunately, AMD CPUs with 35W TDP are not readily available right now. My tip is to look around for a bit, and maybe buy a used one. Choose either the predecessor Athlon 200GE, or the newer generation Ryzen APU series, whichever you can get your hands on.
Regarding the mainboard, I went with the AsRock Mini ITX series, which have served me well over the years. I started with an AsRock AM1H-ITX in 2016, then bought two AsRock AB350 Gaming ITX/ac in 2019, and recently an AsRock B450 Gaming ITX/ac.

As a disclaimer: the two builds I use are very similar to the component list above, with the following differences:

On storage2, I use an old AMD Ryzen 5 5600X CPU instead of the listed Athlon 3000G. The extra performance isn’t needed, and the lack of integrated graphics is annoying. But, I had the CPU lying around and didn’t want it to go to waste.
On storage3, I use an old AMD Athlon 200GE CPU on an AsRock AB350 mainboard.

I didn’t describe the exact builds I use because a component list is more useful if the components on it are actually available :-).

16 TB SSD Data Disks

It used to be that Solid State Drives (SSDs) were just way too expensive compared to spinning hard disks when talking about terabyte sizes, so I used to put the largest single disk drive I could find into each NAS build: I started with 8 TB disks, then upgraded to 16 TB disks later.

Luckily, the price of flash storage has come down quite a bit: the Samsung SSD 870 QVO (8 TB) costs “only” 42 CHF per TB. For a total of 658 CHF, I can get 16 TB of flash storage in 2 drives.

Of course, spinning hard disks are at 16 CHF per TB, so going all-flash is over 2.5x as expensive.

I decided to pay the premium to get a number of benefits:

My NAS devices are quieter because there are no more spinning disks in them. This gives me more flexibility in where to physically locate each storage machine.
My daily backups run quicker, meaning each NAS needs to be powered on for less time. The effect was actually quite pronounced, because figuring out which files need backing up requires a lot of random disk access. My backups used to take about 1 hour, and now finish in less than 20 minutes.
The quick access times of SSDs solve the last remaining wrinkle in my backup scheme: deleting backups and measuring used disk space is finally fast!

Power Usage

The choice of CPU, Mainboard and Network Card all influence the total power usage of the system. Here are a couple of measurements to give you a rough idea of the power usage:

build	CPU	main board	network card	idle	load
s2	5600X	B450	10G: Mellanox ConnectX-3	26W	60W
s3	200GE	AB350	10G: FS Intel 82599	28W	50W
s3	200GE	AB350	1G onboard	23W	40W

These values were measured using a myStrom WiFi Switch.

Operating System

Previously: CoreOS

Before this build, I ran my NAS using Docker containers on CoreOS (later renamed to Container Linux), which was a light-weight Linux distribution focused on containers. There are two parts about CoreOS that I liked most.

The most important part was that CoreOS updated automatically, using an A/B updating scheme, just like I do in gokrazy. I want to run as many of my devices as possible with A/B updates.

The other bit I like is that the configuration is very clearly separated from the OS. I managed the configuration (a cloud-init YAML file) on my main PC, so when swapping out the NAS system disk with a blank disk, I could just plug my config file into the CoreOS installer, and be done.

When CoreOS was bought by Red Hat and merged into Project Atomic, there wasn’t a good migration path and cloud-init wasn’t supported anymore. As a short-term solution, I switched from CoreOS to Flatcar Linux, a spiritual successor.

Now: Ubuntu Server

For this build, I wanted to try out ZFS. I always got the impression that ZFS was a pain to run because its kernel modules are not included in the upstream Linux kernel source.

Then, in 2016, Ubuntu decided to include ZFS by default. There are a couple of other Linux distributions on which ZFS seems easy enough to run, like Gentoo, Arch Linux or NixOS.

I wanted to spend my “innovation tokens” on ZFS, and keep the rest boring and similar to what I already know and work with, so I chose Ubuntu Server over NixOS. It’s similar enough to Debian that I don’t need to re-learn.

Luckily, the migration path from Flatcar’s cloud-init config to Ubuntu Server is really easy: just copy over parts of the cloud-config until you’re through the entire thing. It’s like a checklist!

Maybe later? gokrazy

In the future, it might be interesting to build a NAS setup using gokrazy. In particular since we now can run Docker containers on gokrazy, which makes running Samba or Jellyfin quite easy!

Using gokrazy instead of Ubuntu Server would get rid of a lot of moving parts. The current blocker is that ZFS is not available on gokrazy. Unfortunately that’s not easy to change, in particular also from a licensing perspective.

Setup

UEFI

I changed the following UEFI settings:

Advanced → ACPI Configuration → PCIE Devices Power On: Enabled
- This setting is needed (but not sufficient) for Wake On LAN (WOL). You also need to enable WOL in your operating system.
Advanced → Onboard Devices Configuration → Restore on AC/Power Loss: Power On
- This setting ensures the machine turns back on after a power loss. Without it, WOL might not work after a power loss.

Operating System

Network preparation

I like to configure static IP addresses for devices that are a permanent part of my network.

I have come to prefer configuring static addresses as static DHCP leases in my router, because then the address remains the same no matter which operating system I boot — whether it’s the installed one, or a live USB stick for debugging.

Ubuntu Server

Download Ubuntu Server from https://ubuntu.com/download/server
- I initially let the setup program install Docker, but that’s a mistake. The setup program will get you Docker from snap (not apt), which can’t work with the whole file system.
Disable swap:
- swapoff -a
- $EDITOR /etc/fstab # delete the swap line
Automatically load the corresponding sensors kernel module for the mainboard so that the Prometheus node exporter picks up temperature values and fan speed values:
- echo nct6775 | sudo tee -a /etc/modules

Enable unattended upgrades:

dpkg-reconfigure -plow unattended-upgrades

Edit /etc/apt/apt.conf.d/50unattended-upgrades — I like to make the following changes:

Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::Mail "michael@example.net";
Unattended-Upgrade::MailReport "only-on-error";
Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-Time "08:00";
Unattended-Upgrade::SyslogEnable "true";

Network

Tailscale Mesh VPN

I have come to like Tailscale. It’s a mesh VPN (data flows directly between the machines) that allows me access to and from my PCs, servers and storage machines from anywhere.

Specifically, I followed the install Tailscale on Ubuntu 22.04 guide.

Prometheus Node Exporter

For monitoring, I have an existing Prometheus setup. To add a new machine to my setup, I need to configure it as a new target on my Prometheus server. In addition, I need to set up the Prometheus node exporter on the new machine.

First, I installed the Prometheus node exporter using apt install prometheus-node-exporter.

Then, I modified /etc/default/prometheus-node-exporter to only listen on the Tailscale IP address:

ARGS="--web.listen-address=100.85.3.16:9100"

Lastly, I added a systemd override to ensure the node exporter keeps trying to start until tailscale is up: the command systemctl edit prometheus-node-exporter opens an editor, and I configured the override like so:

# /etc/systemd/system/prometheus-node-exporter.service.d/override.conf
[Unit]
# Allow infinite restarts, even within a short time.
StartLimitIntervalSec=0

[Service]
RestartSec=1

Static IPv6 address

Similar to the static IPv4 address, I like to give my NAS a static IPv6 address as well. This way, I don’t need to reconfigure remote systems when I (sometimes temporarily) switch my NAS to a different network card with a different MAC address. Of course, this point becomes moot if I ever switch all my backups to Tailscale.

Ubuntu Server comes with Netplan by default, but I don’t know Netplan and don’t want to use it.

To switch to systemd-networkd, I ran:

apt remove --purge netplan.io
systemctl enable --now systemd-networkd

Then, I created a systemd-networkd config file with a static IPv6 token, resulting in a predictable IPv6 address:

$EDITOR /etc/systemd/network/enp.network

My config file looks like this:

[Match]
Name=enp*

[Network]
DHCP=yes
IPv6Token=0:0:0:0:10::253
IPv6AcceptRouterAdvertisements=yes
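
To see why a token gives a predictable address, here is a small Go sketch of the arithmetic: the upper 64 bits come from the prefix the router advertises, the lower 64 bits come from the token. The function name ApplyToken is made up, and the prefix 2001:db8::/64 is just the documentation prefix used as a stand-in for whatever your router announces:

```go
package main

import (
	"fmt"
	"net/netip"
)

// ApplyToken combines the upper 64 bits of an advertised /64 prefix with the
// lower 64 bits of the token and returns the resulting address as a string.
func ApplyToken(prefix, token string) (string, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return "", err
	}
	t, err := netip.ParseAddr(token)
	if err != nil {
		return "", err
	}
	if !p.Addr().Is6() || !t.Is6() || p.Bits() != 64 {
		return "", fmt.Errorf("need an IPv6 /64 prefix and an IPv6 token")
	}
	addr := p.Addr().As16()
	tok := t.As16()
	// Keep the network half of the prefix, take the interface half from the token.
	copy(addr[8:], tok[8:])
	return netip.AddrFrom16(addr).String(), nil
}

func main() {
	addr, err := ApplyToken("2001:db8::/64", "0:0:0:0:10::253")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr)
}
```

Because the MAC address never enters this computation, swapping the network card does not change the resulting address.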

IPv6 firewall setup

An easy way to configure Linux’s netfilter firewall is to apt install iptables-persistent. That package takes care of saving firewall rules on shutdown and restoring them on the next system boot.

My rule setup is very simple: allow ICMP (IPv6 needs it), then set up ACCEPT rules for the traffic I expect, and DROP the rest.

Here’s my resulting /etc/iptables/rules.v6 from such a setup:

/etc/iptables/rules.v6

# Generated by ip6tables-save v1.4.14 on Fri Aug 26 19:57:51 2016
*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -p ipv6-icmp -m comment --comment "IPv6 needs ICMPv6 to work" -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -m comment --comment "Allow packets for outgoing connections" -j ACCEPT
-A INPUT -s fe80::/10 -d fe80::/10 -m comment --comment "Allow link-local traffic" -j ACCEPT
-A INPUT -s 2001:db8::/64 -m comment --comment "local traffic" -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -m comment --comment "SSH" -j ACCEPT
COMMIT
# Completed on Fri Aug 26 19:57:51 2016

Encrypted ZFS

Before you can use ZFS, you need to install the ZFS tools using apt install zfsutils-linux.

Then, we create a zpool that spans both SSDs:

zpool create \
  -o ashift=12 \
  srv \
  /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0TC06121Z \
  /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5SSNF0TC06787P

The -o ashift=12 ensures proper alignment on disks with a sector size of either 512B or 4KB.

On that zpool, we now create our datasets:

(echo -n on-device-secret && \
 wget -qO - https://autounlock.zekjur.net:8443/nascrypto) | zfs create \
  -o encryption=on \
  -o compression=off \
  -o atime=off \
  -o keyformat=passphrase \
  -o keylocation=file:///dev/stdin \
  srv/data

The key I’m piping into zfs create is constructed from two halves: the on-device secret and the remote secret, which is a setup I’m using to implement an automated crypto unlock that is remotely revokable. See the next section for the corresponding unlock.service.

I repeated this same command (adjusting the dataset name) for each dataset: I currently have one for data and one for backup, just so that the used disk space of each major use case is separately visible:

df -h /srv /srv/backup /srv/data   
Filesystem      Size  Used Avail Use% Mounted on
srv             4,2T  128K  4,2T   1% /srv
srv/backup      8,1T  3,9T  4,2T  49% /srv/backup
srv/data         11T  6,4T  4,2T  61% /srv/data

ZFS maintenance

To detect errors on your disks, ZFS has a feature called “scrubbing”. I don’t think I need to scrub more often than monthly, but maybe your scrubbing requirements are different.

I enabled monthly scrubbing on my zpool srv:

systemctl enable --now zfs-scrub-monthly@srv.timer

On this machine, a scrub takes a little over 4 hours and keeps the disks busy:

  scan: scrub in progress since Wed Oct 11 16:32:05 2023
	808G scanned at 909M/s, 735G issued at 827M/s, 10.2T total
	0B repaired, 7.01% done, 03:21:02 to go

We can confirm by looking at the Prometheus Node Exporter metrics:

screenshot of a Grafana dashboard showing Prometheus Node Exporter metrics

The other maintenance-related setting I changed is to enable automated TRIM:

zpool set autotrim=on srv

Auto Crypto Unlock

To automatically unlock the encrypted datasets at boot, I’m using a custom unlock.service systemd service file.

My unlock.service constructs the crypto key from two halves: the on-device secret and the remote secret that’s downloaded over HTTPS.

This way, my NAS can boot up automatically, but in an emergency I can remotely stop this mechanism.

My unlock.service

[Unit]
Description=unlock hard drive
Wants=network.target
After=systemd-networkd-wait-online.service
Before=samba.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Wait until the host is actually reachable.
ExecStart=/bin/sh -c "c=0; while [ $c -lt 5 ]; do /bin/ping6 -n -c 1 autounlock.zekjur.net && break; c=$((c+1)); sleep 1; done"
ExecStart=/bin/sh -c "(echo -n secret && wget --retry-connrefused -qO - https://autounlock.zekjur.net:8443/nascrypto) | zfs load-key srv/data"
ExecStart=/bin/sh -c "(echo -n secret && wget --retry-connrefused -qO - https://autounlock.zekjur.net:8443/nascrypto) | zfs load-key srv/backup"
ExecStart=/bin/sh -c "zfs mount srv/data"
ExecStart=/bin/sh -c "zfs mount srv/backup"

[Install]
WantedBy=multi-user.target
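
The shell pipeline in the two zfs load-key lines can also be sketched in Go, in case you prefer a small helper binary over inline shell. This is not what I run. The function names, the timeout and the assumption that a revoked key shows up as a non-200 response are all hypothetical, and "secret" is the same placeholder as in the unit file:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// fetchRemoteSecret downloads the remote half of the key over HTTPS.
func fetchRemoteSecret(url string) ([]byte, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// Assumption: revoking the key makes the server stop answering with 200.
		return nil, fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// buildPassphrase concatenates the on-device half and the remote half,
// mirroring "(echo -n secret && wget -qO - URL)" from the unit file.
func buildPassphrase(onDevice, remote []byte) []byte {
	out := make([]byte, 0, len(onDevice)+len(remote))
	out = append(out, onDevice...)
	return append(out, remote...)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: unlock-key <url>")
		return
	}
	remote, err := fetchRemoteSecret(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetching remote secret:", err)
		os.Exit(1)
	}
	// Write the passphrase to stdout so it can be piped into zfs load-key.
	os.Stdout.Write(buildPassphrase([]byte("secret"), remote))
}
```

The important property is the same as in the shell version: neither half alone is enough to unlock the datasets, so taking the remote half offline stops the automatic unlock.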

Backup

For the last 10 years, I have been doing my backups using rsync.

Each machine pushes an incremental backup of its entire root file system (and any mounted file systems that should be backed up, too) to the backup destination (storage2/3).

All the machines I’m backing up run Linux and the ext4 file system. I verified that my backup destination file systems support all the features of the backup source file system that I care about, i.e. extended attributes and POSIX ACLs.

The scheduling of backups is done by “dornröschen”, a Go program that wakes up the backup sources and destination machines and starts the backup by triggering a command via SSH.

SSH configuration

The backup scheduler establishes an SSH connection to the backup source.

On the backup source, I authorized the scheduler like so, meaning it will run /root/backup.pl when connecting:

command="/root/backup.pl",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3Nzainvalidkey backup-scheduler

backup.pl runs rsync, which establishes another SSH connection, this time from the backup source to the backup destination.

On the backup destination (storage2/3), I authorize the backup source’s SSH public key to run rrsync(1), a script that only permits running rsync in the specified directory:

command="/usr/bin/rrsync /srv/backup/server.zekjur.net",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3Nzainvalidkey server.zekjur.net

Signaling Readiness after Wake-Up

I found it easiest to signal readiness by starting an empty HTTP server gated on After=unlock.service in systemd:

/etc/systemd/system/healthz.service

[Unit]
Description=nginx for /srv health check
Wants=network.target
After=unlock.service
Requires=unlock.service
StartLimitInterval=0

[Service]
Restart=always
# https://itectec.com/unixlinux/restarting-systemd-service-on-dependency-failure/
ExecStartPre=/bin/sh -c 'systemctl is-active docker.service'
# Stay on the same major version in the hope that nginx never decides to break
# the config file syntax (or features) without doing a major version bump.
ExecStartPre=/usr/bin/docker pull nginx:1
ExecStartPre=-/usr/bin/docker kill nginx-healthz
ExecStartPre=-/usr/bin/docker rm -f nginx-healthz
ExecStart=/usr/bin/docker run \
  --name nginx-healthz \
  --publish 10.0.0.253:8200:80 \
  --log-driver=journald \
nginx:1

[Install]
WantedBy=multi-user.target

My wake program then polls that port and returns once the server is up, i.e. the file system has been unlocked and mounted.
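
The polling side can be sketched in a few lines of Go. This is not the code of my wake program: the names httpProbe and waitReady, the timeouts and the number of attempts are assumptions for illustration. The address 10.0.0.253:8200 is the one published by the healthz unit above:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// httpProbe returns a probe that succeeds once url answers with HTTP 200.
func httpProbe(url string) func() bool {
	client := &http.Client{Timeout: 2 * time.Second}
	return func() bool {
		resp, err := client.Get(url)
		if err != nil {
			return false
		}
		resp.Body.Close()
		return resp.StatusCode == http.StatusOK
	}
}

// waitReady calls probe up to attempts times, sleeping for interval between
// failed tries, and reports whether the probe ever succeeded.
func waitReady(probe func() bool, attempts int, interval time.Duration) bool {
	for i := 0; i < attempts; i++ {
		if probe() {
			return true
		}
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return false
}

func main() {
	// Offline demonstration with a fake probe that succeeds on the third call.
	calls := 0
	ready := waitReady(func() bool { calls++; return calls == 3 }, 10, 10*time.Millisecond)
	fmt.Println("ready:", ready, "after", calls, "probes")

	// Real use would look like this (not executed here):
	//   waitReady(httpProbe("http://10.0.0.253:8200/"), 120, time.Second)
}
```

Since the HTTP server only starts after unlock.service has finished, a successful probe implies that the datasets are unlocked and mounted.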

Auto Shutdown

Instead of explicitly triggering a shutdown from the scheduler program, I run “dramaqueen”, which shuts down the machine after 10 minutes, but will be inhibited while a backup is running. Optionally, shutting down can be inhibited while there are active samba sessions.

/etc/systemd/system/dramaqueen.service

[Unit]
Description=dramaqueen
After=docker.service
Requires=docker.service

[Service]
Restart=always
StartLimitInterval=0

# Always pull the latest version (bleeding edge).
ExecStartPre=-/usr/bin/docker pull stapelberg/dramaqueen
ExecStartPre=-/usr/bin/docker rm -f dramaqueen
ExecStartPre=/usr/bin/docker create --name dramaqueen stapelberg/dramaqueen
ExecStartPre=/usr/bin/docker cp dramaqueen:/usr/bin/dramaqueen /tmp/
ExecStartPre=/usr/bin/docker rm -f dramaqueen
ExecStart=/tmp/dramaqueen -net_command=

[Install]
WantedBy=multi-user.target

Enabling Wake-on-LAN

Luckily, the network driver of the onboard network card supports WOL by default. If that’s not the case for your network card, see the Arch wiki Wake-on-LAN article.
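
For completeness, here is what the waking side of Wake-on-LAN looks like, as a minimal Go sketch. It is not the code of webwake: the function names are made up, the MAC address is a placeholder, and UDP port 9 is merely the conventional choice. A magic packet is 6 bytes of 0xFF followed by the target's MAC address repeated 16 times:

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// magicPacket builds the Wake-on-LAN payload for mac: 6 bytes of 0xFF
// followed by 16 repetitions of the 6-byte hardware address.
func magicPacket(mac string) ([]byte, error) {
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return nil, err
	}
	if len(hw) != 6 {
		return nil, fmt.Errorf("need a 48-bit MAC address, got %d bytes", len(hw))
	}
	pkt := bytes.Repeat([]byte{0xff}, 6)
	for i := 0; i < 16; i++ {
		pkt = append(pkt, hw...)
	}
	return pkt, nil
}

// sendWOL sends the magic packet for mac to the given broadcast address.
func sendWOL(mac, broadcast string) error {
	pkt, err := magicPacket(mac)
	if err != nil {
		return err
	}
	conn, err := net.Dial("udp", net.JoinHostPort(broadcast, "9"))
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(pkt)
	return err
}

func main() {
	// Placeholder MAC address; nothing is sent in this demonstration.
	pkt, err := magicPacket("02:00:00:00:00:01")
	if err != nil {
		panic(err)
	}
	fmt.Println("magic packet length:", len(pkt))

	// Real use would look like this (not executed here):
	//   sendWOL("02:00:00:00:00:01", "10.0.0.255")
}
```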

Conclusion

I have been running a PC-based few-large-disk Network Storage setup for years at this point, and I am very happy with all the properties of the system. I expect to run a very similar setup for years to come.

The low-tech approach to backups of using rsync has worked well — without changes — for years, and I don’t see rsync going away anytime soon.

The upgrade to all-flash is really nice in terms of random access time (for incremental backups) and to eliminate one of the largest sources of noise from my builds.

ZFS seems to work fine so far and is well-integrated into Ubuntu Server.

There are solutions for almost everyone’s NAS needs. This build obviously hits my personal sweet spot, but your needs and preferences might be different!

Here are a couple of related solutions:

If you would like a more integrated solution, you could take a look at the Odroid H3 (Celeron).
If you’re okay with less compute power, but want more power efficiency, you could use an ARM64-based Single Board Computer.
If you want to buy a commercial solution, buy a device from qnap and fill it with SSD disks.
- There are even commercial M.2 flash storage devices like the ASUSTOR Flashstor becoming available! If not for the “off the shelf hardware” goal of my build, this would probably be the most interesting commercial alternative to me.
If you want more compute power, consider a Thin Client (perhaps used) instead of a Single Board Computer.
- ServeTheHome has a nice series called Project TinyMiniMicro (introduction, blog posts)
- If you’re a heise+ subscriber, they have a (German) article about building a NAS from a thin client.
Very similar to thin clients is the Intel NUC (“Next Unit of Computing”): (German) article comparing different NUC 12 devices

Together with the team behind our previous paper on reduced-basis methods for quantum spin systems (Matteo Rizzi, Benjamin Stamm, Stefan Wessel and myself), we recently worked on a follow-up, extending our approach to tensor-network methods. Most of the work was done by Paul Brehmer, a master's student in Stefan's group, whom I had the pleasure to co-supervise. Paul did an excellent job in cleaning up and extending the original code we had, which we have now released in open-source form as the ReducedBasis.jl Julia package.

The extension towards tensor-network methods and the integration with libraries such as ITensor.jl, following the standard density-matrix renormalisation group (DMRG) approach, finally allows us to treat larger quantum spin systems, close to or at the level of the state of the art. In this work we demonstrate this on a number of different one-dimensional quantum spin-1 models, where our approach even allowed us to identify a few new phases, which have not been studied so far.

The full abstract of our paper reads

Within the reduced basis methods approach, an effective low-dimensional subspace of a quantum many-body Hilbert space is constructed in order to investigate, e.g., the ground-state phase diagram. The basis of this subspace is built from solutions of snapshots, i.e., ground states corresponding to particular and well-chosen parameter values. Here, we show how a greedy strategy to assemble the reduced basis and thus to select the parameter points can be implemented based on matrix-product-states (MPS) calculations. Once the reduced basis has been obtained, observables required for the computation of phase diagrams can be computed with a computational complexity independent of the underlying Hilbert space for any parameter value. We illustrate the efficiency and accuracy of this approach for different one-dimensional quantum spin-1 models, including anisotropic as well as biquadratic exchange interactions, leading to rich quantum phase diagrams.

In the past month I have renovated my apartment. Because of this I had to redo my entire desk setup. If you know me, that means spending a lot of time managing cables 😅. But I am really happy with the result. See for yourself …

I always wanted to be flexible in how I use the devices on my desk. I want to switch between using my laptop and desktop without having to replug everything. But I also want to be able to use certain devices from both at the same time. I have been using USB Hubs and the like. But I always was left wanting. To be fair my current solution is still not as perfect as in my dreams, but it is damn close.

So let’s begin with the easy things. The monitors have multiple inputs, so I just connect those to my desktop and the docking station and voilà. Switching still requires me to use the monitor menus, but I don’t really need to do that because I set them to “automatic mode”, meaning they just show whichever device starts sending data first. And I don’t really need to use all monitors with the laptop when my desktop is running anyway, so switching does not happen much.

For the keyboard and mouse I am using the “Logitech MX” keyboard and “Logitech MX Master” mouse. They can be paired with multiple Logitech wireless receivers. The devices can then be switched with the press of a button. Sadly switching one does not switch the other automatically, which is still a little annoying, but I have seen some scripts that could be used to automate that as well. Maybe I will give that a shot. I still have a “USB Switch” which is connected to the desktop and laptop; it has a switch to toggle which device is “connected”. I mostly use it for my yubikey now. It also was fine for switching my previous mouse and keyboard.

There is still some room for improvement here, but that is not what has been bugging me. The parts I really wanted to be better are the speakers, microphone and the webcam. In an ideal world they should be accessible on either device or both at the same time. Hence the USB switch is not a good solution, since it only enables operation with a single device at a time. Using the USB switch is also annoying for other reasons. It means the audio DAC is reset when switching devices, resulting in an unpleasant noise coming out of my speakers. And having all devices connected to a switch takes away the ability to attach USB sticks or other devices that I really only need temporarily, and daisy chaining USB hubs often results in inconsistent behavior.

What would be a better solution?

Enter everybody’s favorite single board computer, the Raspberry Pi 🥧. Luckily I still have one lying around, since getting one online is next to impossible if you don’t want to pay a scalper an unreasonable amount of money. Hopefully this will change. But anyway, how can it help me accomplish my goal?

Thanks to a little something called networking, computers can talk to each other. So it should be possible to attach the audio DAC and webcam to the pi and then stream the video and audio data to both the laptop and desktop. What do we need to accomplish that?

Configure the Network
Setup Pipewire to run as system service
Enable audio streaming with the pipewire pulse server implementation
Enable laptop and desktop to discover audio devices
Setup USBIP for sharing the webcam

Network Setup

I do not want to share the devices with my entire home network, I just want to share them with the devices attached to the desk. Since the raspberry only has one network jack and using wireless for streaming data is not a great idea because of increased latency, the first thing I did was set up a VLAN that is only available to the devices on my desk.

First off, the network switch needs to support VLANs; there are a lot of switches capable of doing this. They are a little more expensive than unmanaged switches, but basic models are available starting at around 30€. I opted for a more expensive model from MikroTik (CSS610-8G-2S+) that also supports fibre optic connections instead of just RJ45. Then I configured the switch to provide the home network on each port in untagged mode. Then I created a VLAN with ID 2668 which is only available on the ports attached to the raspberry pi, desktop and laptop in tagged mode. The choice of the ID is arbitrary, just make sure it does not clash with other VLANs if you already have a more elaborate network setup at home.

Next the devices need to be configured to know about the VLAN and the IP address range needs to be configured. I like to use systemd-networkd for this. The configuration is done with three files in the /etc/systemd/network directory.

[root@pi ~]# tree /etc/systemd/network
/etc/systemd/network
|-- 0-audio.netdev
|-- 1-audio.network
`-- eth.network

The file 0-audio.netdev defines the VLAN:

[NetDev]
Name=audio
Kind=vlan

[VLAN]
Id=2668

The file eth.network configures the normal home network on the pi; here we need to add a line specifying that the VLAN is available on this port:

[Match]
Name=eth*

[Network]
DHCP=yes
IPv6PrivacyExtensions=true
VLAN=audio

Lastly the VLAN network needs to be configured. Since the pi is running continuously it is useful to configure its IP statically and set up a DHCP server. All of this is configured with just a few lines in the 1-audio.network file.

[Match]
Name=audio

[Network]
Address=172.16.128.1/24
DHCPServer=true

[DHCPServer]
PoolOffset=100
PoolSize=100
EmitRouter=false
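
To make the pool settings concrete: as I read systemd.network(5), PoolOffset= counts from the start of the subnet and PoolSize= is the number of addresses handed out. The following is only a small sketch of that arithmetic, assuming a /24 network like the one above; pool_range is a made-up helper name and not part of systemd.

```shell
# pool_range ADDRESS OFFSET SIZE
# Prints the first and last address of the DHCP pool.
# Assumption: the network is a /24, so only the last octet changes.
pool_range() {
    prefix=${1%.*}              # 172.16.128.1 -> 172.16.128
    first=$2                    # offset from the subnet start (x.x.x.0)
    last=$(( $2 + $3 - 1 ))     # the pool holds SIZE consecutive addresses
    echo "$prefix.$first $prefix.$last"
}

pool_range 172.16.128.1 100 100
```

With the values from the configuration, clients should end up with addresses between 172.16.128.100 and 172.16.128.199, well clear of the static 172.16.128.1 used by the pi.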

The same steps are used to configure the VLAN on the desktop and laptop. The only things that change are the interface names for the home network and that the audio VLAN’s network can use the configured DHCP server to obtain a lease, resulting in the following 1-audio.network file on the clients.

[Match]
Name=audio

[Network]
DHCP=yes

Of course to use systemd-networkd the service needs to be enabled: systemctl enable --now systemd-networkd.

Pipewire System Service

Pipewire intends to be a modern linux media daemon. It is still in active development, but it can already be used as a replacement for pulseaudio or jack. Normally pipewire starts when you log in to your user session. But since there is no desktop running on the pi, pipewire needs to be configured to run as a system service.

First off, the software packages need to be installed. I am more of a minimalist when it comes to the systems I configure, which means I am running archlinux on the raspberry pi. The package names might vary if you are running raspbian. For me, running

pacman -S pipewire pipewire-alsa pipewire-jack pipewire-pulse pipewire-zeroconf wireplumber pipewire-docs pipewire-audio realtime

installed all desired packages. There is not a lot of documentation on how to set up pipewire as a system service. I found this issue thread which lists all the steps required. Maybe the process will get simpler in the future, but for now a lot of steps are required.

First a pipewire user and group need to be created with a statically assigned uid and gid. This is important to correctly set the environment variables in the service files created later. The pipewire user needs to be added to the audio and realtime groups.

addgroup --gid 901 pipewire
adduser --system --uid 91 --gid 901 pipewire
for g in audio realtime; do sudo adduser pipewire ${g}; done
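
The service files further down hard-code /run/user/91, so the numeric uid chosen here has to match those paths. As a hedged sketch (unit_env is a hypothetical helper, not something pipewire or systemd ships), the matching Environment= lines can be generated from a uid like this:

```shell
# unit_env UID
# Prints the Environment= lines used by the system units for the given uid.
# expr does decimal arithmetic, so a zero-padded value such as 091 becomes 91.
unit_env() {
    uid=$(expr "$1" + 0)
    echo "Environment=XDG_RUNTIME_DIR=/run/user/$uid"
    echo "Environment=DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/$uid/bus"
}

unit_env 091
```

On the pi itself, `id -u pipewire` shows the uid that was actually assigned, which is the value to feed into such a helper.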

Next we need to add a configuration file /etc/security/limits.d/99-realtime-privileges.conf to allow the realtime group to change the process priorities to the levels recommended by pipewire.

@realtime - rtprio 98
@realtime - memlock unlimited
@realtime - nice -11

With the limits in place, the next step is to set up systemd units for pipewire, pipewire-pulse and wireplumber. In total 5 files need to be created:

/etc/systemd/system/pipewire.socket
/etc/systemd/system/pipewire.service
/etc/systemd/system/pipewire-pulse.socket
/etc/systemd/system/pipewire-pulse.service
/etc/systemd/system/wireplumber.service

The content of these files is as follows.

#/etc/systemd/system/pipewire.socket
[Unit]
Description=PipeWire Multimedia System Socket

[Socket]
Priority=6
ListenStream=%t/pipewire/pipewire-0
SocketUser=pipewire
SocketGroup=pipewire
SocketMode=0660

[Install]
WantedBy=sockets.target

#/etc/systemd/system/pipewire.service
[Unit]
Description=PipeWire Multimedia Service
Before=gdm.service

# We require pipewire.socket to be active before starting the daemon, because
# while it is possible to use the service without the socket, it is not clear
# why it would be desirable.
#
# Installing pipewire and doing `systemctl start pipewire` will not get the
# socket started, which might be confusing and problematic if the server is to
# be restarted later on, as the client autospawn feature might kick in. Also, a
# start of the socket unit will fail, adding to the confusion.
#
# After=pipewire.socket is not needed, as it is already implicit in the
# socket-service relationship, see systemd.socket(5).
Requires=pipewire.socket

[Service]
User=pipewire
Type=simple
ExecStart=/usr/bin/pipewire
Restart=on-failure
RuntimeDirectory=pipewire
RuntimeDirectoryPreserve=yes
Environment=PIPEWIRE_RUNTIME_DIR=%t/pipewire
# Add if you need debugging
# Environment=PIPEWIRE_DEBUG=4

# These hardcoded runtime and dbus paths must stay this way for a system service
# as the User= is not resolved here 8(
## NOTE we do not change PIPEWIRE_RUNTIME_DIR as this is the system socket dir...
#Environment=PIPEWIRE_RUNTIME_DIR=/run/user/91/pipewire
Environment=XDG_RUNTIME_DIR=/run/user/91
Environment=DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/91/bus

#/etc/systemd/system/pipewire-pulse.socket
[Unit]
Description=PipeWire PulseAudio
Conflicts=pulseaudio.socket

[Socket]
Priority=6
ListenStream=%t/pulse/native
SocketUser=pipewire
SocketGroup=pipewire
SocketMode=0660

[Install]
WantedBy=sockets.target

#/etc/systemd/system/pipewire-pulse.service
[Unit]
Description=PipeWire PulseAudio

# We require pipewire-pulse.socket to be active before starting the daemon, because
# while it is possible to use the service without the socket, it is not clear
# why it would be desirable.
#
# A user installing pipewire and doing `systemctl --user start pipewire-pulse`
# will not get the socket started, which might be confusing and problematic if
# the server is to be restarted later on, as the client autospawn feature
# might kick in. Also, a start of the socket unit will fail, adding to the
# confusion.
#
# After=pipewire-pulse.socket is not needed, as it is already implicit in the
# socket-service relationship, see systemd.socket(5).
Requires=pipewire-pulse.socket
Wants=pipewire.service pipewire-session-manager.service
After=pipewire.service pipewire-session-manager.service
Conflicts=pulseaudio.service
# To ensure that multiple user instances are not created. May not be required
Before=gdm.service

[Service]
User=pipewire
Type=simple
ExecStart=/usr/bin/pipewire-pulse
Restart=on-failure
Slice=session.slice

# These hardcoded runtime and dbus paths must stay this way for a system service
# as the User= is not resolved here 8(
Environment=PULSE_RUNTIME_PATH=/home/pipewire
Environment=DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/91/bus

[Install]
Also=pipewire-pulse.socket
WantedBy=multi-user.target

#/etc/systemd/system/wireplumber.service   
[Unit]
Description=Multimedia Service Session Manager
After=pipewire.service
BindsTo=pipewire.service
Conflicts=pipewire-media-session.service

[Service]
User=pipewire

Type=simple
ExecStart=/usr/bin/wireplumber
Restart=on-failure
Slice=session.slice

# These hardcoded runtime and dbus paths must stay this way for a system service
# as the User= is not resolved here 8(
Environment=XDG_RUNTIME_DIR=/run/user/91
Environment=DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/91/bus

[Install]
WantedBy=pipewire.service
Alias=pipewire-session-manager.service

For the services to work correctly we need a running user session with dbus. This can be accomplished by telling loginctl to start a pipewire user session at system boot:

loginctl enable-linger pipewire

Since running pipewire on the pi as a user is undesired the user services need to be masked.

systemctl --user --global mask pipewire.socket pipewire.service pipewire-pulse.socket pipewire-pulse.service wireplumber.service

After this the pipewire system services we just created can be enabled:

systemctl enable --now pipewire.socket pipewire.service pipewire-pulse.socket pipewire-pulse.service wireplumber.service

Configure Pipewire for Network Streaming

At this point pipewire is running on the raspberry after boot up. The next step is to set up network streaming. Thankfully that is easily done in two steps:

Setup Pipewire on the Raspberry Pi to be reachable via the VLAN and enable publishing of its devices via zeroconf
Setup clients (laptop, desktop) to listen for zeroconf announcements

For compatibility with the existing playback methods and to be a “drop-in” replacement, pipewire has implemented a full pulseaudio server on top of itself. This way existing tools for managing audio playback and recording, like pavucontrol, can still be used. Pulseaudio supports being used over a network. This is not low latency, so doing this over wifi is not really recommended, but over a wired connection the latencies are so low that it is not noticeable. Pipewire supports this as well. So all we need to do is create a configuration file to configure network access:

# /etc/pipewire/pipewire-pulse.conf.d/network.conf
pulse.properties = {
    pulse.min.frag = 32/48000           # ~0.67ms
    pulse.default.frag = 256/48000      # ~5.3ms
    pulse.min.quantum = 32/48000        # ~0.67ms
    # the addresses this server listens on
    server.address = [
        "unix:native"
        #"unix:/tmp/something"              # absolute paths may be used
        #"tcp:4713"                         # IPv4 and IPv6 on all addresses
        #"tcp:[::]:9999"                    # IPv6 on all addresses
        #"tcp:127.0.0.1:8888"               # IPv4 on a single address
        #
        { address = "tcp:172.16.128.1:4713"             # address
          max-clients = 64                 # maximum number of clients
          listen-backlog = 32              # backlog in the server listen queue
          client.access = "allowed"     # permissions for clients
        }
    ]
}

Per default pipewire-pulse only enables the “unix:native” socket for local access. To enable network streaming, the last 4 lines starting with address are of interest. In order to restrict access to the VLAN, the IP address of the raspberry pi in the audio network needs to be specified. Also the client.access value needs to be set to “allowed” in order to enable all devices on that network to use it.

I also had to decrease the default values for pulse.min.frag, pulse.default.frag and pulse.min.quantum quite a bit in order for the latency of the microphone to be usable while in a video call. Otherwise video and audio would be very out of sync. The pipewire documentation warns that this will increase CPU usage. I have not noticed a big impact on the raspberry pi 4 I am using to do this.
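
The values are written as samples/rate, so the resulting duration is just that fraction in seconds. A tiny sketch of the conversion to milliseconds (frag_ms is a name made up for this example):

```shell
# frag_ms SAMPLES RATE
# Converts a samples/rate fraction, as used in pulse.properties, to milliseconds.
frag_ms() {
    LC_ALL=C awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f\n", s / r * 1000 }'
}

frag_ms 32 48000     # pulse.min.frag and pulse.min.quantum
frag_ms 256 48000    # pulse.default.frag
```

So 32/48000 corresponds to roughly 0.67 ms and 256/48000 to roughly 5.33 ms.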

Next, publishing of the pipewire server via zeroconf needs to be enabled. This could be done in the same configuration file, but for a better overview of the configuration I created an extra configuration file:

# /etc/pipewire/pipewire-pulse.conf.d/publish.conf 
context.exec = [
  { path = "pactl"        args = "load-module module-zeroconf-publish" }
]

That’s really short. All we are doing is telling the pulseaudio server to enable the zeroconf publish module. And on the clients we need to enable zeroconf discovery like this:

# /etc/pipewire/pipewire-pulse.conf.d/zeroconf-discover.conf 
context.exec = [
  { path = "pactl"        args = "load-module module-zeroconf-discover" }
]

For this to work the zeroconf daemon needs to be running. On linux the zeroconf implementation is provided by avahi. Most systems probably have it running already. On archlinux enable the avahi-daemon via systemd. The daemon also needs to be running on the raspberry pi for the publishing to work.

If everything worked correctly you should see the audio devices attached to the pi pop up in pavucontrol (after restarting the pipewire-pulse service for the configuration to apply):

Selecting the playback device or microphone should now just work like with a locally attached device. The really nice thing about this is that you can even use the devices from multiple clients at the same time!

Webcam

In theory pipewire was written for camera device sharing between multiple applications. For example the webcam software cheese is already using pipewire. But I have found absolutely zero information on whether it would be possible to do that via a network. I’m not even really sure that this is on the roadmap. If it is I will definitely revisit this topic. The only other option I could think of was to use some form of continuous webcam broadcast that I could then somehow attach as a camera, but I also do not want the webcam to be active all the time.

So the solution I have come up with for now is to use USBIP, which is a client-server application to speak the USB protocol via the network. This comes with the drawback that the webcam can only be used by one device at a time, but at least I do not have to physically replug the device. I just issue a command to attach and detach it.

This can be done in a few simple steps:

Install usbip on server (pi) and client (laptop, desktop)
Enable the Service on both devices.
On pi bind webcam to usbip daemon
Attach/detach webcam via usbip daemon on the client

So the first two steps are the same for the client and server: Install the usbip package. Depending on your distribution it might be named differently. Then enable the service using systemd: systemctl enable --now usbipd.

The next step is to bind the webcam to the usbipd daemon on the raspberry pi. For this the busid of the device needs to be found. This can be done by using the usbip utility:

$ usbip list -l
 - busid 1-1.1 (08bb:2902)
   Texas Instruments : PCM2902 Audio Codec (08bb:2902)

 - busid 1-1.2 (046d:08b6)
   Logitech, Inc. : unknown product (046d:08b6)

The webcam is the logitech device. Binding it to the daemon is as simple as running:

$ usbip bind -b 1-1.2
usbip: info: bind device on busid 1-1.2: complete

Now the device can be attached to the client. First we can check that the device is available to be attached:

$ usbip list -r 172.16.128.1
Exportable USB devices
======================
 - 172.16.128.1
      1-1.2: Logitech, Inc. : unknown product (046d:08b6)
           : /sys/devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/usb1/1-1/1-1.2
           : Miscellaneous Device / ? / Interface Association (ef/02/01)

The -r option is used to specify the remote server running usbip, in this case the raspberry pi. Attaching/detaching is done with the commands:

$ sudo usbip attach -r 172.16.128.1 -b 1-1.2
$ sudo usbip detach -p 0
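
Since these are the two commands I type most often, they lend themselves to a small wrapper. The following is only a sketch: webcam, SERVER, BUSID and RUN are names invented here, and the defaults are simply the address and busid from this post.

```shell
# Hypothetical convenience wrapper around the attach/detach commands above.
SERVER=${SERVER:-172.16.128.1}   # the pi in the audio VLAN
BUSID=${BUSID:-1-1.2}            # busid of the webcam as reported by usbip list
RUN=${RUN:-sudo}                 # set RUN=echo to only print the command

webcam() {
    case "$1" in
        attach) $RUN usbip attach -r "$SERVER" -b "$BUSID" ;;
        detach) $RUN usbip detach -p "${2:-0}" ;;   # port defaults to 0
        *) echo "usage: webcam attach | detach [port]" >&2; return 2 ;;
    esac
}
```

Running `RUN=echo webcam attach` prints the command instead of executing it. If the webcam did not end up on port 0, `usbip port` lists the ports currently in use on the client.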

With the webcam attached it can be used like any other webcam. For example you could open cheese and take a picture:

After usage the webcam should be detached again, to make it possible for other clients to connect to it. If you forget to detach before powering off the device currently using the camera, you will have to log in to the pi to unbind and rebind the device, since usbip does not seem to have a timeout mechanism. A few other things to note about this setup are:

It is still not possible to use the device from multiple clients at the same time 😥
To make sure that the camera can only be used via the local VLAN a firewall configuration on the pi is required, since usbip is not configurable to only listen on a certain network interface.
If you are getting an error when attaching the camera, you might also need to make sure the vhci-hcd kernel module is loaded!
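
For the last point, a quick way to check whether the module is loaded is to look at /proc/modules. This is a minimal sketch; module_loaded and MODULES_FILE are names made up for this example, and a driver compiled directly into the kernel would not show up in that file.

```shell
# module_loaded NAME
# Succeeds if NAME appears in /proc/modules.
MODULES_FILE=${MODULES_FILE:-/proc/modules}

module_loaded() {
    # /proc/modules uses underscores, so vhci-hcd is listed as vhci_hcd
    name=$(printf '%s' "$1" | tr '-' '_')
    grep -q "^$name " "$MODULES_FILE" 2>/dev/null
}

if module_loaded vhci-hcd; then
    echo "vhci-hcd is loaded"
else
    echo "vhci-hcd is not loaded, try: sudo modprobe vhci-hcd"
fi
```

To load the module on every boot, a file such as /etc/modules-load.d/vhci-hcd.conf containing the single line vhci-hcd should do the trick on systemd-based systems.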

I hope you enjoyed this post. If you have any further thoughts or questions, feel free to reach out to me.

Similar to most people in their second PostDoc a considerable chunk of time in the past year has been devoted to job hunting, i.e. writing applications, preparing and attending interviews for junior research group positions. As the year is closing I am finally able to make a positive announcement in this regard: The Swiss ETH board has appointed me as Tenure Track Assistant Professor of Mathematics and of Materials Science and Engineering at EPF Lausanne, a position I am more than happy to take up. From March 2023 I will thus join this school and as part of this interdisciplinary appointment establish a research group located in both the mathematics and materials science institutes.

I am very grateful to the search committee as well as the ETH board and the university for this opportunity to start my own group and to be able to continue my research agenda combining ideas from mathematics and computer science to make materials simulations more robust and efficient. I look forward to becoming a part of the EPFL research environment and being able to contribute to the training of next generation researchers.

Along the lines of this appointment I now also have a few vacancies at PhD and PostDoc level to fill. Further information will be posted here as well as standard channels of the community early next year.

Note: If you don’t want to read the exposition and explanations and just want to know the steps I did, scroll to the summary at the bottom.

For a couple of years I have (with varying degrees of commitment) participated in Advent of Code, a yearly programming competition. It consists of fun little daily challenges. It is great to exercise your coding muscles and can provide opportunity to learn new languages and technologies.

So far I have created a separate repository for each year, with a directory per day. But I decided that I’d prefer to have a single repository, containing my solutions for all years. The main reason is that I tend to write little helpers that I would like to re-use between years.

When merging the repositories it was important to me to preserve the history of the individual years as well, though. I googled around for how to do this and the solutions I found didn’t quite work for me. So I thought I should document my own solution, in case anyone finds it useful.

You can see the result here. As you can see, there are four cleanly disjoint branches with separate histories. They then merge into one single commit.

One neat effect of this is that the merged repository functions as a normal remote for all the four old repositories. It involves no rewrites of history and all the previous commits are preserved exactly as-is. So you can just git pull from this new repository and git will fast-forward the branch.

Step 1: Prepare individual repositories

First I went through all repositories and prepared them. I wanted to have the years in individual directories. In theory, it is possible to use git-filter-repo and similar tooling to automate this step. For larger projects this might be worth it.

I found it simpler to manually make the changes in the individual repositories and commit them. In particular, I did not only need to move the files to the sub directory, I also had to fix up Go module and import paths. Figuring out how to automate that seemed like a chore. But doing it manually is a quick and easy sed command.

You can see an example of that in this commit. While that link points at the final, merged repository, I created the commit in the old repository. You can see that a lot of files simply moved. But some also had additional changes.

You can also see that I left the go.mod in the top-level directory. That was intentional - I want the final repository to share a single module, so that’s where the go.mod belongs.

After this I was left with four repositories, each of which had all the solutions in their own subdirectory, with a go.mod/go.sum file with the shared module path. I tested that all solutions still compile and appeared to work and moved on.

Step 2: Prepare merged repository

The next step is to create a new repository which can reference commits and objects in all the other repos. After all, it needs to contain the individual histories. This is simple to do by adding the individual repositories as remotes:

$ mkdir ~/src/github.com/Merovius/AdventOfCode
$ cd ~/src/github.com/Merovius/AdventOfCode
$ git init
$ git remote add 2018 ~/src/github.com/Merovius/aoc18
$ git remote add 2020 ~/src/github.com/Merovius/aoc_2020
$ git remote add 2021 ~/src/github.com/Merovius/aoc_2021
$ git remote add 2022 ~/src/github.com/Merovius/aoc_2022
$ git fetch --multiple 2018 2020 2021 2022
$ git branch -a
remotes/2018/master
remotes/2020/main
remotes/2021/main
remotes/2022/main

One thing worth pointing out is that at this point, the merged AdventOfCode repository does not have any branches itself. The only existing branches are remotes/ references. This is relevant because we don’t want our resulting histories to share any common ancestor. And because git behaves slightly differently in an empty repository. A lot of commands operate on HEAD (the “current branch”), so they have special handling if there is no HEAD.

Step 3: Create merge commit

A git commit can have an arbitrary number of “parents”:

If a commit has zero parents, it is the start of the history. This is what happens if you run git commit in a fresh repository.
If a commit has exactly one parent, it is a regular commit. This is what happens when you run git commit normally.
If a commit has more than one parent, it is a merge commit. This is what happens when you use git merge or merge a pull request in the web UI of a git hoster (like GitHub or Gitlab).

Normally merge commits have two parents - one that is the “main” branch and one that is being “merged into”. However, git does not really distinguish between “main” and “merged” branch. And it also allows a commit to have more than two parents.

We want to create a new commit with four parents: The HEADs of our four individual repositories. I expected this to be simple, but:

$ git merge --allow-unrelated-histories remotes/2018/master remotes/2020/main remotes/2021/main remotes/2022/main
fatal: Can merge only exactly one commit into empty head

This command was supposed to create a merge commit with four parents. We have to pass --allow-unrelated-histories, as git otherwise tries to find a common ancestor between the parents and complains if it can’t find any.

But the command is failing. It seems git is unhappy using git merge with multiple parents if we do not have any branch yet.

I suspect the intended path at this point would be to check out one of the branches and then merge the others into that. But that creates merge conflicts and it also felt… asymmetric to me. I did not want to give any of the base repositories preference. So instead I opted for a more brute-force approach: Dropping down to the plumbing layer.

First, I created the merged directory structure:

$ cp -r ~/src/github.com/Merovius/aoc18/* .
$ cp -r ~/src/github.com/Merovius/aoc_2020/* .
$ cp -r ~/src/github.com/Merovius/aoc_2021/* .
$ cp -r ~/src/github.com/Merovius/aoc_2022/* .
$ vim go.mod # fix up the merged list of dependencies
$ go mod tidy
$ git add .

Note: The above does not copy hidden files (like .gitignore). If you do copy hidden files, take care not to copy any .git directories.
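
If you do want the hidden files, one possible approach is to let find pick the top-level entries and skip .git explicitly. This is just a sketch, not what I ran; copy_without_git is a made-up helper name, and it only excludes the top-level .git directory.

```shell
# copy_without_git SRC DST
# Copies all top-level entries of SRC (including dotfiles) into DST,
# except for SRC/.git.
copy_without_git() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # -mindepth/-maxdepth 1 selects only the direct children of SRC;
    # cp -r then takes care of recursing into directories.
    find "$src" -mindepth 1 -maxdepth 1 ! -name .git -exec cp -r {} "$dst"/ \;
}
```

For example, `copy_without_git ~/src/github.com/Merovius/aoc_2022 .` would bring along a .gitignore but leave the old repository's .git directory behind.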

At this point the working directory contains the complete directory layout for the merged commit and it is all in the staging area (or “index”). This is where we normally run git commit. Instead we do the equivalent steps manually, allowing us to override the exact contents:

$ TREE=$(git write-tree)
$ COMMIT=$(git commit-tree $TREE \
    -p remotes/2018/master \
    -p remotes/2020/main \
    -p remotes/2021/main \
    -p remotes/2022/main \
    -m "merge history of all years")
$ git branch main $COMMIT

The write-tree command takes the content of the index and writes it to a “Tree Object” and then returns a reference to the Tree it has written.

A Tree is an immutable representation of a directory in git. It (essentially) contains a list of file name and ID pairs, where each ID points either to a “Blob” (an immutable file) or another Tree.

A Commit in git is just a Tree (describing the state of the files in the repository at that commit), a list of parents, a commit message and some meta data (like who created the commit and when).

The commit-tree command is a low-level command to create such a Commit object. We give it the ID of the Tree the Commit should contain and a list of parents (using -p) as well as a message (using -m). It then writes out that Commit to storage and returns its ID.

At this point we have a well-formed Commit, but it is just loosely stored in the repository. We still need a Branch to point at it, so it doesn’t get lost and we have a memorable handle.

You probably used the git branch command before. In the form above, it creates a new branch main (remember: So far our repository had no branches) pointing at the Commit we created.
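
To convince yourself that the result really is one commit with several parents, the whole procedure can be replayed on throwaway repositories. The following sketch uses two tiny repositories in a temporary directory; all names in it (repo1, repo2, merged, year1, year2 and the helper g) are invented for the demonstration.

```shell
# Replay the plumbing-based merge on two throwaway repositories.
work=$(mktemp -d)
# g runs git with a fixed identity so that commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Two independent repositories, each with its own subdirectory and one commit.
for n in 1 2; do
    g init -q "$work/repo$n"
    g -C "$work/repo$n" symbolic-ref HEAD refs/heads/main
    mkdir "$work/repo$n/year$n"
    echo "solution $n" > "$work/repo$n/year$n/day01.txt"
    g -C "$work/repo$n" add .
    g -C "$work/repo$n" commit -q -m "year $n"
done

# The merged repository: no branches yet, only remotes and copied files.
g init -q "$work/merged"
g -C "$work/merged" symbolic-ref HEAD refs/heads/main
for n in 1 2; do
    g -C "$work/merged" remote add "repo$n" "$work/repo$n"
    g -C "$work/merged" fetch -q "repo$n"
    cp -r "$work/repo$n"/* "$work/merged/"
done

# Write the index as a tree, wrap it in a commit with two parents, name it.
g -C "$work/merged" add .
tree=$(g -C "$work/merged" write-tree)
commit=$(g -C "$work/merged" commit-tree "$tree" \
    -p repo1/main -p repo2/main -m "merge repositories")
g -C "$work/merged" branch main "$commit"

# Inspect the result: count the parent lines of the new commit.
parents=$(g -C "$work/merged" cat-file -p main | grep -c '^parent ')
echo "parents: $parents"
for n in 1 2; do
    g -C "$work/merged" merge-base --is-ancestor "repo$n/main" main &&
        echo "repo$n/main is an ancestor of main"
done
```

The ancestor check at the end is also what makes a later git pull from the merged repository a plain fast-forward for the old repositories.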

And that’s it. We can now treat the repository as a normal git repo. All that is left is to publish it:

$ git remote add origin git@github.com:Merovius/AdventOfCode
$ git push --set-upstream origin main

Executive Summary

To summarize the steps I did:

Create commits in each of the old repositories to move files around and fix anticipated merge conflicts as needed.
Create a pristine new repository without any branches:
```
$ git init merged
$ cd merged
```

Add the old repositories as remotes for the merged repo:

$ git remote add repo1 /path/to/repo1
$ git fetch repo1
$ git remote add repo2 /path/to/repo2
$ git fetch repo2
$ # …

Copy files from old repositories into merged repo:

$ cp -r /path/to/repo1/* .
$ cp -r /path/to/repo2/* .
$ # …

Create commit using plumbing commands:

$ git add .
$ TREE=$(git write-tree)
$ COMMIT=$(git commit-tree $TREE \
    -m "merge repositories" \
    -p remotes/repo1/main \
    -p remotes/repo2/main)
$ git branch main $COMMIT

This post serves as a summary of a live coding session I did at our local hackerspace. For the full experience please refer to the recording. Though I probably should warn that the live coding was done in German (and next time I should make sure to increase the font size everywhere for the recording 🙈).

From zero to a working rust project for the raspberry pi. These are the required steps:

Set up Rust Project with cargo
Install Rust Arm + Raspberry Pi Toolchain
Configure Rust Project for cross compilation
Import crate for GPIO Access
Profit 💰

Setting up a Rust Project

The first step is to set up a rust project. This is easily accomplished by using the rust tooling. Using cargo it is possible to initialize a hello world rust project:

> mkdir pi_project
> cd pi_project
> cargo init

This results in the following project structure:

pi_project
├── Cargo.toml
├── .gitignore
└── src
    └── main.rs

Building and running the code is now as simple as running:

> cargo build
> ./target/debug/pi_project
Hello, world!

Looking at the executable we see that the code was built for the x86 architecture.

> file ./target/debug/pi_project
target/debug/pi_project: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=0461b95d992ecda8488ad610bb1818344c1eeb8d, for GNU/Linux 4.4.0, with debug_info, not stripped

To be able to run this code on the raspberry pi the target architecture needs to change to ARM.

Rust Arm Toolchain Setup

Installing a different target architecture is easy. All that is required is to use rustup. Warning: the following list does not mean that your specific pi revision will work. You need to make extra sure to select the correct architecture based on the model of pi you are using, as there are differences per revision of the pi.

# for raspberry pi 3/4
> rustup target add aarch64-unknown-linux-gnu
# for raspberry pi 1/zero
> rustup target add arm-unknown-linux-gnueabihf

This allows telling cargo to generate ARM machine code. This would be all we need if the goal was to write bare metal code. But just running cargo build --target arm-unknown-linux-gnueabihf results in an error. This is because we still need a linker and the matching system libraries to be able to interface correctly with the Linux kernel running on the pi.

This problem is solved by installing a raspberry pi toolchain. The toolchain can be downloaded from here. The toolchains are compatible with the official “Raspbian OS” for the pi. If you are running a different OS on your pi, you may need to look further to find the matching toolchain for your OS.

In this case the pi is running the newest Raspbian, which is based on Debian 11:

> wget https://sourceforge.net/projects/raspberry-pi-cross-compilers/files/Raspberry%20Pi%20GCC%20Cross-Compiler%20Toolchains/Bullseye/GCC%2010.3.0/Raspberry%20Pi%201%2C%20Zero/cross-gcc-10.3.0-pi_0-1.tar.gz/download -O toolchain.tar.gz
> tar -xvf toolchain.tar.gz

Configure cross compilation

Now the rust build system needs to be configured to use the toolchain. This is done by placing a config file in the project root:

pi_project
├── .cargo
│   └── config
├── Cargo.lock
├── Cargo.toml
├── .gitignore
└── src
    └── main.rs

The configuration instructs the cargo build system to use the cross compiler gcc as linker and sets the directory where arm system libraries are located.

# content of .cargo/config
[build]
target = "arm-unknown-linux-gnueabihf" #set default target

#for raspberry pi 1/zero
[target.arm-unknown-linux-gnueabihf]
linker = "/home/judge/.toolchains/cross-pi-gcc-10.3.0-0/bin/arm-linux-gnueabihf-gcc"
rustflags = [
    "-C", "link-arg=--sysroot=/home/judge/.toolchains/cross-pi-gcc-10.3.0-0/arm-linux-gnueabihf/libc"
]

#for raspberry pi 3/4
[target.aarch64-unknown-linux-gnu]
linker = "/home/judge/.toolchains/cross-pi-gcc-10.3.0-64/bin/aarch64-linux-gnu-gcc"
rustflags = [
    "-C", "link-arg=--sysroot=/home/judge/.toolchains/cross-pi-gcc-10.3.0-64/aarch64-linux-gnu/libc"
]

This sets the default target of the project to arm-unknown-linux-gnueabihf. Running cargo build now results in the following ARM binary being created.

> file target/arm-unknown-linux-gnueabihf/debug/pi_project
target/arm-unknown-linux-gnueabihf/debug/pi_project: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.0, with debug_info, not stripped

It can now be copied to the raspberry pi and be executed.

GPIO Access

Until this point the source of the application was not touched. This changes now because just executing

// contents of src/main.rs
fn main() {
    println!("Hello, world!");
}

is boring! If we have a raspberry pi it would be much more fun to use it to control some hardware 💪. Thankfully there already is a library that we can use to do just that. rppal enables access to the GPIO pins of the pi. Including the library in the project requires declaring it as a dependency in the Cargo.toml.

[dependencies]
rppal = "0.14.0"

Now we can use the library to make an LED blink.

use std::thread;
use std::time::Duration;

use rppal::gpio::Gpio;

// Gpio uses BCM pin numbering. BCM GPIO 23 is tied to physical pin 16.
const GPIO_LED: u8 = 23;

fn main() {
    let gpio = Gpio::new().expect("Unable to access GPIO!");
    let mut pin = gpio.get(GPIO_LED).unwrap().into_output();

    loop {
        pin.toggle();
        thread::sleep(Duration::from_millis(500));
    }
}

And that’s basically it. Now we can use rust to program the raspberry pi to do any task we want. We can even get fancy and use an async runtime to execute many tasks concurrently.

I hope this summary is useful to you and feel free to contact me if you have questions or find this post useful.

Happy coding 🧑‍💻 …

The goal of quantum-chemical calculations is the simulation of materials and molecules. In density-functional theory (DFT) the first step along this line is obtaining the electron density minimising an energy functional. However, since energies and the density are usually not very tractable quantities in an experimental setup, comparison to experiment and scientific intuition also requires the computation of properties. Important properties include the forces (i.e. the energetic change due to a displacement of the structure), polarisabilities (change in dipole moment due to an external electric field) or phonon spectra (which can be measured using infrared spectroscopy). Therefore an efficient and reliable property computation is crucial to make quantum-chemical simulations interpretable and to close the loop back to experimentalists.

In DFT, property calculations are done using density-functional perturbation theory (DFPT), which essentially computes the linear response of the electronic structure to the aforementioned changes in external conditions (external field, nuclear displacements etc.). Solving the equations underlying DFPT can become numerically challenging as (especially for metallic systems) the equations are ill-conditioned.

In a collaboration with my former PostDoc advisor Benjamin Stamm and my old group at the CERMICS at École des Ponts, including Eric Cancès, Antoine Levitt and Gaspard Kemlin, we just published an article in which we provide a more mathematical take on DFPT. In our work we provide an extensive review of various practical setups employed in mainstream codes such as ABINIT and QuantumEspresso from a numerical analysis point of view, highlighting the differences and similarities of these approaches. Moreover we develop a novel approach to solve the so-called Sternheimer equations (a key component of DFPT), which makes better use of the byproducts available in standard SCF schemes (the algorithm used to obtain the DFT ground state). With our approach we show savings of up to 40% in the number of matrix-vector products required to solve the response equations. Since these are the most expensive step in DFPT, this implies a similar saving in computational cost overall. Naturally our algorithm has been implemented as the default response solver in our DFTK code, starting from version 0.5.9.

Most of this work was done during a two-month visit of Gaspard Kemlin with Benjamin and myself here in Aachen. I think I speak for the both of us when I say that it has been a great pleasure to have Gaspard around, both on a professional as well as a personal level.

The full abstract of the paper reads

Response calculations in density functional theory aim at computing the change in ground-state density induced by an external perturbation. At finite temperature these are usually performed by computing variations of orbitals, which involve the iterative solution of potentially badly-conditioned linear systems, the Sternheimer equations. Since many sets of variations of orbitals yield the same variation of density matrix this involves a choice of gauge. Taking a numerical analysis point of view we present the various gauge choices proposed in the literature in a common framework and study their stability. Beyond existing methods we propose a new approach, based on a Schur complement using extra orbitals from the self-consistent-field calculations, to improve the stability and efficiency of the iterative solution of Sternheimer equations. We show the success of this strategy on nontrivial examples of practical interest, such as Heusler transition metal alloy compounds, where savings of around 40% in the number of required cost-determining Hamiltonian applications have been achieved.

(Cross-post from our report published in the Psi-k blog)

From 20th until 24th June 2022 I co-organised a workshop on the theme of Error control in first-principles modelling at the CECAM Headquarters in Lausanne (workshop website). For one week the workshop unified like-minded researchers from a range of communities, including quantum chemistry, materials sciences, scientific computing and mathematics to jointly discuss the determination of errors in atomistic modelling. The main goal was to obtain a cross-community overview of ongoing work and to establish new links between the disciplines.

Amongst others we discussed topics such as: the determination of errors in observables, which are the result of long molecular dynamics simulations, the reliability and efficiency of numerical procedures and how to go beyond benchmarking or convergence studies via a rigorous mathematical understanding of errors. We further explored interactions with the field of uncertainty quantification to link numerical and modelling errors in electronic structure calculations or to understand error propagation in interatomic potentials via statistical inference.

Organisers

Gabor Csanyi (University of Cambridge)
Genevieve Dusson (CNRS & Université Bourgogne Franche-Comté)
Michael Herbst (RWTH Aachen University)
Youssef Marzouk (Massachusetts Institute of Technology)

Participants

A primary objective of the conference was to facilitate networking and exchange across communities. Thanks to the funds provided by CECAM and Psi-k we managed to get a crowd of 30 researchers, including about 15 junior researchers, to come to Lausanne in person. Moreover we made an effort to enable a virtual participation to the smoothest extent possible. For example we provided a conference-specific Slack space, which grew into a platform for discussion involving both in-person as well as virtual participants during the conference. In this way in total about 70 researchers from 18 countries could participate in the workshop. The full list of participants is available on the workshop website.

Workshop programme

The workshop programme was split between the afternoon sessions, in which we had introductory and topic-specific lectures, as well as the morning sessions, which were focussed on informal discussion and community brainstorming.

Afternoon lectures

Monday June 20th 2022

Uncertainty quantification for atomic-scale machine learning. (Michele Ceriotti, EPFL)
[slides] [recording]
Testing the hell out of DFT codes with virtual oxides. (Stefaan Cottenier, Ghent University)
[slides] [recording]
Prediction uncertainty validation for computational chemists. (Pascal Pernot, Université Paris-Saclay)
[slides] [recording]
Uncertainty driven active learning of interatomic potentials for molecular dynamics (Boris Kozinsky, Harvard University)
[recording]
Interatomic Potentials from First Principles (Christoph Ortner, University of British Columbia)
[slides] [recording]

Tuesday June 21st 2022

Numerical integration in the Brillouin zone (Antoine Levitt, Inria Paris)
[slides] [recording]
Sensitivity analysis for assessing and controlling errors in theoretical spectroscopy and computational biochemistry (Christoph Jacob, TU Braunschweig)
[slides]
Uncertainty quantification and propagation in multiscale materials modelling (James Kermode, University of Warwick)
[slides] [recording]
Uncertainty Quantification and Active Learning in Atomistic Computations (Habib Najm, Sandia National Labs)
Nuances in Bayesian estimation and active learning for data-driven interatomic potentials for propagation of uncertainty through molecular dynamics (Dallas Foster, MIT)
[slides] [recording]

Wednesday June 22nd 2022

The BEEF class of xc functionals (Thomas Bligaard, DTU)
[recording]
A Bayesian Approach to Uncertainty Quantification for Density Functional Theory (Kate Fisher, MIT)
[slides] [recording]
Dielectric response with short-ranged electrostatics (Stephen Cox, Cambridge)
[slides]
Fully guaranteed and computable error bounds for clusters of eigenvalues (Genevieve Dusson, CNRS)
[slides] [recording]
Practical error bounds for properties in plane-wave electronic structure calculations (Gaspard Kemlin, Ecole des Ponts)
[slides] [recording]
The transferability limits of static benchmarks (Thomas Weymuth, ETH)
[slides] [recording]

Thursday June 23rd 2022

An information-theoretic approach to uncertainty quantification in atomistic modelling of crystalline materials (Maciej Buze, Birmingham)
[slides] [recording]
Hyperactive Learning (Cas van der Oord, Cambridge)
[slides] [recording]
Benchmarking under uncertainty (Jonny Proppe, TU Braunschweig)
Model Error Estimation and Uncertainty Quantification of Machine Learning Interatomic Potentials (Khachik Sargsyan, Sandia National Labs)
[slides] [recording]
Committee neural network potentials control generalization errors and enable active learning (Christoph Schran, Cambridge)
[slides] [recording]

Morning discussion sessions

The discussion sessions were centred around broad multi-disciplinary topics to stimulate cross-fertilisation. Key topics were active learning techniques for obtaining interatomic potentials on the fly as well as opportunities to connect numerical and statistical approaches for error estimation.

A central topic of the session on Thursday morning was the development of a common cross-community language and guidelines for error estimation. This included the question of how to establish a minimal standard for error control and make the broader community aware of such techniques to ensure published results can be validated and are more reproducible. Initial ideas from this discussion are summarised in a public github repository. With this repository we invite everyone to contribute concrete examples of the error control strategies taken in their research context. In the future we hope to develop community guidelines for error control in first-principles modelling based on these initial ideas.

Feedback from participants

Overall we received mostly positive feedback about the event. Virtual participants enjoyed the opportunity to interact with in-person participants via the zoom sessions and Slack. For several in-person participants this meeting was the first physical meeting since the pandemic and the ample opportunities for informal interchange we allocated in the programme (discussion sessions, poster sessions, social dinner, boat trip excursion) have been much appreciated.

A challenge was to keep the meeting accessible for both researchers from foreign fields as well as junior participants entering this interdisciplinary field. With respect to the discussion sessions we got several suggestions for improvement in this regard. For example it has been suggested to (i) set and communicate the discussion subject well in advance to allow people to get prepared, (ii) motivate postdocs to coordinate the discussion, who would be responsible for curating material and formulating stimulating research questions and (iii) get these postdocs to start the session with an introductory presentation on open problems.

Conclusions and outlook

During the event it became apparent that the meaning associated with the term “error control” differs between communities, in particular between mathematicians and application scientists. Not only did this result in a considerable language barrier and some communication problems during the workshop, but it also made communities appear to move at different paces. At first glance this sometimes made it difficult to see the applicability of research results from another community. But the heterogeneity of participants also offered opportunities to learn from each other's viewpoint: for example during the discussion sessions we actively worked towards obtaining a joint language and cross-community standards for error control. Our initial ideas on this point are available in a public github repository, where we invite everyone to participate via opening issues and pull requests to continue the discussion.

A number of applications under Linux provide a “Browse Files” button that is intended to pull up a file manager in a specific directory. While this is convenient for most users, some might want a little more flexibility, so let’s hook up a terminal emulator to that button instead of a file manager.

First, we need a command that starts a terminal emulator in a specific directory, in my case this will be

foot -D <path to directory>

which will start foot in the specified <path to directory>.

As this button is implemented leveraging the XDG MIME Applications specification, we now need to define a new desktop entry, let’s call it TermFM.desktop, which we place under either ~/.local/share/applications or /usr/local/share/applications, depending on preference. The file using a foot terminal should read

[Desktop Entry]
Type=Application
Name=TermFM
Exec=foot -D %U
MimeType=inode/directory;

where %U will be the placeholder for the path that is handed over by the calling application. The MimeType line is optional, but given that the above terminal command only works for directories anyways, it doesn’t hurt to constrain this desktop file to this file type only.

Afterwards, we need to configure this as the default application for the file type inode/directory, which we do by adding

inode/directory=TermFM.desktop

to the [Default Applications] section in ~/.config/mimeapps.list. Should this file not yet exist, you can create it to contain

[Default Applications]
inode/directory=TermFM.desktop

Once that is done, you should from now on get your terminal at the corresponding location when you click “Browse Files” in an application supporting this.

Let’s say you want to implement a sorting function in Go. Or perhaps a data structure like a binary search tree, providing ordered access to its elements. Because you want your code to be re-usable and type safe, you want to use type parameters. So you need a way to order user-provided types.

There are multiple methods of doing that, with different trade-offs. Let’s talk about four in particular here:

constraints.Ordered
A method constraint
Taking a comparison function
Comparator types

`constraints.Ordered`

Go 1.18 has a mechanism to constrain a type parameter to all types which have the < operator defined on them. The types which have this operator are exactly all types whose underlying type is string or one of the predeclared integer and float types. So we can write a type set expressing that:

type Integer interface {
  ~int | ~int8 | ~int16 | ~int32 | ~int64 | ~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr
}

type Float interface {
  ~float32 | ~float64
}

type Ordered interface {
  Integer | Float | ~string
}

Because that’s a fairly common thing to want to do, there is already a package which contains these kinds of type sets (golang.org/x/exp/constraints).

With this, you can write the signature of your sorting function or the definition of your search tree as:

func Sort[T constraints.Ordered](s []T) {
  // …
}

type SearchTree[T constraints.Ordered] struct {
  // …
}

The main advantage of this is that it works directly with predeclared types and simple types like time.Duration. It also is very clear.

The main disadvantage is that it does not allow composite types like structs. And what if a user wants a different sorting order than the one implied by <? For example if they want to reverse the order or want specialized string collation. A multimedia library might want to sort “The Expanse” under E. And some letters sort differently depending on the language setting.

constraints.Ordered is simple, but it also is inflexible.
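To make the trade-off concrete, here is a minimal sketch of a sorting function constrained this way. The Ordered constraint is re-declared locally so the example only needs the standard library, insertion sort is an arbitrary choice for brevity, and Celsius is a hypothetical defined type used for illustration.

```go
package main

import "fmt"

// Ordered mirrors the type set shown above. It is declared locally so the
// sketch does not depend on golang.org/x/exp/constraints.
type Ordered interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 |
		~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr |
		~float32 | ~float64 | ~string
}

// Sort sorts s in place with insertion sort, using the built-in < operator.
func Sort[T Ordered](s []T) {
	for i := 1; i < len(s); i++ {
		for j := i; j > 0 && s[j] < s[j-1]; j-- {
			s[j], s[j-1] = s[j-1], s[j]
		}
	}
}

// Celsius is a hypothetical defined type. Its underlying type is float64,
// so it satisfies Ordered through the ~float64 term.
type Celsius float64

func main() {
	ints := []int{42, 23, 1337}
	Sort(ints)
	fmt.Println(ints) // [23 42 1337]

	temps := []Celsius{21.5, -3, 8}
	Sort(temps)
	fmt.Println(temps) // [-3 8 21.5]
}
```

A struct type, or a request for descending order, has no way into this function without changing its signature.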

Method constraints

We can use method constraints to allow more flexibility. This allows a user to implement whatever sorting order they want as a method on their type.

We can write that constraint like this:

type Lesser[T any] interface {
  // Less reports whether the receiver is less than v.
  Less(v T) bool
}

The type parameter is necessary because we have to refer to the receiver type itself in the Less method. This is hopefully clearer when we look at how this is used:

func Sort[T Lesser[T]](s []T) {
  // …
}

type SearchTree[T Lesser[T]] struct {
  // …
}

This allows the user of our library to customize the sorting order by defining a new type with a Less method:

type ReverseInt int

func (i ReverseInt) Less(j ReverseInt) bool {
  return j < i // order is reversed
}

The disadvantage of this is that it requires some boilerplate on the part of your user. Using a custom sorting order always requires defining a type with a method.

They can’t use your code with predeclared types like int or string but always have to wrap them into a new type.

The same applies if a type already has a natural comparison method that is not called Less. For example time.Time is naturally sorted by time.Time.Before. For cases like that there needs to be a wrapper to rename the method.

Whenever one of these wrappings happens your user might have to convert back and forth when passing data to or from your code.

It also is a little bit more confusing than constraints.Ordered, as your user has to understand the purpose of the extra type parameter on Lesser.
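The wrapping cost described above is easiest to see in a runnable sketch. The Sort body (insertion sort) and the ByTime wrapper are assumptions made for illustration. Only Lesser and ReverseInt come from the text.

```go
package main

import (
	"fmt"
	"time"
)

// Lesser is the method constraint from above.
type Lesser[T any] interface {
	Less(v T) bool
}

// Sort sorts s in place with insertion sort, using the Less method.
func Sort[T Lesser[T]](s []T) {
	for i := 1; i < len(s); i++ {
		for j := i; j > 0 && s[j].Less(s[j-1]); j-- {
			s[j], s[j-1] = s[j-1], s[j]
		}
	}
}

// ReverseInt orders integers from largest to smallest.
type ReverseInt int

func (i ReverseInt) Less(j ReverseInt) bool { return j < i }

// ByTime is a hypothetical wrapper whose only job is to expose
// time.Time.Before under the name Less.
type ByTime struct{ time.Time }

func (a ByTime) Less(b ByTime) bool { return a.Before(b.Time) }

func main() {
	raw := []int{42, 23, 1337}

	// Plain ints have no Less method, so every element is converted first.
	wrapped := make([]ReverseInt, len(raw))
	for i, v := range raw {
		wrapped[i] = ReverseInt(v)
	}
	Sort(wrapped)
	fmt.Println(wrapped) // [1337 42 23]

	now := time.Now()
	times := []ByTime{{Time: now.Add(time.Hour)}, {Time: now}}
	Sort(times)
	fmt.Println(times[0].Equal(now)) // true
}
```

The conversion loop in main is exactly the back-and-forth that callers have to write whenever their data lives in a []int or []time.Time.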

Passing a comparison function

A simple way to get flexibility is to have the user pass us a function used for comparison directly:

func Sort[T any](s []T, less func(T, T) bool) {
  // …
}

type SearchTree[T any] struct {
  Less func(T, T) bool
  // …
}

func NewSearchTree[T any](less func(T, T) bool) *SearchTree[T] {
  // …
  return &SearchTree[T]{
    Less: less,
    // …
  }
}

This essentially abandons the idea of type constraints altogether. Our code works with any type and we directly pass around the custom behavior as funcs. Type parameters are only used to ensure that the arguments to those funcs are compatible.

The advantage of this is maximum flexibility. Any type which already has a Less method like above can simply be used with this directly by using method expressions. Regardless of how the method is actually named:

func main() {
  a := []time.Time{ /* … */ }
  Sort(a, time.Time.Before)
}

There is also no boilerplate needed to customize sorting behavior:

func main() {
  a := []int{42,23,1337}
  Sort(a, func(i, j int) bool {
    return j < i // reversed order
  })
}

And you can provide helpers for common customizations:

func Reversed[T any](less func(T, T) bool) (greater func(T, T) bool) {
  return func(a, b T) bool { return less(b, a) }
}

This approach is arguably also more correct than the one above because it decouples the type from the comparison used. If I use a SearchTree as a set datatype, there is no real reason why the elements in the set would be specific to the comparison used. It should be “a set of string” not “a set of MyCustomlyOrderedString”. This reflects the fact that with the method constraint, we have to convert back-and-forth when putting things into the container or taking it out again.

The main disadvantage of this approach is that it means you can not have useful zero values. Your SearchTree type needs the Less field to be populated to work. So its zero value can not be used to represent an empty set.

You cannot even lazily initialize it (which is a common trick to make types which need initialization have a useful zero value) because you don’t know what it should be.
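The zero value problem can be demonstrated with a small sketch. The tree below is a hypothetical, unbalanced implementation written only for this illustration. Insert, Walk and the insertTwicePanics helper are assumed names, not part of any existing package.

```go
package main

import "fmt"

// SearchTree is a minimal unbalanced binary search tree that stores the
// comparison function it was constructed with.
type SearchTree[T any] struct {
	less func(T, T) bool
	root *node[T]
}

type node[T any] struct {
	val         T
	left, right *node[T]
}

// NewSearchTree returns an empty tree ordered by less.
func NewSearchTree[T any](less func(T, T) bool) *SearchTree[T] {
	return &SearchTree[T]{less: less}
}

// Insert adds v to the tree. It needs the less field as soon as the tree
// is non-empty.
func (t *SearchTree[T]) Insert(v T) {
	p := &t.root
	for *p != nil {
		if t.less(v, (*p).val) {
			p = &(*p).left
		} else {
			p = &(*p).right
		}
	}
	*p = &node[T]{val: v}
}

// Walk calls f for every element in ascending order.
func (t *SearchTree[T]) Walk(f func(T)) { walk(t.root, f) }

func walk[T any](n *node[T], f func(T)) {
	if n == nil {
		return
	}
	walk(n.left, f)
	f(n.val)
	walk(n.right, f)
}

// insertTwicePanics reports whether inserting two values into tree panics.
func insertTwicePanics(tree *SearchTree[int]) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	tree.Insert(1)
	tree.Insert(2) // the second insert has to compare, so it calls less
	return false
}

func main() {
	tree := NewSearchTree(func(a, b int) bool { return a < b })
	for _, v := range []int{42, 23, 1337} {
		tree.Insert(v)
	}
	var sorted []int
	tree.Walk(func(v int) { sorted = append(sorted, v) })
	fmt.Println(sorted) // [23 42 1337]

	// The zero value has a nil less func and fails on the first comparison.
	var zero SearchTree[int]
	fmt.Println(insertTwicePanics(&zero)) // true
}
```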

Comparator types

There is a way to pass a function “statically”. That is, instead of passing around a func value, we can pass it as a type argument. The way to do that is to attach it as a method to a struct{} type:

import "golang.org/x/exp/slices"

type IntComparator struct{}

func (IntComparator) Less(a, b int) bool {
  return a < b
}

func main() {
  a := []int{42,23,1337}
  less := IntComparator{}.Less // has type func(int, int) bool
  slices.SortFunc(a, less)
}

Based on this, we can devise a mechanism to allow custom comparisons:

// Comparator is a helper type used to compare two T values.
type Comparator[T any] interface {
  ~struct{}
  Less(a, b T) bool
}

func Sort[C Comparator[T], T any](a []T) {
  var c C
  less := c.Less // has type func(T, T) bool
  // …
}

type SearchTree[C Comparator[T], T any] struct {
  // …
}

The ~struct{} constrains any implementation of Comparator[T] to have underlying type struct{}. It is not strictly necessary, but it serves two purposes here:

It makes clear that Comparator[T] itself is not supposed to carry any state. It only exists to have its method called.
It ensures (as much as possible) that the zero value of C is safe to use. Without it, Comparator[T] itself would be a normal interface type and could be used as the type argument for C: it has a Less method of the right type, so it would implement itself. But a zero Comparator[T] is nil and would always panic if its method is called.

An implication of this is that it is not possible to have a Comparator[T] which uses an arbitrary func value. The Less method can not rely on having access to a func to call, for this approach to work.

But you can provide other helpers. This can also be used to combine this approach with the above ones:

type LessOperator[T constraints.Ordered] struct{}

func (LessOperator[T]) Less(a, b T) bool {
  return a < b
}

type LessMethod[T Lesser[T]] struct{}

func (LessMethod[T]) Less(a, b T) bool {
  return a.Less(b)
}

type Reversed[C Comparator[T], T any] struct{}

func (Reversed[C, T]) Less(a, b T) bool {
  var c C
  return c.Less(b, a)
}

The advantage of this approach is that it makes the zero value of SearchTree[C, T] useful. For example, a SearchTree[LessOperator[int], int] can be used directly, without extra initialization.

It also carries over the advantage of decoupling the comparison from the element type, which we got from accepting comparison functions.

One disadvantage is that the comparator can never be inferred. It always has to be specified in the instantiation explicitly¹. That’s similar to how we always had to pass a less function explicitly above.

Another disadvantage is that this always requires defining a type for comparisons. Where with the comparison function we could define customizations (like reversing the order) inline with a func literal, this mechanism always requires a method.

Lastly, this is arguably too clever for its own good. Understanding the purpose and idea behind the Comparator type is likely to trip up your users when reading the documentation.
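Putting the pieces of this section together gives the following sketch. Ordered is a trimmed-down local stand-in for constraints.Ordered, the Sort body is a plain insertion sort, and Set is a hypothetical sorted-slice container added only to show a zero value that is ready to use.

```go
package main

import "fmt"

// Ordered is a trimmed-down local stand-in for constraints.Ordered.
type Ordered interface {
	~int | ~int64 | ~float64 | ~string
}

// Comparator is a stateless helper type used to compare two T values.
type Comparator[T any] interface {
	~struct{}
	Less(a, b T) bool
}

// LessOperator compares with the built-in < operator.
type LessOperator[T Ordered] struct{}

func (LessOperator[T]) Less(a, b T) bool { return a < b }

// Reversed inverts the order defined by C.
type Reversed[C Comparator[T], T any] struct{}

func (Reversed[C, T]) Less(a, b T) bool {
	var c C
	return c.Less(b, a)
}

// Sort sorts s in place with insertion sort, using the comparator C.
func Sort[C Comparator[T], T any](s []T) {
	var c C
	for i := 1; i < len(s); i++ {
		for j := i; j > 0 && c.Less(s[j], s[j-1]); j-- {
			s[j], s[j-1] = s[j-1], s[j]
		}
	}
}

// Set keeps its elements in a sorted slice without duplicates.
// Its zero value is an empty set.
type Set[C Comparator[T], T any] struct {
	elems []T
}

// Add inserts v at its sorted position unless it is already present.
func (s *Set[C, T]) Add(v T) {
	var c C
	i := 0
	for i < len(s.elems) && c.Less(s.elems[i], v) {
		i++
	}
	if i < len(s.elems) && !c.Less(v, s.elems[i]) {
		return // neither is less than the other: already present
	}
	s.elems = append(s.elems, v)
	copy(s.elems[i+1:], s.elems[i:])
	s.elems[i] = v
}

func main() {
	a := []int{42, 23, 1337}
	Sort[LessOperator[int]](a) // C is explicit, T is inferred from a
	fmt.Println(a)             // [23 42 1337]

	Sort[Reversed[LessOperator[int], int]](a)
	fmt.Println(a) // [1337 42 23]

	var set Set[LessOperator[string], string] // no constructor needed
	set.Add("b")
	set.Add("a")
	set.Add("b")
	fmt.Println(set.elems) // [a b]
}
```

The instantiation Sort[Reversed[LessOperator[int], int]] also shows the readability cost: the ordering moves out of ordinary code and into a nested type expression.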

Summary

We are left with these trade-offs:

	`constraints.Ordered`	`Lesser[T]`	`func(T,T) bool`	`Comparator[T]`
Predeclared types	👍	👎	👎	👎
Composite types	👎	👍	👍	👍
Custom order	👎	👍	👍	👍
Reversal helpers	👍	👎	👍	👍
Type boilerplate	👍	👎	👍	👎
Useful zero value	👍	👍	👎	👍
Type inference	👍	👍	👍	👎
Coupled Type/Order	👎	👎	👍	👍
Clarity	👍	🤷²	👍	👎

One thing standing out in this table is that there is no way to both support predeclared types and support user defined types.

It would be great if there was a way to support multiple of these mechanisms using the same code. That is, it would be great if we could write something like

// Ordered is a constraint to allow a type to be sorted.
// If a Less method is present, it takes precedence.
type Ordered[T any] interface {
  constraints.Ordered | Lesser[T]
}

Unfortunately, allowing this is harder than one might think.

Until then, you might want to provide multiple APIs to allow your users more flexibility. The standard library currently seems to be converging on providing a constraints.Ordered version and a comparison function version. The latter gets a Func suffix to the name. See the experimental slices package for an example.
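A minimal sketch of such a pair of APIs might look as follows. Sort and SortFunc here are local stand-ins that follow the naming convention described above. They are not the functions from the slices package, whose exact signatures may differ, and Ordered is again a trimmed-down local constraint.

```go
package main

import (
	"fmt"
	"strings"
)

// Ordered is a trimmed-down local stand-in for constraints.Ordered.
type Ordered interface {
	~int | ~int64 | ~float64 | ~string
}

// SortFunc sorts s in place with insertion sort, using the given less func.
func SortFunc[T any](s []T, less func(a, b T) bool) {
	for i := 1; i < len(s); i++ {
		for j := i; j > 0 && less(s[j], s[j-1]); j-- {
			s[j], s[j-1] = s[j-1], s[j]
		}
	}
}

// Sort is the convenience version for types that support <.
// It delegates to SortFunc so there is only one implementation to maintain.
func Sort[T Ordered](s []T) {
	SortFunc(s, func(a, b T) bool { return a < b })
}

func main() {
	words := []string{"banana", "Cherry", "apple"}

	Sort(words) // byte-wise order: upper case sorts before lower case
	fmt.Println(words) // [Cherry apple banana]

	SortFunc(words, func(a, b string) bool {
		return strings.ToLower(a) < strings.ToLower(b)
	})
	fmt.Println(words) // [apple banana Cherry]
}
```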

¹ Though as we put the Comparator[T] type parameter first, we can infer T from the Comparator.
² It’s a little bit worse, but probably fine.

Go 1.18 added the biggest and probably one of the most requested features of all time to the language: Generics. If you want a comprehensive introduction to the topic, there are many out there and I would personally recommend this talk I gave at the Frankfurt Gopher Meetup.

This blog post is not an introduction to generics, though. It is about this sentence from the spec:

Implementation restriction: A compiler need not report an error if an operand’s type is a type parameter with an empty type set.

As an example, consider this interface:

type C interface {
  int
  M()
}

This constraint can never be satisfied. It says that a type has to be both the predeclared type int and have a method M(). But predeclared types in Go do not have any methods. So there is no type satisfying C and its type set is empty. The compiler accepts it just fine, though. That is what this clause from the spec is about.

This decision might seem strange to you. After all, if a type set is empty, it would be very helpful to report that to the user. They obviously made a mistake - an empty type set can never be used as a constraint. A function using it could never be instantiated.

I want to explain why that sentence is there and also go into a couple of related design decisions of the generics design. I’m trying to be expansive in my explanation, which means that you should not need any special knowledge to understand it. It also means, some of the information might be boring to you - feel free to skip the corresponding sections.

That sentence is in the Go spec because it turns out to be hard to determine if a type set is empty. Hard enough, that the Go team did not want to require an implementation to solve that. Let’s see why.

P vs. NP

When we talk about whether or not a problem is hard, we often group problems into two big classes:

Problems which can be solved reasonably efficiently. This class is called P.
Problems which can be verified reasonably efficiently. This class is called NP.

The first obvious follow up question is “what does ‘reasonably efficient’ mean?”. The answer to that is “there is an algorithm with a running time polynomial in its input size”¹.

The second obvious follow up question is “what’s the difference between ‘solving’ and ‘verifying’?”.

Solving a problem means what you think it means: Finding a solution. If I give you a number and ask you to solve the factorization problem, I’m asking you to find a (non-trivial) factor of that number.

Verifying a problem means that I give you a solution and I’m asking you if the solution is correct. For the factorization problem, I’d give you two numbers and ask you to verify that the second is a factor of the first.

These two things are often very different in difficulty. If I ask you to give me a factor of 297863737, you probably know no better way than to sit down and try to divide it by a lot of numbers and see if it comes out evenly. But if I ask you to verify that 9883 is a factor of that number, you just have to do a bit of long division and it either divides it, or it does not.
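
To make the difference concrete, here is a minimal sketch in Go. The function names IsFactor and FindFactor are my own, chosen for illustration. Verifying takes a single modulo operation, while the naive solver below has to try candidate divisors one by one.

```go
package main

import "fmt"

// IsFactor verifies a proposed solution: d must be a non-trivial
// divisor of n. This is a single modulo operation.
func IsFactor(n, d uint64) bool {
	return d > 1 && d < n && n%d == 0
}

// FindFactor solves the problem by trial division. It returns a
// non-trivial factor of n, or 0 if n has none.
func FindFactor(n uint64) uint64 {
	for d := uint64(2); d*d <= n; d++ {
		if n%d == 0 {
			return d
		}
	}
	return 0
}

func main() {
	const n = 297863737
	fmt.Println(IsFactor(n, 9883)) // verifying: one division
	fmt.Println(FindFactor(n))     // solving: up to sqrt(n) divisions
}
```

Trial division is only one possible solver, and better factoring algorithms exist. None of the known ones runs in time polynomial in the number of digits, though.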

It turns out that every problem which is efficiently solvable is also efficiently verifiable. You can just calculate the solution and compare it to the given one. So every problem in P is also in NP². But it is a famously open question whether the opposite is true - that is, we don’t really know if there are problems which are hard to solve but easy to verify.

This is hard to know in general. Because us not having found an efficient algorithm to solve a problem does not mean there is none. But in practice we usually assume that there are some problems like that.

One fact that helps us talk about hard problems, is that there are some problems which are as hard as possible in NP. That means we were able to prove that if you can solve one of these problems you can use that to solve any other problem in NP. These problems are called “NP-complete”.

That is, to be frank, plain magic and explaining it is far beyond my capabilities. But it helps us to tell if a given problem is hard, by doing it the other way around. If solving problem X would enable us to solve one of these NP-complete problems, then problem X is at least as hard as every problem in NP (and, if X is itself in NP, it is NP-complete) and therefore probably very hard. This is called a “proof by reduction”.

One example of such a problem is boolean satisfiability. And it is used very often to prove a problem is hard.

SAT

Imagine I give you a boolean function. The function has a bunch of bool arguments and returns bool, by joining its arguments with logical operators into a single expression. For example:

func F(x, y, z bool) bool {
  return ((!x && y) || z) && (x || !y)
}

If I give you values for these arguments, you can efficiently tell me if the formula evaluates to true or false. You just substitute them in and evaluate every operator. For example

F(true, true, false)
  → ((!true && true) || false) && (true || !true)
  → ((false && true) || false) && (true || !true)
  → ((false && true) || false) && (true || false)
  → ((false && true) || false) && true
  →  (false && true) || false
  →   false && true
  →   false

This takes at most one step per operator in the expression. So it takes a linear number of steps in the length of the input, which is very efficient.

But if I only give you the function and ask you to find arguments which make it return true - or even to find out whether such arguments exist - you probably have to try out all possible input combinations to see if any of them does. That’s easy for three arguments. But for $n$ arguments there are $2^n$ possible assignments, so it takes exponential time in the number of arguments.
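
As a sketch of what “try out all possible input combinations” means, the following Go program enumerates every assignment for the example function F. BruteForce is a name I made up for illustration. Each assignment is encoded as the bits of a counter.

```go
package main

import "fmt"

// F is the example formula from the text.
func F(x, y, z bool) bool {
	return ((!x && y) || z) && (x || !y)
}

// BruteForce tries all 2^n assignments of n boolean arguments and
// returns the first one for which f is true, if there is any.
func BruteForce(n int, f func(args []bool) bool) ([]bool, bool) {
	args := make([]bool, n)
	for mask := 0; mask < 1<<n; mask++ {
		// Bit i of mask is the value of argument i.
		for i := range args {
			args[i] = mask&(1<<i) != 0
		}
		if f(args) {
			return args, true
		}
	}
	return nil, false
}

func main() {
	args, ok := BruteForce(3, func(a []bool) bool { return F(a[0], a[1], a[2]) })
	fmt.Println(args, ok)
}
```

Every call to f is cheap, but the outer loop runs up to $2^n$ times. That loop is where the exponential running time comes from.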

The problem of finding arguments that make such a function return true (or proving that no such arguments exist) is called “boolean satisfiability” and it is NP-complete.

It is extremely important in what form the expression is given, though. Some forms make it pretty easy to solve, while others make it hard.

For example, every expression can be rewritten into what is called a “Disjunctive Normal Form” (DNF). It is called that because it consists of a series of conjunction (&&) terms, joined together by disjunction (||) operators³:

func F_DNF(x, y, z bool) bool {
  return (x && z) || (!y && z)
}

(You can verify that this is the same function as above, by trying out all 8 input combinations)

Each term has a subset of the arguments, possibly negated, joined by &&. The terms are then joined together using ||.

Solving the satisfiability problem for an expression in DNF is easy:

Go through the individual terms. || is true if and only if either of its operands is true. So for each term:
- If it contains both an argument and its negation (x && !x) it can never be true. Continue to the next term.
- Otherwise, you can infer valid arguments from the term:
  - If it contains x, then we must pass true for x
  - If it contains !x, then we must pass false for x
  - If it contains neither, then what we pass for x does not matter and either value works.
- The term then evaluates to true with these arguments, so the entire expression does.
If none of the terms can be made true, the function can never return true and there is no valid set of arguments.
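
The steps above can be sketched in Go as follows. The types Literal, Term and DNF are a hypothetical representation I chose for this example. The point is that each term is looked at once and on its own.

```go
package main

import "fmt"

// Literal is a variable, possibly negated.
type Literal struct {
	Var string
	Neg bool
}

// Term is a conjunction (&&) of literals.
// DNF is a disjunction (||) of terms.
type (
	Term []Literal
	DNF  []Term
)

// SolveDNF returns arguments making the formula true, if any exist.
// Variables missing from the returned map may take either value.
func SolveDNF(f DNF) (map[string]bool, bool) {
	for _, term := range f {
		if args, ok := solveTerm(term); ok {
			return args, true
		}
	}
	return nil, false
}

// solveTerm fails only if the term contains both x and !x.
func solveTerm(t Term) (map[string]bool, bool) {
	args := make(map[string]bool)
	for _, l := range t {
		want := !l.Neg
		if got, seen := args[l.Var]; seen && got != want {
			return nil, false
		}
		args[l.Var] = want
	}
	return args, true
}

func main() {
	// (x && z) || (!y && z)
	f := DNF{
		{{Var: "x"}, {Var: "z"}},
		{{Var: "y", Neg: true}, {Var: "z"}},
	}
	fmt.Println(SolveDNF(f))
}
```

Every literal is visited at most once, so the running time is linear in the size of the formula.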

On the other hand, there is also a “Conjunctive Normal Form” (CNF). Here, the expression is a series of disjunction (||) terms, joined together with conjunction (&&) operators:

func F_CNF(x, y, z bool) bool {
  return (!x || z) && (y || z) && (x || !y)
}

(Again, you can verify that this is the same function)
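
If you do not want to check the 8 combinations by hand, a small program can do it. This is only a sketch; Equivalent is a helper name I introduced for it.

```go
package main

import "fmt"

func F(x, y, z bool) bool     { return ((!x && y) || z) && (x || !y) }
func F_DNF(x, y, z bool) bool { return (x && z) || (!y && z) }
func F_CNF(x, y, z bool) bool { return (!x || z) && (y || z) && (x || !y) }

// Equivalent reports whether f and g agree on all 8 possible inputs.
func Equivalent(f, g func(x, y, z bool) bool) bool {
	bools := []bool{false, true}
	for _, x := range bools {
		for _, y := range bools {
			for _, z := range bools {
				if f(x, y, z) != g(x, y, z) {
					return false
				}
			}
		}
	}
	return true
}

func main() {
	fmt.Println(Equivalent(F, F_DNF), Equivalent(F, F_CNF)) // prints: true true
}
```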

For this, the idea of our algorithm does not work. To find a solution, you have to take all terms into account simultaneously. You can’t just tackle them one by one. In fact, solving satisfiability on CNF (often abbreviated as “CNFSAT”) is NP-complete⁴.

It turns out that every boolean function can be written as a single expression using only ||, && and !. In particular, every boolean function has a DNF and a CNF.

Very often, when we want to prove a problem is hard, we do so by reducing CNFSAT to it. That’s what we will do for the problem of calculating type sets. But there is one more preamble we need.

Sets and Satisfiability

There is an important relationship between sets and boolean functions.

Say we have a type T and a Universe which contains all possible values of T. If we have a func(T) bool, we can create a set from that, by looking at all objects for which the function returns true:

var Universe Set[T]

func MakeSet(f func(T) bool) Set[T] {
  s := make(Set[T])
  for v := range Universe {
    if f(v) {
      s.Add(v)
    }
  }
  return s
}

This set contains exactly all elements for which f is true. So calculating f(v) is equivalent to checking s.Contains(v). And checking if s is empty is equivalent to checking if f can ever return true.

We can also go the other way around:

func MakeFunc(s Set[T]) func(T) bool {
  return func(v T) bool {
    return s.Contains(v)
  }
}

So in a sense func(T) bool and Set[T] are “the same thing”. We can transform a question about one into a question about the other and back.
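
The snippets above leave Set and T undefined. Here is one way to make them runnable, as a sketch under my own assumptions: Set is a map-based generic type and the universe is passed as an explicit argument instead of being a global variable.

```go
package main

import "fmt"

// Set is a minimal set type; the text leaves its definition open.
type Set[T comparable] map[T]struct{}

func (s Set[T]) Add(v T)           { s[v] = struct{}{} }
func (s Set[T]) Contains(v T) bool { _, ok := s[v]; return ok }

// MakeSet collects every element of universe for which f is true.
func MakeSet[T comparable](universe Set[T], f func(T) bool) Set[T] {
	s := make(Set[T])
	for v := range universe {
		if f(v) {
			s.Add(v)
		}
	}
	return s
}

// MakeFunc turns a set back into its membership function.
func MakeFunc[T comparable](s Set[T]) func(T) bool {
	return s.Contains
}

func main() {
	universe := Set[int]{0: {}, 1: {}, 2: {}, 3: {}, 4: {}, 5: {}}
	even := func(v int) bool { return v%2 == 0 }
	never := func(v int) bool { return v > 2 && v < 3 }

	fmt.Println(len(MakeSet(universe, even)))  // size of {0, 2, 4}
	fmt.Println(len(MakeSet(universe, never))) // empty: never is unsatisfiable
	fmt.Println(MakeFunc(MakeSet(universe, even))(4))
}
```

Note that MakeSet only works for a finite universe that we can iterate over. The equivalence itself is a mathematical statement and does not depend on that.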

As we observed above it is important how a boolean function is given. To take that into account we have to also convert boolean operators into set operations:

// Union(s, t) contains all elements which are in s *or* in t.
func Union(s, t Set[T]) Set[T] {
  return MakeSet(func(v T) bool {
    return s.Contains(v) || t.Contains(v)
  })
}

// Intersect(s, t) contains all elements which are in s *and* in t.
func Intersect(s, t Set[T]) Set[T] {
  return MakeSet(func(v T) bool {
    return s.Contains(v) && t.Contains(v)
  })
}

// Complement(s) contains all elements which are *not* in s.
func Complement(s Set[T]) Set[T] {
  return MakeSet(func(v T) bool {
    return !s.Contains(v)
  })
}

And back:

// Or creates a function which returns if f or g is true.
func Or(f, g func(T) bool) func(T) bool {
  return MakeFunc(Union(MakeSet(f), MakeSet(g)))
}

// And creates a function which returns if f and g are true.
func And(f, g func(T) bool) func(T) bool {
  return MakeFunc(Intersect(MakeSet(f), MakeSet(g)))
}

// Not creates a function which returns if f is false
func Not(f func(T) bool) func(T) bool {
  return MakeFunc(Complement(MakeSet(f)))
}

The takeaway from all of this is that constructing a set using Union, Intersect and Complement is really the same as writing a boolean function using ||, && and !.

And proving that a set constructed in this way is empty is the same as proving that a corresponding boolean function is never true.

And because checking that a boolean function is never true is NP-complete, so is checking if one of the sets constructed like this is empty.

With this, let us look at the specific sets we are interested in.

Basic interfaces as type sets

Interfaces in Go are used to describe sets of types. For example, the interface

type S interface {
    X()
    Y()
    Z()
}

is “the set of all types which have a method X() and a method Y() and a method Z()”.

We can also express set intersection, using interface embedding:

type S interface { X() }
type T interface { Y() }
type U interface {
    S
    T
}

This expresses the intersection of S and T as an interface. Or we can view the property “has a method X()” as a boolean variable and think of this as the formula x && y.

Surprisingly, there is also a limited form of negation. It happens implicitly, because a type can not have two different methods with the same name. Implicitly, if a type has a method X() it does not have a method X() int for example:

type X interface { X() }
type NotX interface{ X() int }

There is a small snag: A type can have neither a method X() nor have a method X() int. That’s why our negation operator is limited. Real boolean variables are always either true or false, whereas our negation also allows them to be neither. In mathematics we say that this logic language lacks the law of the excluded middle (also called “Tertium Non Datur” - “there is no third”). For this section, that does not matter. But we have to worry about it later.

Because we have intersection and negation, we can express interfaces which could never be satisfied by any type (i.e. which describe an empty type set):

interface{ X; NotX }

The compiler rejects such interfaces. But how can it do that? Did we not say above that checking if a set is empty is NP-complete?

The reason this works is that we only have negation and conjunction (&&). So all the boolean expressions we can build with this language have the form

x && y && !z

These expressions are in DNF! We have a term, which contains a couple of variables - possibly negated - and joins them together using &&. We don’t have ||, so there is only a single term.

Solving satisfiability in DNF is easy, as we said. So with the language as we have described it so far, we can only express type sets which are easy to check for emptiness.
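
As a sketch of why this case is easy: with only methods and embedding, an interface is a single && term, so a conflict check needs one pass over the methods. The Method type below is a simplified model I made up, with signatures as plain strings. It is not how the Go compiler represents methods.

```go
package main

import "fmt"

// Method models a method by name and signature.
type Method struct {
	Name string
	Sig  string
}

// EmptyTypeSet reports whether the methods, taken together as one
// interface, conflict: the same name with two different signatures.
func EmptyTypeSet(methods []Method) bool {
	seen := make(map[string]string)
	for _, m := range methods {
		if sig, ok := seen[m.Name]; ok && sig != m.Sig {
			return true
		}
		seen[m.Name] = m.Sig
	}
	return false
}

func main() {
	x := Method{Name: "X", Sig: "func()"}
	notX := Method{Name: "X", Sig: "func() int"}
	y := Method{Name: "Y", Sig: "func()"}

	fmt.Println(EmptyTypeSet([]Method{x, y}))    // x && y
	fmt.Println(EmptyTypeSet([]Method{x, notX})) // x && !x
}
```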

Adding unions

Go 1.18 extends the interface syntax. For our purposes, the important addition is the | operator:

type S interface{
    A | B
}

This represents the set of all types which are in the union of the type sets A and B - that is, it is the set of all types which are in A or in B (or both).

This means our language of expressible formulas now also includes a ||-operator - we have added set unions and set unions are equivalent to || in the language of formulas. What’s more, the form of our formula is now a conjunctive normal form - every line is a term of || and the lines are connected by &&:

type X interface { X() }
type NotX interface{ X() int }
type Y interface { Y() }
type NotY interface{ Y() int }
type Z interface { Z() }
type NotZ interface{ Z() int }

// (!x || z) && (y || z) && (x || !y)
type S interface {
    NotX | Z
    Y | Z
    X | NotY
}

This is not quite enough to prove NP-completeness though, because of the snag above. If we want to prove that it is easy, it does not matter that a type can have neither method. But if we want to prove that it is hard, we really need an exact equivalence between boolean functions and type sets. So we need to guarantee that a type has one of our two contradictory methods.

“Luckily”, the | operator gives us a way to fix that:

type TertiumNonDatur interface {
    X | NotX
    Y | NotY
    Z | NotZ
}

// (!x || z) && (y || z) && (x || !y)
type S interface {
    TertiumNonDatur

    NotX | Z
    Y | Z
    X | NotY
}

Now any type which could possibly implement S must have either an X() or an X() int method, because it must implement TertiumNonDatur as well. So this extra interface helps us to get the law of the excluded middle into our language of type sets.

With this, checking if a type set is empty is in general as hard as checking if an arbitrary boolean formula in CNF has no solution. As described above, that is NP-complete.
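
The construction is mechanical, which is what makes it a reduction. The following sketch generates the interface declarations for an arbitrary CNF formula. Lit, UnionLines and Render are hypothetical names of my own, and variables are assumed to be given as exported identifiers such as "X".

```go
package main

import (
	"fmt"
	"strings"
)

// Lit is a literal of the formula: a variable, optionally negated.
type Lit struct {
	Var string
	Neg bool
}

// name returns the interface standing in for the literal: X or NotX.
func (l Lit) name() string {
	if l.Neg {
		return "Not" + l.Var
	}
	return l.Var
}

// UnionLines returns one union per clause, e.g. "NotX | Z".
func UnionLines(cnf [][]Lit) []string {
	var lines []string
	for _, clause := range cnf {
		names := make([]string, len(clause))
		for i, l := range clause {
			names[i] = l.name()
		}
		lines = append(lines, strings.Join(names, " | "))
	}
	return lines
}

// Render writes the interface declarations for the given variables
// and formula, following the construction in the text.
func Render(vars []string, cnf [][]Lit) string {
	var b strings.Builder
	for _, v := range vars {
		fmt.Fprintf(&b, "type %s interface { %s() }\n", v, v)
		fmt.Fprintf(&b, "type Not%s interface { %s() int }\n", v, v)
	}
	b.WriteString("type TertiumNonDatur interface {\n")
	for _, v := range vars {
		fmt.Fprintf(&b, "\t%s | Not%s\n", v, v)
	}
	b.WriteString("}\ntype S interface {\n\tTertiumNonDatur\n")
	for _, line := range UnionLines(cnf) {
		b.WriteString("\t" + line + "\n")
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// (!x || z) && (y || z) && (x || !y)
	cnf := [][]Lit{
		{{Var: "X", Neg: true}, {Var: "Z"}},
		{{Var: "Y"}, {Var: "Z"}},
		{{Var: "X"}, {Var: "Y", Neg: true}},
	}
	fmt.Print(Render([]string{"X", "Y", "Z"}, cnf))
}
```

The output grows linearly with the formula, so the translation itself is cheap. Keep in mind that the program only produces text: the Go compiler as released rejects unions of interfaces that specify methods.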

Even worse, we want to define which operations are allowed on a type parameter by saying that it is allowed if every type in a type set supports it. However, that check is also NP-complete.

The easy way to prove that is to observe that if a type set is empty, every operator should be allowed on a type parameter constrained by it. Because any statement about “every element of the empty set” is true⁵.

But this would mean that type-checking a generic function would be NP-complete. If an operator is used, we have to at least check if the type set of its constraint is empty. Which is NP-complete.

Why do we care?

A fair question is “why do we even care? Surely these cases are super exotic. In any real program, checking this is trivial”.

That’s true, but there are still reasons to care:

- Go has the goal of having a fast compiler. And importantly, one which is guaranteed to be fast for any program. If I give you a Go program, you can be reasonably sure that it compiles quickly, in a time frame predictable by the size of the input.

  If I can craft a program which compiles slowly - and may take longer than the lifetime of the universe - this is no longer true.

  This is especially important for environments like the Go playground, which regularly compiles untrusted code.
- NP-complete problems are notoriously hard to debug if they fail.

  If you use Linux, you might have occasionally run into a problem where you accidentally tried installing conflicting versions of some package. And if so, you might have noticed that your computer first chugged along for a while and then gave you an unhelpful error message about the conflict. And maybe you had trouble figuring out which packages declared the conflicting dependencies.

  This is typical for NP-complete problems. As an exact solution is often too hard to compute, solvers rely on heuristics and randomization and it’s hard to work backwards from a failure.
- We generally don’t want the correctness of a Go program to depend on the compiler used. That is, a program should not suddenly stop compiling because you used a different compiler or the compiler was updated to a new Go version.

  But as far as we know, NP-complete problems don’t allow us to calculate an exact solution efficiently. A practical implementation always needs some heuristic (even if it is just “give up after a bit”). If we don’t want the correctness of a program to be implementation defined, that heuristic must become part of the Go language specification. But these heuristics are very complex to describe. So we would have to spend a lot of room in the spec for something which does not give us a very large benefit.

Note that Go also decided to restrict the version constraints a go.mod file can express, for exactly the same reasons. Go has a clear priority, not to require too complicated algorithms in its compilers and tooling. Not because they are hard to implement, but because the behavior of complicated algorithms also tends to be hard to understand for humans.

So requiring an implementation to solve an NP-complete problem is out of the question.

The fix

Given that there must not be an NP-complete problem in the language specification and given that Go 1.18 was released, this problem must have somehow been solved.

What changed is that the language for describing interfaces was limited from what I described above. Specifically

Implementation restriction: A union (with more than one term) cannot contain the predeclared identifier comparable or interfaces that specify methods, or embed comparable or interfaces that specify methods.

This disallows the main mechanism we used to map formulas to interfaces above. We can no longer express our TertiumNonDatur type, or the individual | terms of the formula, as the respective terms specify methods. Without specifying methods, we can’t get our “implicit negation” to work either.

The hope is that this change (among a couple of others) is sufficient to ensure that we can always calculate type sets accurately. Which means I pulled a bit of a bait-and-switch: I said that calculating type sets is hard. But as they were actually released, they might not be.

The reason I wrote this blog post anyways is to explain the kind of problems that exist in this area. It is easy to say we have solved this problem once and for all.

But to be certain, someone should prove this - either by writing a proof that the problem is still hard or by writing an algorithm which solves it efficiently.

There are also still discussions about changing the generics design. As one example, the limitations we introduced to fix all of this made one of the use cases from the design doc impossible to express. We might want to tweak the design to allow this use case. We have to look out in these discussions, so we don’t re-introduce NP-completeness. It took us some time to even detect it when the union operator was proposed.

And there are other kinds of “implicit negations” in the Go language. For example, a struct can not have both a field and a method with the same name. Or being one type implies not being another type (so interface{int} implicitly negates interface{string}).

All of which is to say that even if the problem might no longer be NP-complete - I hope that I convinced you it is still more complicated than you might have thought.

If you want to discuss this further, you can find links to my social media on the bottom of this site.

I want to thank my beta-readers for helping me improve this article. Namely arnehormann, @johanbrandhorst, @mvdan_, @_myitcv, @readcodesing, @rogpeppe and @zekjur.

They took a frankly unreasonable chunk of time out of their day. And their suggestions were invaluable.

¹ It should be pointed out, though, that “polynomial” can still be extremely inefficient. $n^{1000}$ still grows extremely fast, but is polynomial. And for many practical problems, even $n^3$ is intolerably slow. But for complicated reasons, there is a qualitatively important difference between “polynomial” and “exponential”⁶ run time. So you just have to trust me that the distinction makes sense.

² These names might seem strange, by the way. P is easy to explain: It stands for “polynomial”.

NP doesn’t mean “not polynomial” though. It means “non-deterministic polynomial”. A non-deterministic computer, in this context, is a hypothetical machine which can run arbitrarily many computations simultaneously. A problem which can be verified efficiently by a normal computer can be solved efficiently by a non-deterministic one. It just tries out all possible solutions at the same time and returns a correct one.

Thus, being able to verify a problem on a normal computer means being able to solve it on a non-deterministic one. That is why the two definitions of NP “verifiable by a classical computer” and “solvable by a non-deterministic computer” mean the same thing.

³ You might complain that it is hard to remember if the “disjunctive normal form” is a disjunction of conjunctions, or a conjunction of disjunctions - and that no one can remember which of these means && and which means || anyways.

You would be correct.

⁴ You might wonder why we can’t just solve CNFSAT by transforming the formula into DNF and solving that.

The answer is that the transformation can make the formula exponentially larger. So even though solving the problem on DNF is linear in the size of the DNF formula, that size is exponential in the size of the CNF formula. So we still use exponential time in the size of the CNF formula.

⁵ This is called the principle of explosion or “ex falso quodlibet” (“from falsehood follows anything”).

Many people - including many first year math students - have anxieties and confusion around this principle and feel that it makes no sense. So I have little hope that I can make it palatable to you. But it is extremely important for mathematics to “work” and it really is the most reasonable way to set things up.

Sorry.

⁶ Yes, I know that there are complexity classes between polynomial and exponential. Allow me the simplification.

Last Thursday and Friday (17/18 February) I taught an introductory course on the Julia programming language. The course took place in a virtual format and, to my great surprise, around 90 people from all over the world ended up joining. Luckily I had a small support team consisting of Gaspard Kemlin and Lambert Theissen (thanks!) who took care of some of the organisational aspects of running the Zoom session. Overall it was a lot of fun to spread the word about the Julia programming language to so many curious listeners, who asked interested and supportive questions.

Thanks to everyone who tuned in and thanks to everyone who gave constructive feedback at the end. I'm very much encouraged by the fact that all of you, unanimously, would recommend the workshop to your peers. In that sense: Please go spread the word as I'm already looking forward to the next occasion I'll have to teach about Julia!

Planet NoName e.V.
