# Planet NoName e.V.

## 2022-08-05

### RaumZeitLabor

#### Open Hackerspace Day (Tag des offenen Hackerspaces): All Creatures Welcome!

On Saturday, 27 August 2022, the Chaos Computer Club invites you to the first Open Hackerspace Day. Together with more than 50 hackerspaces in Germany, Switzerland, Luxembourg and other countries, we will offer a look into our rooms and our work on that day.

From 4 pm, the RZL is officially open to interested first-time and returning visitors. Let us show you our wood workshop, our 3D printers, the laser cutters, embroidery and other machines, or get hands-on right away. You can decorate tote bags and shirts with our cutting plotter, or solder small kits in our electronics corner. Food and drink will of course be provided as well: cold Mate, hot waffles and various other savoury and sweet things are available in our kitchen.

The Open Hackerspace Day is meant to show hackerspaces for what they are: open places for the creative use of technology, and spaces where hackers, makers and tinkerers meet to exchange ideas, work on projects together and learn new things.

We look forward to your visit. Please note our hygiene concept if you plan to drop by!

## 2022-07-16

### michael-herbst.com

#### CECAM flagship workshop: Error control in first-principles modelling

(Cross-post from our report published in the Psi-k blog)

From 20th until 24th June 2022 I co-organised a workshop on the theme of Error control in first-principles modelling at the CECAM Headquarters in Lausanne (workshop website). For one week the workshop brought together like-minded researchers from a range of communities, including quantum chemistry, materials science, scientific computing and mathematics, to jointly discuss the determination of errors in atomistic modelling. The main goal was to obtain a cross-community overview of ongoing work and to establish new links between the disciplines.

Amongst others we discussed topics such as: the determination of errors in observables, which are the result of long molecular dynamics simulations, the reliability and efficiency of numerical procedures and how to go beyond benchmarking or convergence studies via a rigorous mathematical understanding of errors. We further explored interactions with the field of uncertainty quantification to link numerical and modelling errors in electronic structure calculations or to understand error propagation in interatomic potentials via statistical inference.

## Participants

A primary objective of the conference was to facilitate networking and exchange across communities. Thanks to the funds provided by CECAM and Psi-k we managed to get a crowd of 30 researchers, including about 15 junior researchers, to come to Lausanne in person. Moreover, we made an effort to make virtual participation as smooth as possible. For example, we provided a conference-specific Slack space, which grew into a platform for discussion involving both in-person and virtual participants during the conference. In this way, about 70 researchers from 18 countries in total could participate in the workshop. The full list of participants is available on the workshop website.

## Workshop programme

The workshop programme was split between afternoon sessions, in which we had introductory and topic-specific lectures, and morning sessions, which focussed on informal discussion and community brainstorming.

### Afternoon lectures

##### Monday June 20th 2022
• Uncertainty quantification for atomic-scale machine learning. (Michele Ceriotti, EPFL)
[slides] [recording]
• Testing the hell out of DFT codes with virtual oxides. (Stefaan Cottenier, Ghent University)
[slides] [recording]
• Prediction uncertainty validation for computational chemists. (Pascal Pernot, Université Paris-Saclay)
[slides] [recording]
• Uncertainty driven active learning of interatomic potentials for molecular dynamics (Boris Kozinsky, Harvard University)
[recording]
• Interatomic Potentials from First Principles (Christoph Ortner, University of British Columbia)
[slides] [recording]
##### Tuesday June 21st 2022
• Numerical integration in the Brillouin zone (Antoine Levitt, Inria Paris)
[slides] [recording]
• Sensitivity analysis for assessing and controlling errors in theoretical spectroscopy and computational biochemistry (Christoph Jacob, TU Braunschweig)
[slides]
• Uncertainty quantification and propagation in multiscale materials modelling (James Kermode, University of Warwick)
[slides] [recording]
• Uncertainty Quantification and Active Learning in Atomistic Computations (Habib Najm, Sandia National Labs)
• Nuances in Bayesian estimation and active learning for data-driven interatomic potentials for propagation of uncertainty through molecular dynamics (Dallas Foster, MIT)
[slides] [recording]
##### Wednesday June 22nd 2022
• The BEEF class of xc functionals (Thomas Bligaard, DTU)
[recording]
• A Bayesian Approach to Uncertainty Quantification for Density Functional Theory (Kate Fisher, MIT)
[slides] [recording]
• Dielectric response with short-ranged electrostatics (Stephen Cox, Cambridge)
[slides]
• Fully guaranteed and computable error bounds for clusters of eigenvalues (Genevieve Dusson, CNRS)
[slides] [recording]
• Practical error bounds for properties in plane-wave electronic structure calculations (Gaspard Kemlin, Ecole des Ponts)
[slides] [recording]
• The transferability limits of static benchmarks (Thomas Weymuth, ETH)
[slides] [recording]
##### Thursday June 23rd 2022
• An information-theoretic approach to uncertainty quantification in atomistic modelling of crystalline materials (Maciej Buze, Birmingham)
[slides] [recording]
• Hyperactive Learning (Cas van der Oord, Cambridge)
[slides] [recording]
• Benchmarking under uncertainty (Jonny Proppe, TU Braunschweig)
• Model Error Estimation and Uncertainty Quantification of Machine Learning Interatomic Potentials (Khachik Sargsyan, Sandia National Labs)
[slides] [recording]
• Committee neural network potentials control generalization errors and enable active learning (Christoph Schran, Cambridge)
[slides] [recording]

### Morning discussion sessions

The discussion sessions were centred around broad multi-disciplinary topics to stimulate cross-fertilisation. Key topics were active learning techniques for obtaining interatomic potentials on the fly as well as opportunities to connect numerical and statistical approaches for error estimation.

A central topic of the session on Thursday morning was the development of a common cross-community language and guidelines for error estimation. This included the question of how to establish a minimal standard for error control and how to make the broader community aware of such techniques, so that published results can be validated and are more reproducible. Initial ideas from this discussion are summarised in a public GitHub repository. With this repository we invite everyone to contribute concrete examples of the error control strategies taken in their research context. In the future we hope to develop community guidelines for error control in first-principles modelling based on these initial ideas.

## Feedback from participants

Overall we received mostly positive feedback about the event. Virtual participants enjoyed the opportunity to interact with in-person participants via the Zoom sessions and Slack. For several in-person participants this meeting was the first physical meeting since the pandemic, and the ample opportunities for informal exchange we allocated in the programme (discussion sessions, poster sessions, social dinner, boat trip excursion) were much appreciated.

A challenge was to keep the meeting accessible both for researchers from other fields and for junior participants entering this interdisciplinary field. With respect to the discussion sessions we got several suggestions for improvement in this regard. For example, it was suggested to (i) set and communicate the discussion subject well in advance to allow people to prepare, (ii) motivate postdocs to coordinate the discussion, who would be responsible for curating material and formulating stimulating research questions, and (iii) get these postdocs to start the session with an introductory presentation on open problems.

## Conclusions and outlook

During the event it became apparent that the meaning associated with the term “error control” differs between communities, in particular between mathematicians and application scientists. Not only did this result in a considerable language barrier and some communication problems during the workshop, but it also made communities appear to move at different paces. At first glance this sometimes made it difficult to see the applicability of research results from another community. But the heterogeneity of participants also offered opportunities to learn from each other's viewpoint: for example, during the discussion sessions we actively worked towards obtaining a joint language and cross-community standards for error control. Our initial ideas on this point are available in a public GitHub repository, where we invite everyone to participate by opening issues and pull requests to continue the discussion.

## 2022-07-12

### RaumZeitLabor

#### RaumZeitLabastel-Nachmittabend (crafting afternoon-evening)

Dear RaumZeitLab crafters!

On Saturday, 30 July, from 3 pm, we want to leave the digital tools as far aside as possible and create things with our bare hands.

Whether you have been Art Attack devotees or World of Woolcrafters since the dawn of time, want to try out classic potato printing and various painting supplies (“Malware”) on mixed paper, or want to decorate each other's notebooks while Swing Scrapbooking: let your imagination run free on this afternoon-evening! So be sure to bring your own materials and never-finished offline projects from home. For the uninspired and the creatively blocked, there will be an “inspiration jar” filled with ideas.

Please fold me an electronic origami sign-up greeting by 23 July if you would like to join. The club treasury appreciates a contribution towards material costs of at least 5 euros per crafter.

Artistic greetings,
your flederARTie

P.S.: This is a public event.
Please note the hygiene concept!

## 2022-07-09

### Insanity Industries

#### Hooking a terminal up to "Browse Files"

A number of applications under Linux provide a “Browse Files” button that is intended to pull up a file manager in a specific directory. While this is convenient for most users, some might want a little more flexibility, so let’s hook up a terminal emulator to that button instead of a file manager.

First, we need a command that starts a terminal emulator in a specific directory. In my case this will be

foot -D <path to directory>


which will start foot in the specified <path to directory>.

As this button is implemented leveraging the XDG MIME Applications specification, we now need to define a new desktop entry, let’s call it TermFM.desktop, which we place under either ~/.local/share/applications or /usr/local/share/applications, depending on preference. The file should read

[Desktop Entry]
Type=Application
Name=TermFM
Exec=foot -D %U
MimeType=inode/directory;


where %U will be the placeholder for the path that is handed over by the calling application. The MimeType line is optional, but given that the above terminal command only works for directories anyways, it doesn’t hurt to constrain this desktop file to this file type only.

Afterwards, we need to configure this as the default application for the file type inode/directory, which we do by adding

inode/directory=TermFM.desktop


to the [Default Applications] section in ~/.config/mimeapps.list. Should this file not yet exist, you can create it to contain

[Default Applications]
inode/directory=TermFM.desktop


Once that is done, you should get your terminal in the corresponding location whenever you click “Browse Files” in an application supporting this.
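If you would rather script the mimeapps.list step than edit the file by hand, the following is a minimal Python sketch of it. It assumes the file can be treated as a plain INI file; the function name `set_default_application` is made up for illustration, and comments in an existing file are dropped when it is rewritten.

```python
import configparser
import io


def set_default_application(mimeapps_text, mime_type, desktop_file):
    """Return mimeapps.list content in which mime_type maps to desktop_file.

    Sketch only: the file is parsed as generic INI, so comments are lost.
    """
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep keys such as "inode/directory" case-sensitive
    parser.read_string(mimeapps_text)

    section = "Default Applications"
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, mime_type, desktop_file)

    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

To apply it, read ~/.config/mimeapps.list (or start from an empty string if the file does not exist), pass the text through the function together with inode/directory and TermFM.desktop, and write the result back.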

## 2022-07-02

### sECuREs website

#### rsync, article 3: How does rsync work?

This post is the third article in a series of blog posts about rsync, see the Series Overview.

With rsync up and running, it’s time to take a peek under the hood of rsync to better understand how it works.

## How does rsync work?

When talking about the rsync protocol, we need to distinguish between:

• protocol-level roles: “sender” and “receiver”
• TCP roles: “client” and “server”

All roles can be mixed and matched: both rsync clients (or servers!) can either send or receive.

Now that you know the terminology, let’s take a high-level look at the rsync protocol. We’ll look at protocol version 27, which is older but simpler, and which is the most widely supported protocol version, implemented by openrsync and other third-party implementations:

The rsync protocol can be divided into two phases:

1. In the first phase, the sender walks the local file tree to generate and send the file list to the receiver. The file list must be transferred in full, because both sides sort it by filename (later rsync protocol versions eliminate this synchronous sorting step).

2. In the second phase, concurrently:

• The receiver compares and requests each file in the file list. The receiver requests the full file when it didn’t exist on disk yet, or it will send checksums for the rsync hash search algorithm when the file already existed.
• The receiver receives file data from the sender. The sender answers the requests with just enough data to reconstruct the current file contents based on what’s already on the receiver.

The architecture makes it easy to implement the second phase in 3 separate processes, each of which sends to the network as fast as possible using heavy pipelining. This results in utilizing the available hardware resources (I/O, CPU, network) on sender and receiver to the fullest.

### Observing rsync’s transfer phases

When starting an rsync transfer, looking at the resource usage of both machines allows us to confirm our understanding of the rsync architecture, and to pin-point any bottlenecks:

1. phase: The rsync sender needs 17 seconds to walk the file system and send the file list. The rsync receiver reads from the network and writes into RAM during that time.
• This phase is random I/O (querying file system metadata) for the sender.
2. phase: Afterwards, the rsync sender reads from disk and sends to the network. The rsync receiver receives from the network and writes to disk.
• The receiver does roughly the same amount of random I/O as the sender did in phase 1, as it needs to create directories and request missing files.
• The sender does sequential disk reads and possibly checksum calculation, if the file(s) existed on the receiver side.

(Again, the above was captured using rsync protocol version 27; later rsync protocol versions don’t synchronize after completing phase 1, but instead interleave the phases more.)

Up until now, we have described the rsync protocol at a high level. Let’s zoom into the hash search step, which is what many people might associate with the term “rsync algorithm”.

When a file exists on both sides, rsync sender and receiver, the receiver first divides the file into blocks. The block size is a rounded square root of the file’s length. The receiver then sends the checksums of all blocks to the sender. In response, the sender finds matching blocks in the file and sends only the data needed to reconstruct the file on the receiver side.
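To get a feeling for the block sizes this produces, here is a small Python sketch. The text above only says that the block size is a rounded square root of the file length; the lower bound of 700 bytes and the rounding down to a multiple of 8 are assumptions for illustration and may differ from what a given rsync implementation does.

```python
import math

MIN_BLOCK_SIZE = 700  # assumed lower bound, not taken from the article


def block_size(file_length):
    """Rounded square root of the file length, with an assumed minimum."""
    root = math.isqrt(file_length)
    rounded = root - root % 8  # assumption: round down to a multiple of 8
    return max(rounded, MIN_BLOCK_SIZE)


def block_count(file_length):
    """Number of blocks (the last one may be shorter) for a given file length."""
    size = block_size(file_length)
    return -(-file_length // size)  # ceiling division
```

With these assumptions, a 100 MB file (10**8 bytes) is split into 10000 blocks of 10000 bytes each, so the number of checksums to send grows only with the square root of the file size.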

Specifically, the sender goes through each byte of the file and tries to match existing receiver content. To make this less computationally expensive, rsync combines two checksums.

rsync first calculates what it calls the “sum1”, or “fast signature”. This is a small checksum (two uint16) that can be calculated with minimal effort for a rolling window over the file data. tridge rsync comes with SIMD implementations to further speed this up where possible.

Only if the sum1 matches will “sum2” (or “strong signature”) be calculated, a 16-byte MD4 hash. Newer protocol versions allow negotiating the hash algorithm and support the much faster xxhash algorithms.

If sum2 matches, the block is considered equal on both sides.

Hence, the best case for rsync is when a file has either not changed at all, or shares as many full blocks of content as possible with the old contents.
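The two-level checksum search described above can be sketched in Python as follows. This is an illustration of the idea rather than rsync’s actual code: the weak checksum follows the classic two-16-bit-sum rolling scheme, MD5 from the standard library stands in for MD4, trailing partial blocks are ignored, and all function names are made up for this example.

```python
import hashlib

MOD = 1 << 16  # both halves of the weak checksum are 16-bit values


def weak_checksum(block):
    """Return the (a, b) pair for a block: plain sum and position-weighted sum."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MOD
    return a, b


def roll(a, b, old_byte, new_byte, block_len):
    """Slide the window one byte to the right in constant time."""
    a = (a - old_byte + new_byte) % MOD
    b = (b - block_len * old_byte + a) % MOD
    return a, b


def strong_checksum(block):
    """Stand-in for the strong signature (MD5 here, not rsync's MD4)."""
    return hashlib.md5(block).digest()


def find_matches(receiver_data, sender_data, block_len):
    """Return (sender_offset, receiver_block_index) pairs for matching blocks."""
    # Receiver side: checksums for every full block of the old file.
    table = {}
    for start in range(0, len(receiver_data) - block_len + 1, block_len):
        block = receiver_data[start:start + block_len]
        entry = (strong_checksum(block), start // block_len)
        table.setdefault(weak_checksum(block), []).append(entry)

    matches = []
    if len(sender_data) < block_len:
        return matches

    # Sender side: move a window over the new file, byte by byte.
    offset = 0
    a, b = weak_checksum(sender_data[:block_len])
    while True:
        matched = False
        candidates = table.get((a, b))
        if candidates:
            # Only now pay for the expensive strong checksum.
            strong = strong_checksum(sender_data[offset:offset + block_len])
            for candidate_strong, block_index in candidates:
                if candidate_strong == strong:
                    matches.append((offset, block_index))
                    matched = True
                    break
        if matched:
            offset += block_len  # skip past the matched block
            if offset + block_len > len(sender_data):
                break
            a, b = weak_checksum(sender_data[offset:offset + block_len])
        else:
            if offset + block_len >= len(sender_data):
                break
            a, b = roll(a, b, sender_data[offset],
                        sender_data[offset + block_len], block_len)
            offset += 1
    return matches
```

Everything in the sender’s file that is not covered by a match would have to be sent as literal data; the matched ranges can be replaced by short references to blocks the receiver already has.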

## Changing data sets

Now that we know how rsync works on the file level, let’s take a step back to the data set level.

The easiest situation is when you transfer a data set that is not currently changing. But what happens when the data set changes while your rsync transfer is running? Here are two examples.

debiman, the manpage generator powering manpages.debian.org, is running on a Debian VM on which an rsync job periodically transfers the static manpage archive to different static web servers across the world. The rsync job and debiman are not sequenced in any way. Instead, debiman is careful to only ever atomically swap out files in its output directory, or add new files before it swaps out an updated index.

The second example, the PostgreSQL database management system, is the opposite situation: instead of having full control over how files are laid out, here I don’t have control over how files are written (this generalizes to any situation where the model of only ever replacing files is not feasible). The data files which my Postgres installation keeps on disk are not great to synchronize using rsync: they are large and frequently change. Instead, I now exempt them from my rsync transfer and use pg_dump(1) to create a snapshot of my databases instead.

To confirm rsync’s behavior regarding changing data sets in detail, I modified rsync to ask for confirmation between generating the file list and transferring the files. Here’s what I found:

• If files are added after rsync has transferred the file list, the new files will just not be part of the transfer.
• If a file vanishes between generating the file list and transferring the file, rsync exits with status code 24, which its manpage documents as “Partial transfer due to vanished source files”. My rsyncprom monitoring wrapper offers a flag to treat exit code 24 like exit code 0, because depending on the data set, vanishing files are expected.
• If a file’s contents change (no matter whether the file grows, shrinks, or is modified in-place) between generating the file list and the actual file transfer, that’s not a problem — rsync will transfer the file contents as it reads them once the transfer starts. Note that this might be an inconsistent view of the data, depending on the application.
• Ideally, don’t ever modify files within a data set that is rsynced. Instead, atomically move complete files into the data set.

Another way of phrasing the above is that data consistency is not something that rsync can in any way guarantee. It’s up to you to either live with the inconsistency (often a good-enough strategy!), or to add an extra step that ensures the data set you feed to rsync is consistent.
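For the “atomically move complete files into the data set” advice, a common pattern is to write to a temporary file and rename it into place. The following Python sketch shows that pattern under a few assumptions: the helper name `atomic_write` and its `staging_dir` parameter are made up, and the rename is only atomic when the staging directory is on the same file system as the target.

```python
import os
import tempfile


def atomic_write(path, data, staging_dir=None):
    """Write data to path so that readers see either the old or the new file.

    staging_dir (hypothetical parameter) is where the temporary file lives;
    it defaults to the target directory and must be on the same file system.
    """
    if staging_dir is None:
        staging_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the contents are on disk first
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the temporary file is created inside the synced directory, a concurrently running rsync might still pick it up, so staging outside the data set (on the same file system) or excluding the temporary names from the transfer is worth considering.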

## Next up

The fourth article in this series is rsync, article 4: My own rsync implementation (To be published.)

## Appendix A: rsync confirmation hack

For verifying rsync’s behavior with regards to changing data sets, I checked out the following version:

% git clone https://github.com/WayneD/rsync/ rsync-changing-data-sets
% cd rsync-changing-data-sets
% git checkout v3.2.4
% ./configure
% make


Then, I modified flist.c to add a confirmation step between sending the file list and doing the actual file transfers:

diff --git i/flist.c w/flist.c
index 1ba306bc..98981f34 100644
--- i/flist.c
+++ w/flist.c
@@ -20,6 +20,8 @@
* with this program; if not, visit the http://fsf.org website.
*/

+#include <stdio.h>
+
#include "rsync.h"
#include "ifuncs.h"
#include "rounding.h"
@@ -2516,6 +2518,17 @@ struct file_list *send_file_list(int f, int argc, char *argv[])
if (DEBUG_GTE(FLIST, 2))
rprintf(FINFO, "send_file_list done\n");

+	char *line = NULL;
+	size_t llen = 0;
+	ssize_t nread;
+	printf("file list sent. enter 'yes' to continue: ");
+	while ((nread = getline(&line, &llen, stdin)) != -1) {
+	  if (nread == strlen("yes\n") && strcasecmp(line, "yes\n") == 0) {
+	    break;
+	  }
+	  printf("enter 'yes' to continue: ");
+	}
+
if (inc_recurse) {
send_dir_depth = 1;


My rsync invocation is:

./rsync -av --debug=all4 --protocol=27 ~/i3/src /tmp/DEST/


It’s necessary to use an older protocol version to make rsync generate a full file list before starting the transfer. Later protocol versions interleave these parts of the protocol.

## 2022-07-02

### sECuREs website

#### rsync, article 2: Surroundings

This post is the second article in a series of blog posts about rsync, see the Series Overview.

Now that we know what to use rsync for, how can we best integrate rsync into monitoring and alerting, and on which operating systems does it work?

## Monitoring and alerting for rsync jobs using Prometheus

Once you have one or two important rsync jobs, it might make sense to alert when your job has not completed as expected.

I’m using Prometheus for all my monitoring and alerting.

Because Prometheus pulls metrics from its (typically always-running) targets, we need an extra component: the Prometheus Pushgateway. The Pushgateway stores metrics pushed by short-lived jobs like rsync transfers and makes them available to subsequent Prometheus pulls.

To integrate rsync with the Prometheus Pushgateway, I wrote rsyncprom, a small tool that wraps rsync, or parses rsync output supplied by you. Once rsync completes, rsyncprom pushes the rsync exit code and parsed statistics about the transfer to your Pushgateway.
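To make the push step more concrete, here is a minimal Python sketch of sending an exit code to a Pushgateway. It is not how rsyncprom is implemented; it only relies on the Pushgateway’s HTTP interface (text exposition format sent to a /metrics/job/…/instance/… path). The helper names are made up, and label values containing slashes are not handled.

```python
import urllib.parse
import urllib.request


def pushgateway_url(base_url, job, instance):
    """Build the grouping-key URL for one job/instance pair."""
    return "{}/metrics/job/{}/instance/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(job, safe=""),
        urllib.parse.quote(instance, safe=""),
    )


def exposition_body(exit_code):
    """Render the rsync_exit_code metric in the Prometheus text format."""
    return "# TYPE rsync_exit_code gauge\nrsync_exit_code {}\n".format(exit_code)


def push_exit_code(base_url, job, instance, exit_code, timeout=10):
    """PUT the metric to the Pushgateway, replacing the group's old metrics."""
    request = urllib.request.Request(
        pushgateway_url(base_url, job, instance),
        data=exposition_body(exit_code).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A wrapper script would run rsync, take its return code, and call something like `push_exit_code("http://pushgateway:9091", "rsync", "gphotos-sync@midna", returncode)` afterwards.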

### Prometheus server-side setup

First, I set up the Prometheus Pushgateway (via Docker and systemd) on my server.

Then, in my prometheus.conf file, I instruct Prometheus to pull data from my Pushgateway:

# prometheus.conf

rule_files:
- backups.rules.yml

scrape_configs:
# […]
- job_name: pushgateway
  honor_labels: true
  static_configs:
  - targets: ['pushgateway:9091']


Finally, in backups.rules.yml, I configure an alert on the time series rsync_exit_code:

# backups.rules.yml

groups:
- name: backups.rules
  rules:
  # The alert name is missing from this copy; RsyncFailing is a placeholder.
  - alert: RsyncFailing
    expr: rsync_exit_code{job="rsync"} > 0
    for: 1m
    labels:
      job: rsync
    annotations:
      description: rsync {{ $labels.instance }} is failing
      summary: rsync {{ $labels.instance }} is failing


This alert will fire any time an rsync job monitored via rsyncprom exits with a non-zero exit code.

### rsync client-side setup

On each machine that runs rsync jobs I want to monitor, I first install rsyncprom:

go install github.com/stapelberg/rsyncprom/cmd/rsync-prom@latest


Then, I just wrap rsync transfers where it’s most convenient, for example in my crontab(5):

# crontab -e
9 9 * * * /home/michael/go/bin/rsync-prom --job="cron" --instance="gphotos-sync@midna" -- /home/michael/gphotos-sync/sync.sh


The same wrapper technique works in shell scripts or systemd service files.

You can also provide rsync output from Go code (this example runs rsync via SSH).

### Monitoring architecture

Here’s what the whole setup looks like architecturally:

The rsync scheduler runs on a Raspberry Pi running gokrazy. The scheduler invokes the rsync job to back up websrv.zekjur.net via SSH and sends the output to Prometheus, which is running on a (different) server at an ISP.

### Monitoring dashboard

The Grafana dashboard looks like this in action:

• The top left table shows the most recent rsync exit code, green means 0 (success).
• The top right graph shows rsync runtime (wall-clock time) over time. Long runtime can have any number of bottlenecks as the reason: network connections, storage devices, slow CPUs.
• The bottom left graph shows rsync dataset size over time. This allows you to quickly pinpoint transfers that are filling your disk up.
• The bottom right graph shows transferred bytes per rsync over time. The higher the value, the higher the amount of change in your data set between synchronization runs.

## rsync operating system availability

Now that we have learnt about a couple of typical use-cases, where can you use rsync to implement these use-cases? The answer is: in most environments, as rsync is widely available on different Linux and BSD versions.

Macs come with rsync available by default (but it’s an old, patched version), and OpenBSD comes with a BSD-licensed implementation called openrsync by default.

On Windows, you can use the Windows Subsystem for Linux.

| Operating System | Implementation | Version |
|---|---|---|
| FreeBSD 13.1 (ports) | tridge | 3.2.3 |
| OpenBSD 7.1 | openrsync | (7.1) |
| OpenBSD 7.1 (ports) | tridge | 3.2.4 |
| NetBSD 9.2 (pkgsrc) | tridge | 3.2.4 |
| Linux | tridge | repology |
| macOS | tridge | 2.6.9 |

## Next Up

The third article in this series is rsync, article 3: How does rsync work?. With rsync up and running, it’s time to take a peek under the hood of rsync to better understand how it works.

## 2022-06-18

### sECuREs website

#### rsync, article 1: Scenarios

This post is the first article in a series of blog posts about rsync, see the Series Overview.

To motivate why it makes sense to look at rsync, I present three scenarios for which I have come to appreciate rsync: DokuWiki transfers, Software deployment and Backups.

## Scenario: DokuWiki transfers using rsync

Recently, I set up a couple of tools for a website that is built on DokuWiki, such as a dead link checker and a statistics program. To avoid overloading the live website (and possibly causing spurious requests that interfere with statistics), I decided it would be best to run a separate copy of the DokuWiki installation locally. This requires synchronizing:

1. The PHP source code files of DokuWiki itself (including plugins and configuration)
2. One text file per wiki page, and all uploaded media files

A DokuWiki installation is exactly the kind of file tree that scp(1) cannot efficiently transfer (too many small files), but rsync(1) can! The rsync transfer only takes a few seconds, no matter if it’s a full download (can be simpler for batch jobs) or an incremental synchronization (more efficient for regular synchronizations like backups).

## Scenario: Software deployment using rsync

For smaller projects where I don’t publish new versions through Docker, I instead use a shell script to transfer and run my software on the server.

rsync is a great fit here, as it transfers many small files (static assets and templates) efficiently, only transfers the binaries that actually changed, and doesn’t mind if the binary file it’s uploading is currently running (contrary to scp(1), for example).

To illustrate what such a script could look like, here’s my push script for Debian Code Search:

#!/bin/zsh
set -ex

# Asynchronously transfer assets while compiling:
(
ssh root@dcs 'for i in $(seq 0 5); do mkdir -p /srv/dcs/shard${i}/{src,idx}; done'
ssh root@dcs "adduser --disabled-password --gecos 'Debian Code Search' dcs || true"
rsync -r systemd/ root@dcs:/etc/systemd/system/ &
rsync -r cmd/dcs-web/templates/ root@dcs:/srv/dcs/templates/ &
rsync -r static/ root@dcs:/srv/dcs/static/ &
wait
) &

# Compile a new Debian Code Search version:
tmp=$(mktemp -d)
mkdir $tmp/bin
GOBIN=$tmp/bin \
GOAMD64=v3 \
  go install \
  -ldflags '-X github.com/Debian/dcs/cmd/dcs-web/common.Version=$version' \
  github.com/Debian/dcs/cmd/...

# Transfer the Debian Code Search binaries:
rsync \
  $tmp/bin/dcs-{web,source-backend,package-importer,compute-ranking,feeder} \
  $tmp/bin/dcs \
  root@dcs:/srv/dcs/bin/

# Wait for the asynchronous asset transfer to complete:
wait

# Restart Debian Code Search on the server:
UNITS=(dcs-package-importer.service dcs-source-backend.service dcs-compute-ranking.timer dcs-web.service)
ssh root@dcs systemctl daemon-reload \&\& \
  systemctl enable ${UNITS} \; \
  systemctl reset-failed ${UNITS} \; \
  systemctl restart ${UNITS} \; \
  systemctl reload nginx

rm -rf "${tmp?}"


## Scenario: Backups using rsync

The first backup system I used was bacula, which Wikipedia describes as an enterprise-level backup system. That certainly matches my impression, both in positive and negative ways: while bacula is very powerful, some seemingly common operations turn out quite complicated in bacula. Restoring a single file or directory tree from a backup was always more effort than I thought reasonable. For some reason, I often had to restore backup catalogs before I was able to access the backup contents (I don’t remember the exact details).

When moving apartment last time, I used the opportunity to change my backup strategy. Instead of using complicated custom software with its own volume file format (like bacula), I wanted backed-up files to be usable on the file system level with standard tools like rm, ls, cp, etc.

Working with files in a regular file system makes day-to-day usage easier, and also ensures that when my network storage hardware dies, I can just plug the hard disk into any PC, boot a Linux live system, and recover my data.

To back up machines onto my network storage PC’s file system, I ended up with a hand-written rsync wrapper script that copies the full file system of each machine into dated directory trees:

storage2# ls -l backup/midna/2022-05-27
bin   boot  etc  home  lib  lib64  media  opt
proc  root  run  sbin  sys  tmp    usr    var

storage2# ls -l backup/midna/2022-05-27/home/michael/configfiles/zshrc
-rw-r--r--. 7 1000 1000 14554 May  9 19:37 backup/midna/2022-05-27/home/michael/configfiles/zshrc


To revert my ~/.zshrc to an older version, I can scp(1) the file:

midna% scp storage2:/srv/backup/midna/2022-05-27/home/michael/configfiles/zshrc ~/configfiles/zshrc


To compare a whole older source tree, I can mount it using sshfs(1):

midna% mkdir /tmp/2022-05-27-i3
midna% sshfs storage2:/srv/backup/midna/2022-05-27/$HOME/i3 /tmp/2022-05-27-i3
midna% diff -ur /tmp/2022-05-27-i3 ~/i3/

### Incremental backups

Of course, the idea is not to transfer the full machine contents every day, as that would quickly fill up my network storage’s 16 TB disk! Instead, we can use rsync’s --link-dest option to elegantly deduplicate files using file system hard links:

backup/midna/2022-05-26
backup/midna/2022-05-27 # rsync --link-dest=2022-05-26

To check the de-duplication level, we can use du(1), first on a single directory:

storage2# du -hs 2022-05-27
113G 2022-05-27

…and then on two subsequent directories:

storage2# du -hs 2022-05-25 2022-05-27
112G 2022-05-25
7.3G 2022-05-27

As you can see, the 2022-05-27 backup took 7.3 GB of disk space, and 104.7 GB were re-used from the previous backup(s).

To print all files which have changed since the last backup, we can use:

storage2# find 2022-05-27 -type f -links 1 -print

### Limitation: file system compatibility

A significant limitation of backups at the file level is that the destination file system (network storage) needs to support all the file system features used on the machines you are backing up.

For example, if you use POSIX ACLs or Extended attributes (possibly for Capabilities or SELinux), you need to ensure that your backup file system has these features enabled, and that you are using rsync(1)’s --xattrs (or -X for short) option.

This can turn from a pitfall into a dealbreaker as soon as multiple operating systems are involved. For example, the rsync version on macOS has Apple-specific code to work with Apple resource forks and other extended attributes. It’s not clear to me whether macOS rsync can send files to Linux rsync, restore them, and end up with the same system state.

Luckily, I am only interested in backing up Linux systems, or merely home directories of non-Linux systems, where no extended attributes are used.
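The hard-link deduplication described under “Incremental backups” can be illustrated with a toy Python sketch. It is a simplification, not what rsync does internally: files are compared byte by byte instead of by size and modification time, only regular files are handled, and the function names `snapshot` and `changed_files` are made up for this example.

```python
import filecmp
import os
import shutil
import stat
import tempfile


def snapshot(source_dir, previous_backup, new_backup):
    """Copy source_dir to new_backup, hard-linking files that did not change."""
    for root, _dirs, files in os.walk(source_dir):
        rel_root = os.path.relpath(root, source_dir)
        os.makedirs(os.path.normpath(os.path.join(new_backup, rel_root)),
                    exist_ok=True)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            src = os.path.join(source_dir, rel)
            dst = os.path.join(new_backup, rel)
            old = os.path.join(previous_backup, rel) if previous_backup else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share data with the old backup
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


def changed_files(backup_dir):
    """List regular files with link count 1, like `find -type f -links 1`."""
    result = []
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            path = os.path.join(root, name)
            info = os.lstat(path)
            if stat.S_ISREG(info.st_mode) and info.st_nlink == 1:
                result.append(os.path.relpath(path, backup_dir))
    return sorted(result)


def demo():
    """Take two snapshots of a tiny tree and report what changed in between."""
    with tempfile.TemporaryDirectory() as base:
        src = os.path.join(base, "src")
        os.makedirs(os.path.join(src, "etc"))
        with open(os.path.join(src, "a.txt"), "w") as f:
            f.write("one")
        with open(os.path.join(src, "etc", "b.txt"), "w") as f:
            f.write("two")
        day1 = os.path.join(base, "2022-05-26")
        day2 = os.path.join(base, "2022-05-27")
        snapshot(src, None, day1)
        with open(os.path.join(src, "a.txt"), "w") as f:
            f.write("changed")
        snapshot(src, day1, day2)
        return changed_files(day2)
```

In the demo, only a.txt is modified between the two snapshots, so it is the only file in the second snapshot with a link count of 1; etc/b.txt shares its data with the first snapshot and takes up no additional space.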
### Downside: slow bulk operations (disk usage, deletion)

The biggest downside of this architecture is that working with the directory trees in bulk can be very slow, especially when using a hard disk instead of an SSD.

For example, deleting old backups can easily take many hours to multiple days (!). Sure, you can just let the rm command run in the background, but it’s annoying nevertheless.

Even merely calculating the disk space usage of each directory tree is a painfully slow operation. I tried using stateful disk usage tools like duc, but it didn’t work reliably on my backups.

In practice, I found that for tracking down large files, using ncdu(1) on any recent backup typically quickly shows the large file. In one case, I found var/lib/postgresql to consume many gigabytes. I excluded it in favor of using pg_dump(1), which resulted in much smaller backups!

Unfortunately, even when using an SSD, determining which files take up most space of a full backup takes a few minutes:

storage2# time du -hs backup/midna/2022-06-09
742G backup/midna/2022-06-09

real 8m0.202s
user 0m11.651s
sys 2m0.731s

### Backup transport (SSH) and scheduling

To transfer data via rsync from the backup host to my network storage, I’m using SSH.

Each machine’s SSH access is restricted in my network storage’s SSH authorized_keys(5) config file to not allow arbitrary commands, but to perform just a specific operation. The only allowed operation in my case is running rrsync (“restricted rsync”) in a container whose file system only contains the backup host’s sub directory, e.g. websrv.zekjur.net:

command="/bin/docker run --log-driver none -i -e SSH_ORIGINAL_COMMAND -v /srv/backup/websrv.zekjur.net:/srv/backup/websrv.zekjur.net stapelberg/docker-rsync /srv/backup/websrv.zekjur.net",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3…

To trigger such an SSH-protected rsync transfer remotely, I’m using a small custom scheduling program called dornröschen.
The program arranges for all involved machines to be powered on (using Wake-on-LAN) and then starts rsync via another operation-restricted SSH connection. You could easily replace this with a cron job if you don’t care about WOL.

The architecture looks like this:

[Figure: architecture diagram of the backup setup]

The operation-restricted SSH connection on each backup host is configured in SSH’s authorized_keys(5) config file:

    command="/root/backup-remote.pl",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3…

## Next up

The second article in this series is rsync, article 2: Surroundings. Now that we know what to use rsync for, how can we best integrate rsync into monitoring and alerting, and on which operating systems does it work?

## 2022-06-18

### sECuREs website

#### rsync: Series Overview

For many years, I was only a casual user of rsync and used it mostly for one-off file transfers.

Over time, I found rsync useful in more and more cases, and would recommend every computer user put this great tool into their toolbox 🛠 🧰 !

I’m publishing a series of blog posts about rsync:

- rsync, article 1: Scenarios. To motivate why it makes sense to look at rsync, I present three scenarios for which I have come to appreciate rsync: DokuWiki transfers, Software deployment and Backups.
- rsync, article 2: Surroundings. Now that we know what to use rsync for, how can we best integrate rsync into monitoring and alerting, and on which operating systems does it work?
- rsync, article 3: How does rsync work?. With rsync up and running, it’s time to take a peek under the hood of rsync to better understand how it works.
- rsync, article 4: My own rsync implementation (To be published.)

## 2022-05-29

### RaumZeitLabor

#### GnoPN 2022

Get ready for the great Olfactory Reset at this year’s GnoblauchProgrammierNacht at the RZL! On 25 June we will meet from 6 pm, turn the Gnoblauch (garlic) dial up to 11, and train our senses of smell and taste over Gnoblauch soup, Gnoblauch bread and (non-)alcoholic Gnunk.
Come by, bring garlicky dishes, and enjoy the evening with us. We look forward to seeing you!

P.S.: This is a public event. Please observe the hygiene concept!

## 2022-05-23

### Mero’s Blog

#### Operator constraints in Go

Let’s say you want to implement a sorting function in Go. Or perhaps a data structure like a binary search tree, providing ordered access to its elements. Because you want your code to be re-usable and type safe, you want to use type parameters. So you need a way to order user-provided types.

There are multiple methods of doing that, with different trade-offs. Let’s talk about four in particular here:

1. constraints.Ordered
2. A method constraint
3. Taking a comparison function
4. Comparator types

## constraints.Ordered

Go 1.18 has a mechanism to constrain a type parameter to all types which have the < operator defined on them. The types which have this operator are exactly all types whose underlying type is string or one of the predeclared integer and float types. So we can write a type set expressing that:

    type Integer interface {
        ~int | ~int8 | ~int16 | ~int32 | ~int64 | ~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr
    }

    type Float interface {
        ~float32 | ~float64
    }

    type Ordered interface {
        Integer | Float | ~string
    }

Because that’s a fairly common thing to want to do, there is already a package which contains these kinds of type sets (golang.org/x/exp/constraints). With this, you can write the signature of your sorting function or the definition of your search tree as:

    func Sort[T constraints.Ordered](s []T) {
        // …
    }

    type SearchTree[T constraints.Ordered] struct {
        // …
    }

The main advantage of this is that it works directly with predeclared types and simple types like time.Duration. It also is very clear.

The main disadvantage is that it does not allow composite types like structs. And what if a user wants a different sorting order than the one implied by <?
For example if they want to reverse the order or want specialized string collation. A multimedia library might want to sort “The Expanse” under E. And some letters sort differently depending on the language setting.

constraints.Ordered is simple, but it also is inflexible.

## Method constraints

We can use method constraints to allow more flexibility. This allows a user to implement whatever sorting order they want as a method on their type.

We can write that constraint like this:

    type Lesser[T any] interface {
        // Less reports whether the receiver is less than v.
        Less(v T) bool
    }

The type parameter is necessary because we have to refer to the receiver type itself in the Less method. This is hopefully clearer when we look at how this is used:

    func Sort[T Lesser[T]](s []T) {
        // …
    }

    type SearchTree[T Lesser[T]] struct {
        // …
    }

This allows the user of our library to customize the sorting order by defining a new type with a Less method:

    type ReverseInt int

    func (i ReverseInt) Less(j ReverseInt) bool {
        return j < i // order is reversed
    }

The disadvantage of this is that it requires some boilerplate on the part of your user. Using a custom sorting order always requires defining a type with a method.

They can’t use your code with predeclared types like int or string but always have to wrap them in a new type.

The same applies if a type already has a natural comparison method that is not called Less. For example, time.Time is naturally sorted by time.Time.Before. For cases like that there needs to be a wrapper to rename the method.

Whenever one of these wrappings happens, your user might have to convert back and forth when passing data to or from your code.

It also is a little bit more confusing than constraints.Ordered, as your user has to understand the purpose of the extra type parameter on Lesser.
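To illustrate the method constraint and the wrapping it forces, here is a minimal, self-contained sketch. The insertion sort is only a stand-in for whatever algorithm Sort would really use, and `ByTime` is a hypothetical wrapper type invented here to rename time.Time.Before to Less.

```go
package main

import (
	"fmt"
	"time"
)

// Lesser is the method constraint from the text.
type Lesser[T any] interface {
	Less(v T) bool
}

// Sort is a simple insertion sort; it only ever calls Less on its elements.
func Sort[T Lesser[T]](s []T) {
	for i := 1; i < len(s); i++ {
		for j := i; j > 0 && s[j].Less(s[j-1]); j-- {
			s[j], s[j-1] = s[j-1], s[j]
		}
	}
}

// ReverseInt orders ints from largest to smallest.
type ReverseInt int

func (i ReverseInt) Less(j ReverseInt) bool { return j < i }

// ByTime is a hypothetical wrapper that renames time.Time.Before to Less.
type ByTime time.Time

func (t ByTime) Less(u ByTime) bool { return time.Time(t).Before(time.Time(u)) }

func main() {
	ints := []ReverseInt{42, 23, 1337}
	Sort(ints)
	fmt.Println(ints) // [1337 42 23]

	base := time.Date(2022, 5, 23, 0, 0, 0, 0, time.UTC)
	times := []ByTime{ByTime(base.Add(time.Hour)), ByTime(base)}
	Sort(times)
	// The values have to be converted back before they can be used as
	// time.Time again, which is the back-and-forth the text mentions.
	fmt.Println(time.Time(times[0]).Equal(base)) // true
}
```

Note that the caller cannot pass a plain []int or []time.Time to Sort; both need a named wrapper type first.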
## Passing a comparison function

A simple way to get flexibility is to have the user pass us a function used for comparison directly:

    func Sort[T any](s []T, less func(T, T) bool) {
        // …
    }

    type SearchTree[T any] struct {
        Less func(T, T) bool
        // …
    }

    func NewSearchTree[T any](less func(T, T) bool) *SearchTree[T] {
        // …
        return &SearchTree[T]{
            Less: less,
            // …
        }
    }

This essentially abandons the idea of type constraints altogether. Our code works with any type and we directly pass around the custom behavior as funcs. Type parameters are only used to ensure that the arguments to those funcs are compatible.

The advantage of this is maximum flexibility. Any type which already has a Less method like above can simply be used with this directly by using method expressions, regardless of how the method is actually named:

    func main() {
        a := []time.Time{ /* … */ }
        Sort(a, time.Time.Before)
    }

There is also no boilerplate needed to customize sorting behavior:

    func main() {
        a := []int{42,23,1337}
        Sort(a, func(i, j int) bool {
            return j < i // reversed order
        })
    }

And you can provide helpers for common customizations:

    func Reversed[T any](less func(T, T) bool) (greater func(T, T) bool) {
        return func(a, b T) bool { return less(b, a) }
    }

This approach is arguably also more correct than the one above because it decouples the type from the comparison used. If I use a SearchTree as a set datatype, there is no real reason why the elements in the set would be specific to the comparison used. It should be “a set of string” not “a set of MyCustomlyOrderedString”. This reflects the fact that with the method constraint, we have to convert back-and-forth when putting things into the container or taking them out again.

The main disadvantage of this approach is that it means you can not have useful zero values. Your SearchTree type needs the Less field to be populated to work. So its zero value can not be used to represent an empty set.
You cannot even lazily initialize it (which is a common trick to make types which need initialization have a useful zero value) because you don’t know what it should be.

## Comparator types

There is a way to pass a function “statically”. That is, instead of passing around a func value, we can pass it as a type argument. The way to do that is to attach it as a method to a struct{} type:

    import "golang.org/x/exp/slices"

    type IntComparator struct{}

    func (IntComparator) Less(a, b int) bool {
        return a < b
    }

    func main() {
        a := []int{42,23,1337}
        less := IntComparator{}.Less // has type func(int, int) bool
        slices.SortFunc(a, less)
    }

Based on this, we can devise a mechanism to allow custom comparisons:

    // Comparator is a helper type used to compare two T values.
    type Comparator[T any] interface {
        ~struct{}
        Less(a, b T) bool
    }

    func Sort[C Comparator[T], T any](a []T) {
        var c C
        less := c.Less // has type func(T, T) bool
        // …
    }

    type SearchTree[C Comparator[T], T any] struct {
        // …
    }

The ~struct{} constrains any implementation of Comparator[T] to have underlying type struct{}. It is not strictly necessary, but it serves two purposes here:

1. It makes clear that Comparator[T] itself is not supposed to carry any state. It only exists to have its method called.
2. It ensures (as much as possible) that the zero value of C is safe to use. In particular, without it, Comparator[T] would be a normal interface type. And it would have a Less method of the right type, so it would implement itself. But a zero Comparator[T] is nil and would always panic if its method is called.

An implication of this is that it is not possible to have a Comparator[T] which uses an arbitrary func value. The Less method can not rely on having access to a func to call, for this approach to work.

But you can provide other helpers.
This can also be used to combine this approach with the above ones:

    type LessOperator[T constraints.Ordered] struct{}

    func (LessOperator[T]) Less(a, b T) bool {
        return a < b
    }

    type LessMethod[T Lesser[T]] struct{}

    func (LessMethod[T]) Less(a, b T) bool {
        return a.Less(b)
    }

    type Reversed[C Comparator[T], T any] struct{}

    func (Reversed[C, T]) Less(a, b T) bool {
        var c C
        return c.Less(b, a)
    }

The advantage of this approach is that it makes the zero value of SearchTree[C, T] useful. For example, a SearchTree[LessOperator[int], int] can be used directly, without extra initialization.

It also carries over the advantage of decoupling the comparison from the element type, which we got from accepting comparison functions.

One disadvantage is that the comparator can never be inferred. It always has to be specified in the instantiation explicitly¹. That’s similar to how we always had to pass a less function explicitly above.

Another disadvantage is that this always requires defining a type for comparisons. Where with the comparison function we could define customizations (like reversing the order) inline with a func literal, this mechanism always requires a method.

Lastly, this is arguably too clever for its own good. Understanding the purpose and idea behind the Comparator type is likely to trip up your users when reading the documentation.

## Summary

We are left with these trade-offs:

| | constraints.Ordered | Lesser[T] | func(T,T) bool | Comparator[T] |
|---|---|---|---|---|
| Predeclared types | 👍 | 👎 | 👎 | 👎 |
| Composite types | 👎 | 👍 | 👍 | 👍 |
| Custom order | 👎 | 👍 | 👍 | 👍 |
| Reversal helpers | 👍 | 👎 | 👍 | 👍 |
| Type boilerplate | 👍 | 👎 | 👍 | 👎 |
| Useful zero value | 👍 | 👍 | 👎 | 👍 |
| Type inference | 👍 | 👍 | 👍 | 👎 |
| Coupled Type/Order | 👎 | 👎 | 👍 | 👍 |
| Clarity | 👍 | 🤷² | 👍 | 👎 |

One thing standing out in this table is that there is no way to both support predeclared types and support user defined types.

It would be great if there was a way to support multiple of these mechanisms using the same code.
That is, it would be great if we could write something like

    // Ordered is a constraint to allow a type to be sorted.
    // If a Less method is present, it takes precedence.
    type Ordered[T any] interface {
        constraints.Ordered | Lesser[T]
    }

Unfortunately, allowing this is harder than one might think.

Until then, you might want to provide multiple APIs to allow your users more flexibility. The standard library currently seems to be converging on providing a constraints.Ordered version and a comparison function version. The latter gets a Func suffix to the name. See the experimental slices package for an example.

1. Though as we put the Comparator[T] type parameter first, we can infer T from the Comparator. ↩︎
2. It’s a little bit worse, but probably fine. ↩︎

## 2022-05-16

### Mero’s Blog

#### Calculating type sets is harder than you think

Go 1.18 added the biggest and probably one of the most requested features of all time to the language: Generics. If you want a comprehensive introduction to the topic, there are many out there and I would personally recommend this talk I gave at the Frankfurt Gopher Meetup.

This blog post is not an introduction to generics, though. It is about this sentence from the spec:

> Implementation restriction: A compiler need not report an error if an operand’s type is a type parameter with an empty type set.

As an example, consider this interface:

    type C interface {
        int
        M()
    }

This constraint can never be satisfied. It says that a type has to be both the predeclared type int and have a method M(). But predeclared types in Go do not have any methods. So there is no type satisfying C and its type set is empty. The compiler accepts it just fine, though. That is what this clause from the spec is about.

This decision might seem strange to you. After all, if a type set is empty, it would be very helpful to report that to the user. They obviously made a mistake - an empty type set can never be used as a constraint.
A function using it could never be instantiated.

I want to explain why that sentence is there and also go into a couple of related design decisions of the generics design. I’m trying to be expansive in my explanation, which means that you should not need any special knowledge to understand it. It also means, some of the information might be boring to you - feel free to skip the corresponding sections.

That sentence is in the Go spec because it turns out to be hard to determine if a type set is empty. Hard enough, that the Go team did not want to require an implementation to solve that. Let’s see why.

## P vs. NP

When we talk about whether or not a problem is hard, we often group problems into two big classes:

1. Problems which can be solved reasonably efficiently. This class is called P.
2. Problems which can be verified reasonably efficiently. This class is called NP.

The first obvious follow up question is “what does ‘reasonably efficient’ mean?”. The answer to that is “there is an algorithm with a running time polynomial in its input size”¹.

The second obvious follow up question is “what’s the difference between ‘solving’ and ‘verifying’?”.

Solving a problem means what you think it means: Finding a solution. If I give you a number and ask you to solve the factorization problem, I’m asking you to find a (non-trivial) factor of that number.

Verifying a problem means that I give you a solution and I’m asking you if the solution is correct. For the factorization problem, I’d give you two numbers and ask you to verify that the second is a factor of the first.

These two things are often very different in difficulty. If I ask you to give me a factor of 297863737, you probably know no better way than to sit down and try to divide it by a lot of numbers and see if it comes out evenly. But if I ask you to verify that 9883 is a factor of that number, you just have to do a bit of long division and it either divides it, or it does not.
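The difference between solving and verifying can be sketched in a few lines of Go. The function names `verifyFactor` and `findFactor` are made up for this illustration, and trial division is just the naive approach from the text, not the best known factoring algorithm. Verifying takes a single division, while finding a factor this way takes a number of steps that grows exponentially with the number of digits of the input.

```go
package main

import "fmt"

// verifyFactor checks a proposed solution: one division is enough.
func verifyFactor(n, candidate uint64) bool {
	return candidate > 1 && candidate < n && n%candidate == 0
}

// findFactor solves the problem by trial division. It returns a
// non-trivial factor of n, or 0 if n has none.
func findFactor(n uint64) uint64 {
	for d := uint64(2); d*d <= n; d++ {
		if n%d == 0 {
			return d
		}
	}
	return 0
}

func main() {
	const n = 297863737

	// Verifying: 9883 * 30139 = 297863737, so this prints true.
	fmt.Println(verifyFactor(n, 9883))

	// Solving: try divisors one by one until one fits.
	f := findFactor(n)
	fmt.Println(f, verifyFactor(n, f))
}
```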
It turns out, that every problem which is efficiently solvable is also efficiently verifiable. You can just calculate the solution and compare it to the given one. So every problem in P is also in NP². But it is a famously open question whether the opposite is true - that is, we don’t really know, if there are problems which are hard to solve but easy to verify.

This is hard to know in general. Because us not having found an efficient algorithm to solve a problem does not mean there is none. But in practice we usually assume that there are some problems like that.

One fact that helps us talk about hard problems, is that there are some problems which are as hard as possible in NP. That means we were able to prove that if you can solve one of these problems you can use that to solve any other problem in NP. These problems are called “NP-complete”.

That is, to be frank, plain magic and explaining it is far beyond my capabilities. But it helps us to tell if a given problem is hard, by doing it the other way around. If solving problem X would enable us to solve one of these NP-complete problems then solving problem X is obviously itself NP-complete and therefore probably very hard. This is called a “proof by reduction”.

One example of such a problem is boolean satisfiability. And it is used very often to prove a problem is hard.

## SAT

Imagine I give you a boolean function. The function has a bunch of bool arguments and returns bool, by joining its arguments with logical operators into a single expression. For example:

    func F(x, y, z bool) bool {
        return ((!x && y) || z) && (x || !y)
    }

If I give you values for these arguments, you can efficiently tell me if the formula evaluates to true or false. You just substitute them in and evaluate every operator.
For example

    F(true, true, false)
        → ((!true && true) || false) && (true || !true)
        → ((false && true) || false) && (true || !true)
        → ((false && true) || false) && (true || false)
        → ((false && true) || false) && true
        → (false && true) || false
        → false && true
        → false

This takes at most one step per operator in the expression. So it takes a linear number of steps in the length of the input, which is very efficient.

But if I only give you the function and ask you to find arguments which make it return true - or even to find out whether such arguments exist - you probably have to try out all possible input combinations to see if any of them does. That’s easy for three arguments. But for $$n$$ arguments there are $$2^n$$ possible assignments, so it takes exponential time in the number of arguments.

The problem of finding arguments that make such a function return true (or proving that no such arguments exist) is called “boolean satisfiability” and it is NP-complete.

It is extremely important in what form the expression is given, though. Some forms make it pretty easy to solve, while others make it hard.

For example, every expression can be rewritten into what is called a “Disjunctive Normal Form” (DNF). It is called that because it consists of a series of conjunction (&&) terms, joined together by disjunction (||) operators³:

    func F_DNF(x, y, z bool) bool {
        return (x && z) || (!y && z)
    }

(You can verify that this is the same function as above, by trying out all 8 input combinations)

Each term has a subset of the arguments, possibly negated, joined by &&. The terms are then joined together using ||.

Solving the satisfiability problem for an expression in DNF is easy:

1. Go through the individual terms. || is true if and only if either of its operands is true. So for each term:
    - If it contains both an argument and its negation (x && !x) it can never be true. Continue to the next term.
    - Otherwise, you can infer valid arguments from the term:
        - If it contains x, then we must pass true for x
        - If it contains !x, then we must pass false for x
        - If it contains neither, then what we pass for x does not matter and either value works.
    - The term then evaluates to true with these arguments, so the entire expression does.
2. If none of the terms can be made true, the function can never return true and there is no valid set of arguments.

On the other hand, there is also a “Conjunctive Normal Form” (CNF). Here, the expression is a series of disjunction (||) terms, joined together with conjunction (&&) operators:

    func F_CNF(x, y, z bool) bool {
        return (!x || z) && (y || z) && (x || !y)
    }

(Again, you can verify that this is the same function)

For this, the idea of our algorithm does not work. To find a solution, you have to take all terms into account simultaneously. You can’t just tackle them one by one. In fact, solving satisfiability on CNF (often abbreviated as “CNFSAT”) is NP-complete⁴.

It turns out that every boolean function can be written as a single expression using only ||, && and !. In particular, every boolean function has a DNF and a CNF.

Very often, when we want to prove a problem is hard, we do so by reducing CNFSAT to it. That’s what we will do for the problem of calculating type sets. But there is one more preamble we need.

## Sets and Satisfiability

There is an important relationship between sets and boolean functions.

Say we have a type T and a Universe which contains all possible values of T. If we have a func(T) bool, we can create a set from that, by looking at all objects for which the function returns true:

    var Universe Set[T]

    func MakeSet(f func(T) bool) Set[T] {
        s := make(Set[T])
        for v := range Universe {
            if f(v) {
                s.Add(v)
            }
        }
        return s
    }

This set contains exactly all elements for which f is true. So calculating f(v) is equivalent to checking s.Contains(v).
And checking if s is empty is equivalent to checking if f can ever return true.

We can also go the other way around:

    func MakeFunc(s Set[T]) func(T) bool {
        return func(v T) bool {
            return s.Contains(v)
        }
    }

So in a sense func(T) bool and Set[T] are “the same thing”. We can transform a question about one into a question about the other and back.

As we observed above it is important how a boolean function is given. To take that into account we have to also convert boolean operators into set operations:

    // Union(s, t) contains all elements which are in s *or* in t.
    func Union(s, t Set[T]) Set[T] {
        return MakeSet(func(v T) bool {
            return s.Contains(v) || t.Contains(v)
        })
    }

    // Intersect(s, t) contains all elements which are in s *and* in t.
    func Intersect(s, t Set[T]) Set[T] {
        return MakeSet(func(v T) bool {
            return s.Contains(v) && t.Contains(v)
        })
    }

    // Complement(s) contains all elements which are *not* in s.
    func Complement(s Set[T]) Set[T] {
        return MakeSet(func(v T) bool {
            return !s.Contains(v)
        })
    }

And back:

    // Or creates a function which returns if f or g is true.
    func Or(f, g func(T) bool) func(T) bool {
        return MakeFunc(Union(MakeSet(f), MakeSet(g)))
    }

    // And creates a function which returns if f and g are true.
    func And(f, g func(T) bool) func(T) bool {
        return MakeFunc(Intersect(MakeSet(f), MakeSet(g)))
    }

    // Not creates a function which returns if f is false
    func Not(f func(T) bool) func(T) bool {
        return MakeFunc(Complement(MakeSet(f)))
    }

The takeaway from all of this is that constructing a set using Union, Intersect and Complement is really the same as writing a boolean function using ||, && and !.

And proving that a set constructed in this way is empty is the same as proving that a corresponding boolean function is never true.

And because checking that a boolean function is never true is NP-complete, so is checking if one of the sets constructed like this is empty.

With this, let us look at the specific sets we are interested in.
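The correspondence between boolean functions and sets can be made concrete with a small runnable sketch. It fixes the universe to all 8 assignments of three bool arguments and replaces the generic Set[T] from the text with a plain map; the names `Assignment`, `universe`, `FDNF`, `FCNF` and `Contradiction` are made up for this illustration. It also doubles as the brute-force check that the three forms of F from the SAT section describe the same function.

```go
package main

import "fmt"

// Assignment is one element of the universe: a value for each argument.
type Assignment struct{ X, Y, Z bool }

// Set is a set of assignments.
type Set map[Assignment]struct{}

// universe returns all 2^3 = 8 possible assignments.
func universe() []Assignment {
	var u []Assignment
	for _, x := range []bool{false, true} {
		for _, y := range []bool{false, true} {
			for _, z := range []bool{false, true} {
				u = append(u, Assignment{x, y, z})
			}
		}
	}
	return u
}

// MakeSet collects all assignments for which f returns true.
func MakeSet(f func(Assignment) bool) Set {
	s := make(Set)
	for _, v := range universe() {
		if f(v) {
			s[v] = struct{}{}
		}
	}
	return s
}

// F, FDNF and FCNF are the three forms of the example function.
func F(a Assignment) bool    { return ((!a.X && a.Y) || a.Z) && (a.X || !a.Y) }
func FDNF(a Assignment) bool { return (a.X && a.Z) || (!a.Y && a.Z) }
func FCNF(a Assignment) bool { return (!a.X || a.Z) && (a.Y || a.Z) && (a.X || !a.Y) }

// Contradiction is never true, so its set is empty.
func Contradiction(a Assignment) bool { return a.X && !a.X }

func main() {
	// Each form is satisfied by the same three assignments.
	fmt.Println(len(MakeSet(F)), len(MakeSet(FDNF)), len(MakeSet(FCNF))) // 3 3 3

	// An empty set corresponds to an unsatisfiable function.
	fmt.Println(len(MakeSet(Contradiction)) == 0) // true
}
```

MakeSet visits every element of the universe, which is exactly the exponential brute-force search: with n arguments the loop would run 2^n times.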
## Basic interfaces as type sets

Interfaces in Go are used to describe sets of types. For example, the interface

    type S interface {
        X()
        Y()
        Z()
    }

is “the set of all types which have a method X() and a method Y() and a method Z()”.

We can also express set intersection, using interface embedding:

    type S interface { X() }
    type T interface { Y() }
    type U interface {
        S
        T
    }

This expresses the intersection of S and T as an interface. Or we can view the property “has a method X()” as a boolean variable and think of this as the formula x && y.

Surprisingly, there is also a limited form of negation. It happens implicitly, because a type can not have two different methods with the same name. Implicitly, if a type has a method X() it does not have a method X() int for example:

    type X interface { X() }
    type NotX interface{ X() int }

There is a small snag: A type can have neither a method X() nor have a method X() int. That’s why our negation operator is limited. Real boolean variables are always either true or false, whereas our negation also allows them to be neither. In mathematics we say that this logic language lacks the law of the excluded middle (also called “Tertium Non Datur” - “there is no third”). For this section, that does not matter. But we have to worry about it later.

Because we have intersection and negation, we can express interfaces which could never be satisfied by any type (i.e. which describe an empty type set):

    interface{ X; NotX }

The compiler rejects such interfaces. But how can it do that? Did we not say above that checking if a set is empty is NP-complete?

The reason this works is that we only have negation and conjunction (&&). So all the boolean expressions we can build with this language have the form

    x && y && !z

These expressions are in DNF! We have a term, which contains a couple of variables - possibly negated - and joins them together using &&. We don’t have ||, so there is only a single term.
Solving satisfiability in DNF is easy, as we said. So with the language as we have described it so far, we can only express type sets which are easy to check for emptiness.

## Adding unions

Go 1.18 extends the interface syntax. For our purposes, the important addition is the | operator:

    type S interface{
        A | B
    }

This represents the set of all types which are in the union of the type sets A and B - that is, it is the set of all types which are in A or in B (or both).

This means our language of expressible formulas now also includes a ||-operator - we have added set unions and set unions are equivalent to || in the language of formulas. What’s more, the form of our formula is now a conjunctive normal form - every line is a term of || and the lines are connected by &&:

    type X interface { X() }
    type NotX interface{ X() int }

    type Y interface { Y() }
    type NotY interface{ Y() int }

    type Z interface { Z() }
    type NotZ interface{ Z() int }

    // (!x || z) && (y || z) && (x || !y)
    type S interface {
        NotX | Z
        Y | Z
        X | NotY
    }

This is not quite enough to prove NP-completeness though, because of the snag above. If we want to prove that it is easy, it does not matter that a type can have neither method. But if we want to prove that it is hard, we really need an exact equivalence between boolean functions and type sets. So we need to guarantee that a type has one of our two contradictory methods.

“Luckily”, the | operator gives us a way to fix that:

    type TertiumNonDatur interface {
        X | NotX
        Y | NotY
        Z | NotZ
    }

    // (!x || z) && (y || z) && (x || !y)
    type S interface {
        TertiumNonDatur

        NotX | Z
        Y | Z
        X | NotY
    }

Now any type which could possibly implement S must have either an X() or an X() int method, because it must implement TertiumNonDatur as well. So this extra interface helps us to get the law of the excluded middle into our language of type sets.

With this, checking if a type set is empty is in general as hard as checking if an arbitrary boolean formula in CNF has no solution.
As described above, that is NP-complete.

Even worse, we want to define which operations are allowed on a type parameter by saying that it is allowed if every type in a type set supports it. However, that check is also NP-complete.

The easy way to prove that is to observe that if a type set is empty, every operator should be allowed on a type parameter constrained by it. Because any statement about “every element of the empty set” is true⁵.

But this would mean that type-checking a generic function would be NP-complete. If an operator is used, we have to at least check if the type set of its constraint is empty. Which is NP-complete.

## Why do we care?

A fair question is “why do we even care? Surely these cases are super exotic. In any real program, checking this is trivial”.

That’s true, but there are still reasons to care:

- Go has the goal of having a fast compiler. And importantly, one which is guaranteed to be fast for any program. If I give you a Go program, you can be reasonably sure that it compiles quickly, in a time frame predictable by the size of the input.

  If I can craft a program which compiles slowly - and may take longer than the lifetime of the universe - this is no longer true.

  This is especially important for environments like the Go playground, which regularly compiles untrusted code.
- NP-complete problems are notoriously hard to debug if they fail.

  If you use Linux, you might have occasionally run into a problem where you accidentally tried installing conflicting versions of some package. And if so, you might have noticed that your computer first chugged along for a while and then gave you an unhelpful error message about the conflict. And maybe you had trouble figuring out which packages declared the conflicting dependencies.

  This is typical for NP-complete problems. As an exact solution is often too hard to compute, they rely on heuristics and randomization and it’s hard to work backwards from a failure.
- We generally don’t want the correctness of a Go program to depend on the compiler used. That is, a program should not suddenly stop compiling because you used a different compiler or the compiler was updated to a new Go version.

  But NP-complete problems don’t allow us to calculate an exact solution. They always need some heuristic (even if it is just “give up after a bit”). If we don’t want the correctness of a program to be implementation defined, that heuristic must become part of the Go language specification. But these heuristics are very complex to describe. So we would have to spend a lot of room in the spec for something which does not give us a very large benefit.

Note that Go also decided to restrict the version constraints a go.mod file can express, for exactly the same reasons. Go has a clear priority, not to require too complicated algorithms in its compilers and tooling. Not because they are hard to implement, but because the behavior of complicated algorithms also tends to be hard to understand for humans.

So requiring to solve an NP-complete problem is out of the question.

## The fix

Given that there must not be an NP-complete problem in the language specification and given that Go 1.18 was released, this problem must have somehow been solved.

What changed is that the language for describing interfaces was limited from what I described above. Specifically

> Implementation restriction: A union (with more than one term) cannot contain the predeclared identifier comparable or interfaces that specify methods, or embed comparable or interfaces that specify methods.

This disallows the main mechanism we used to map formulas to interfaces above. We can no longer express our TertiumNonDatur type, or the individual | terms of the formula, as the respective terms specify methods. Without specifying methods, we can’t get our “implicit negation” to work either.
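As a rough illustration of where the restriction draws the line, the following sketch shows two constraints that remain expressible and, in a comment, one that does not. All names (`Number`, `Stringish`, `Bad`, `Double`, `Describe`, `Name`) are made up for this example.

```go
package main

import "fmt"

// Number is still allowed: every term of the union is a plain type.
type Number interface {
	~int | ~float64
}

// Stringish is still allowed: a type term and a method are combined by
// intersection (separate lines), not by a union.
type Stringish interface {
	~string
	String() string
}

// Bad is rejected by the compiler, because its union terms are interfaces
// that specify methods:
//
//	type Bad interface {
//		fmt.Stringer | error
//	}

// Double works for every type in Number's type set.
func Double[T Number](v T) T { return v + v }

// Describe works for every string-based type with a String method.
func Describe[T Stringish](v T) string { return v.String() }

// Name is a string-based type with a String method, so it satisfies Stringish.
type Name string

func (n Name) String() string { return string(n) }

func main() {
	fmt.Println(Double(21))               // 42
	fmt.Println(Describe(Name("gopher"))) // gopher
}
```

Uncommenting `Bad` should make the program fail to compile, which is the restriction at work: the `X | NotX` terms of TertiumNonDatur have the same shape.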
The hope is that this change (among a couple of others) is sufficient to ensure that we can always calculate type sets accurately. Which means I pulled a bit of a bait-and-switch: I said that calculating type sets is hard. But as they were actually released, they might not be.

The reason I wrote this blog post anyways is to explain the kind of problems that exist in this area. It is easy to say we have solved this problem once and for all.

But to be certain, someone should prove this - either by writing a proof that the problem is still hard or by writing an algorithm which solves it efficiently.

There are also still discussions about changing the generics design. As one example, the limitations we introduced to fix all of this made one of the use cases from the design doc impossible to express. We might want to tweak the design to allow this use case. We have to look out in these discussions, so we don’t re-introduce NP-completeness. It took us some time to even detect it when the union operator was proposed.

And there are other kinds of “implicit negations” in the Go language. For example, a struct can not have both a field and a method with the same name. Or being one type implies not being another type (so interface{int} implicitly negates interface{string}).

All of which is to say that even if the problem might no longer be NP-complete - I hope that I convinced you it is still more complicated than you might have thought.

If you want to discuss this further, you can find links to my social media at the bottom of this site.

I want to thank my beta-readers for helping me improve this article. Namely arnehormann, @johanbrandhorst, @mvdan_, @_myitcv, @readcodesing, @rogpeppe and @zekjur. They took a frankly unreasonable chunk of time out of their day. And their suggestions were invaluable.

1. It should be pointed out, though, that “polynomial” can still be extremely inefficient. $$n^{1000}$$ still grows extremely fast, but is polynomial.
And for many practical problems, even $$n^3$$ is intolerably slow. But for complicated reasons, there is a qualitatively important difference between “polynomial” and “exponential”6 run time. So you just have to trust me that the distinction makes sense. ↩︎ 2. These names might seem strange, by the way. P is easy to explain: It stands for “polynomial”. NP doesn’t mean “not polynomial” though. It means “non-deterministic polynomial”. A non-deterministic computer, in this context, is a hypothetical machine which can run arbitrarily many computations simultaneously. A program which can be verified efficiently by any computer can be solved efficiently by a non-deterministic one. It just tries out all possible solutions at the same time and returns a correct one. Thus, being able to verify a problem on a normal computer means being able to solve it on a non-deterministic one. That is why the two definitions of NP “verifiable by a classical computer” and “solvable by a non-deterministic computer” mean the same thing. ↩︎ 3. You might complain that it is hard to remember if the “disjunctive normal form” is a disjunction of conjunctions, or a conjunction of disjunctions - and that no one can remember which of these means && and which means || anyways. You would be correct. ↩︎ 4. You might wonder why we can’t just solve CNFSAT by transforming the formula into DNF and solving that. The answer is that the transformation can make the formula exponentially larger. So even though solving the problem on DNF is linear in the size the DNF formula, that size is exponential in the size of the CNF formula. So we still use exponential time in the size of the CNF formula. ↩︎ 5. This is called the principle of explosion or “ex falso quodlibet” (“from falsehoold follows anything”). Many people - including many first year math students - have anxieties and confusion around this principle and feel that it makes no sense. So I have little hope that I can make it palatable to you. 
But it is extremely important for mathematics to “work” and it really is the most reasonable way to set things up. Sorry. ↩︎ 6. Yes, I know that there are complexity classes between polynomial and exponential. Allow me the simplification. ↩︎ ## 2022-05-14 ### sECuREs website #### 25 Gbit/s HTTP and HTTPS download speeds Now that I recently upgraded my internet connection to 25 Gbit/s, I was curious how hard or easy it is to download files via HTTP and HTTPS over a 25 Gbit/s link. I don’t have another 25 Gbit/s connected machine other than my router, so I decided to build a little lab for tests like these 🧑‍🔬 ## Hardware and Software setup I found a Mellanox ConnectX-4 Lx for the comparatively low price of 204 CHF on digitec: To connect it to my router, I ordered a MikroTik XS+DA0003 SFP28/SFP+ Direct Attach Cable (DAC) with it. I installed the network card into my old workstation (on the right) and connected it with the 25 Gbit/s DAC to router7 (on the left): ### 25 Gbit/s router (left) Component Model Mainboard ASRock B550 Taichi CPU AMD Ryzen 5 5600X 6-Core Processor Network card Intel XXV710 Linux Linux 5.17.4 (router7) curl 7.83.0 from debian bookworm Go net/http from Go 1.18 router7 comes with TCP BBR enabled by default. ### Old workstation (right) Component Model Mainboard ASUS PRIME Z370-A CPU Intel i9-9900K CPU @ 3.60GHz Network card Mellanox ConnectX-4 Linux 5.17.5 (Arch Linux) nginx 1.21.6 caddy 2.4.3 ## Test preparation Before taking any measurements, I do one full download so that the file contents are entirely in the Linux page cache, and the measurements therefore no longer contain the speed of the disk. big.img in the tests below refers to the 35 GB test file I’m downloading, which consists of distri-disk.img repeated 5 times. 
## T1: HTTP download speed (unencrypted)

### T1.1: Single TCP connection

The simplest test is using just a single TCP connection, for example:

    curl -v -o /dev/null http://oldmidna:8080/distri/tmp/big.img
    ./httpget25 http://oldmidna:8080/distri/tmp/big.img

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 23.4 |
| curl | caddy | 23.4 |
| Go | nginx | 20 |
| Go | caddy | 20.2 |

curl can saturate a 25 Gbit/s link without any trouble.

The Go net/http package is slower and comes in at 20 Gbit/s.

### T1.2: Multiple TCP connections

Running 4 of these downloads concurrently is a reliable and easy way to saturate a 25 Gbit/s link:

    for i in $(seq 0 4)
    do
      curl -v -o /dev/null http://oldmidna:8080/distri/tmp/big.img &
    done

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 23.4 |
| curl | caddy | 23.4 |
| Go | nginx | 23.4 |
| Go | caddy | 23.4 |

## T2: HTTPS download speed (encrypted)

At link speeds this high, enabling TLS slashes bandwidth in half or worse.

Using 4 TCP connections allows saturating a 25 Gbit/s link.

Caddy uses more CPU to serve files compared to nginx.

### T2.1: Single TCP connection

This test works the same as T1.1, but with an HTTPS URL:

    curl -v -o /dev/null --insecure https://oldmidna:8443/distri/tmp/big.img
    ./httpget25 https://oldmidna:8443/distri/tmp/big.img

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 8 |
| curl | caddy | 7.5 |
| Go | nginx | 12 |
| Go | caddy | 7.2 |

### T2.2: Multiple TCP connections

This test works the same as T1.2, but with an HTTPS URL:

    for i in $(seq 0 4)
    do
      curl -v -o /dev/null --insecure https://oldmidna:8443/distri/tmp/big.img &
    done

Curiously, the Go net/http client downloading from caddy cannot saturate a 25 Gbit/s link.

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 23.4 |
| curl | caddy | 23.4 |
| Go | nginx | 23.4 |
| Go | caddy | 21.6 |

## T3: HTTPS with Kernel TLS (KTLS)

Linux 4.13 got support for Kernel TLS back in 2017.

nginx 1.21.4 introduced support for Kernel TLS, and they have a blog post on how to configure it.

In terms of download speeds, there is no difference with or without KTLS. But, enabling KTLS noticeably reduces CPU usage, from ≈10% to a steady 2%.

For even newer network cards such as the Mellanox ConnectX-6, the kernel can even offload TLS onto the network card!

### T3.1: Single TCP connection

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 8 |
| Go | nginx | 12 |

### T3.2: Multiple TCP connections

| Client | Server | Gbit/s |
|---|---|---|
| curl | nginx | 23.4 |
| Go | nginx | 23.4 |

## Conclusions

When downloading from nginx with 1 TCP connection, with TLS encryption enabled (HTTPS), the Go net/http client is faster than curl!

Caddy is slightly slower than nginx, which manifests itself in slower speeds with curl and even slower speeds with Go’s net/http.

To max out 25 Gbit/s, even when using TLS encryption, just use 3 or more connections in parallel. This helps with HTTP and HTTPS, with any combination of client and server.
## Appendix

Go net/http test program httpget25.go

    package main

    import (
    	"crypto/tls"
    	"flag"
    	"fmt"
    	"io"
    	"log"
    	"net/http"
    )

    // get downloads url and discards the body, so that only the transfer
    // itself is exercised.
    func get(url string) error {
    	resp, err := http.Get(url)
    	if err != nil {
    		return err
    	}
    	defer resp.Body.Close()
    	if resp.StatusCode != http.StatusOK {
    		return fmt.Errorf("unexpected HTTP status code: want %v, got %v", http.StatusOK, resp.Status)
    	}
    	_, err = io.Copy(io.Discard, resp.Body)
    	return err
    }

    func httpget25() error {
    	// Equivalent of curl --insecure: do not verify the server certificate.
    	http.DefaultTransport.(*http.Transport).TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
    	for _, arg := range flag.Args() {
    		if err := get(arg); err != nil {
    			return err
    		}
    	}
    	return nil
    }

    func main() {
    	flag.Parse()
    	if err := httpget25(); err != nil {
    		log.Fatal(err)
    	}
    }

Caddy config file Caddyfile

    {
      local_certs
      http_port 8080
      https_port 8443
    }
    http://oldmidna:8080 {
      file_server browse
    }
    https://oldmidna:8443 {
      file_server browse
    }

nginx installation instructions

    mkdir -p ~/lab25
    cd ~/lab25
    wget https://nginx.org/download/nginx-1.21.6.tar.gz
    tar xf nginx-1.21.6.tar.gz
    wget https://www.openssl.org/source/openssl-3.0.3.tar.gz
    tar xf openssl-3.0.3.tar.gz
    cd nginx-1.21.6
    ./configure --with-http_ssl_module --with-http_v2_module --with-openssl=$HOME/lab25/openssl-3.0.3 --with-openssl-opt=enable-ktls
    make -j8
    cd objs
    ./nginx -c nginx.conf -p $HOME/lab25

nginx config file nginx.conf

    worker_processes  auto;

    pid        logs/nginx.pid;

    daemon off;

    events {
        worker_connections  1024;
    }

    http {
        include       mime.types;
        default_type  application/octet-stream;

        access_log /home/michael/lab25/logs/access.log  combined;

        sendfile        on;
        sendfile_max_chunk 2m;

        keepalive_timeout  65;

        server {
            listen       8080;
            listen [::]:8080;
            server_name  localhost;

            root /srv/repo.distr1.org/;

            location / {
                index index.html index.htm;
            }

            error_page   500 502 503 504  /50x.html;
            location = /50x.html {
                root /usr/share/nginx/html;
            }

            location /distri {
                autoindex on;
            }
        }

        server {
            listen 8443 ssl;
            listen [::]:8443 ssl;
            server_name localhost;

            ssl_certificate nginx-ecc-p256.pem;
            ssl_certificate_key nginx-ecc-p256.key;

            #ssl_conf_command Options KTLS;

            ssl_buffer_size 32768;
            ssl_protocols TLSv1.3;

            root /srv/repo.distr1.org/;

            location / {
                index index.html index.htm;
            }

            error_page   500 502 503 504  /50x.html;
            location = /50x.html {
                root /usr/share/nginx/html;
            }

            location /distri {
                autoindex on;
            }
        }
    }


## 2022-04-23

### sECuREs website

#### My upgrade to 25 Gbit/s Fiber To The Home

My favorite internet service provider, init7, is rolling out faster speeds with their infrastructure upgrade. Last week, the point of presence (POP) that my apartment’s fiber connection terminates in was upgraded, so now I am enjoying a 25 Gbit/s fiber internet connection!

## My first internet connections

(Feel free to skip right to the 25 Gbit/s announcement section, but I figured this would be a good point to reflect on the last 20 years of internet connections for me!)

The first internet connection that I consciously used was a symmetric DSL connection that my dad († 2020) shared between his home office and the rest of the house, which was around the year 2000. My dad was an early adopter and was connected to the internet well before then using dial up connections, but the SDSL connection in our second house was the first connection I remember using myself. It wasn’t particularly fast in terms of download speed — I think it delivered 256 kbit/s or something along those lines.

I encountered two surprises with this internet connection. The first surprise was that the upload speed (also 256 kbit/s — it was a symmetric connection) was faster than other people’s. At the time, even DSL connections with much higher download speeds were asymmetric (ADSL) and came with only 128 kbit/s upload. I learnt this while making first contact with file sharing: people kept asking me to stay online so that their transfers would complete more quickly.

The second surprise was the concept of a metered connection, specifically one where you pay more the more data you transfer. During the aforementioned file sharing experiments, it never crossed my mind that down- or uploading files could result in extra charges.

These two facts combined resulted in a 3000 € surprise bill for my dad!

Luckily, his approach to solve this problem wasn’t to restrict my internet usage, but rather to buy a cheap, separate ADSL flatrate line for the family (from Telekom, which he hated), while he kept the good SDSL metered line for his business.

The different connection speeds and characteristics have always interested me, and I used several other connections over the years, all of which felt limiting. The ADSL connection at my parents’ place started at 1 Mbit/s, was upgraded first to 3 Mbit/s, then 6 Mbit/s, and eventually reached its limit at 16 Mbit/s. When I spent one semester in Ireland, I had a 9 Mbit/s ADSL connection, and then later in Zürich I started out with a 15 Mbit/s ADSL connection.

All of these connections have always felt limiting, like peeking through the keyhole to see a rich world behind, but not being able to open the door. We’ve had to set up (and tune) traffic shaping, and coordinate when large downloads were okay.

## My first fiber connection

The dream was always to leave ADSL behind and get a fiber connection. The advantages are numerous: lower latency (ADSL came with 40 ms at the time), much higher bandwidth (possibly Gigabit/s?) and typically the connection was established via ethernet (instead of PPPoE). Most importantly, once the fiber is there, you can upgrade both ends to achieve higher speeds.

In Zürich, I managed to get a fiber connection set up in my apartment after fighting bureaucracy for many months. The issue was that there was no permission slip on file at Swisscom. Either the owner of my apartment never signed it to begin with, or it got lost. This is not a state that the online fiber availability checker can represent, but once you know it, the fix is easy: just have Swisscom send out the form again, have the owner sign it, and a few weeks later, you can order!

One wrinkle was that availability was only fixed in the Swisscom checker, and it was unclear when EWZ or other providers would get an updated data dump. Hence, I ordered Swisscom fiber to get things moving as quickly as possible, and figured I could switch to a different provider later.

Here’s a picture of when the electrician pulled the fiber from the building entry endpoint (BEP) in the basement into my flat, from March 2014:

## Switching to fiber7

Only two months after I first got my fiber connection, init7 launched their fiber7 offering, and I switched from Swisscom to fiber7 as quickly as I could.

The switch was worth it in every single dimension:

• Swisscom charged over 200 CHF per month for a 1 Gbit/s download, 100 Mbit/s upload fiber connection. fiber7 costs only 65 CHF per month and comes with a symmetric 1 Gbit/s connection. (Other providers had to follow, so now symmetric is standard.)
• init7’s network performs much better than Swisscom’s: ping times dropped when I switched, and downloads are generally much faster. Note that this is with the same physical fiber line, so the difference is thanks to the numerous peerings that init7 maintains.
• init7 gives you a static IPv6 prefix (if you want) for free, and even delegates reverse DNS to your servers of choice.
• I enjoy init7’s unparalleled transparency. For example, check out the blog post about cost calculation if you’re ever curious if there could be a fiber7 POP in your area.

I have been very happy with my fiber7 connection ever since. What I wrote in 2014 regarding its performance remained true over the years — downloads were always fast for me, latencies were low, outages were rare (and came with good explanations).

I switched hardware multiple times over the years:

• First, I started with the Ubiquiti EdgeRouter Lite which could handle the full Gigabit line rate (the MikroTik router I originally ordered maxed out at about 500 Mbit/s!).
• In 2017, I switched to the Turris Omnia, an open hardware, open source software router that comes with automated updates.
• In July 2018, after my connectivity was broken due to an incompatibility between the DHCPv6 client on the Turris Omnia and fiber7, I started developing my own router7 in Go, my favorite programming language, mostly for fun, but also as a proof of concept for some cool features I think routers should have. For example, you can retro-actively start up Wireshark and open up a live ring buffer of the last few hours of network configuration traffic.

Notably, init7 encourages people to use their preferred router (Router Freedom).

## The 25 Gbit/s announcement

Over the years, other Swiss internet providers such as Swisscom and Salt introduced 10 Gbit/s offerings, so an obvious question was when init7 would follow suit.

People who were following init7 closely already knew that an infrastructure upgrade was coming. In 2020, init7 CEO Fredy Künzler disclosed that in 2021, init7 would start offering 10 Gbit/s.

What nobody expected before init7 announced it on their seventh birthday, however, was that init7 would start offering not only 10 Gbit/s (Fiber7-X), but also 25 Gbit/s connections (Fiber7-X2)! 🤯

This was init7’s announcement on Twitter:

With this move, init7 has done it again: they introduced an offer that is better than anything else in the Swiss internet market, perhaps even world-wide!

One interesting aspect is init7’s so-called «MaxFix principle»: maximum speed for a fixed price. No matter if you’re using 1 Gbit/s or 25 Gbit/s, you pay the same monthly fee. init7’s approach is to make the maximum bandwidth available to you, limited only by your physical connection. This is such a breath of fresh air compared to other ISPs that think rate-limiting customers to ridiculously low speeds is somehow acceptable on an FTTH offering 🙄 (recent example).

If you’re curious about the infrastructure upgrade that enabled this change, check out init7’s blog post about their new POP infrastructure.

## What for? The use-case

A common first reaction to fast network connections is the question: “For what do you need so much bandwidth?”

Interestingly enough, I heard this question as recently as last year, in the context of a Gigabit internet connection! Some people can’t imagine using more than 100 Mbit/s. And sure, from a certain perspective, I get it — that 100 Mbit/s connection will not be overloaded any time soon.

But, looking at when a line is overloaded is only one aspect to take into account when deciding how fast of a connection you want.

There is a lower limit where you notice your connection is slow. Back in 2014, a 2 Mbit/s connection was noticeably slow for regular web browsing. These days, even a 10 Mbit/s connection is noticeably slow when re-opening my browser and loading a few tabs in parallel.

So what should you get? A 100 Mbit/s line? 500 Mbit/s? 1000 Mbit/s? Personally, I like to not worry about it and just get the fastest line I can, to reduce any and all wait times as much as possible, whenever possible. It’s a freeing feeling! Here are a few specific examples:

• If I have to wait only 17 minutes to download a PS5 game, that can make the difference between an evening waiting in frustration, or playing the title I’ve been waiting for.
• If I can run a daily backup (over the internet) of all servers I care about without worrying that the transfers interfere with my work video calls, that gives me peace of mind.
• If I can transfer a Debian Code Search index to my computer for debugging when needed, that might make the difference between being able to use the limited spare time I have to debug or improve Debian Code Search, or having to postpone that improvement until I find more time.

Aside from my distaste for waiting, a fast and reliable fiber connection enables self-hosting. In particular for my distri Linux project where I explore fast package installation, it’s very appealing to connect it to the internet on as fast a line as possible. I want to optimize all the parts: software architecture and implementation, hardware, and network connectivity. But, for my hobby project budget, getting even a 10 Gbit/s line at a server hoster is too expensive, let alone a 25 Gbit/s line!

Lastly, even if there isn’t really a need to have such a fast connection, I hope you can understand that, after spending so many years of my life limited by slow connections, I’ll happily take the opportunity of a faster connection whenever I can. Especially at no additional monthly cost!

Right after the announcement dropped, I wanted to prepare my side of the connection and therefore ordered a MikroTik CCR2004, the only router that init7 lists as compatible. I returned the MikroTik CCR2004 shortly afterwards, mostly because of its annoying fan regulation (spins up to top speed for about 1 minute every hour or so), and also because MikroTik seems to have made no progress at all since I last used their products almost 10 years ago. Table-stakes features such as DNS resolution for hostnames within the local network are still not included!

I expect that more and more embedded devices with SFP28 slots (like the MikroTik CCR2004) will become available over the next few years (hopefully with better fan control!), but at the moment, the selection seems to be rather small.

For my router, I instead went with a custom PC build. Having more space available means I can run larger, slow-spinning fans that are not as loud. Plugging in high-end Intel network cards (2 × 25 Gbit/s, and 4 × 10 Gbit/s on the other one) turns a PC into a 25 Gbit/s capable router.

With my equipment sorted out, I figured it was time to actually place the order. I wasn’t in a hurry to order, because it was clear that it would be months before my POP could be upgraded. But, it can’t hurt to register my interest (just in case it influences the POP upgrade plan). Shortly after, I got back this email from init7 where they promised to send me the SFP module via post:

And sure enough, a few days later, I received the SFP28 module in the mail:

With my router build, and the SFP28 module, I had everything I needed for my side of the connection.

The other side of the connection was originally planned to be upgraded in fall 2021, but the global supply shortage imposed various delays on the schedule.

Eventually, the fiber7 POP list showed an upgrade date of April 2022 for my POP, and that turned out to be correct.

## The night of the upgrade

I had read Pim’s blog post on the upgrade of the 1790BRE POP in Brüttisellen, which contains a lot of super interesting details, so definitely check that one out, too!

Being able to plug in the SFP module into the new POP infrastructure yourself (like Pim did) sounded super cool to me, so I decided to reach out, and init7 actually agreed to let me stop by to plug in “my” fiber and SFP module!

Giddy with excitement, I left my place at just before 23:00 for a short walk to the POP building, which I had seen many times before, but never from the inside.

Patrick, the init7 engineer, met me in front of the building and exclaimed “Hey! You wrote my window manager!” What a coincidence :-). Luckily I had packed some i3 stickers that I could hand him as a small thank you.

Inside, I met the other init7 employee working on this upgrade. Pascal, init7’s CTO, was coordinating everything remotely.

Standing in front of init7’s rack, I spotted the old Cisco switch (at the bottom), and the new Cisco C9500-48Y4C switches that were already prepared (at the top). The SFP modules are for customers who decided to upgrade to 10 or 25 Gbit/s, whereas for the others, the old SFP modules would be re-used:

We then spent the next hour pulling fiber cables and SFP modules out of the old Cisco switch, and plugging them back into the new Cisco switch.

Just like the init7 engineer working with me (who is usually a software guy, too, he explained), I enjoy doing physical labor from time to time for variety. Especially with nice hardware like this, and when it’s for a good cause (faster internet)! It’s almost meditative, in a way, and I enjoyed the nice conversation we had while we were both moving the connections.

After completing about half of the upgrade (the top half of the old Cisco switch), I walked back to my place — still blissfully smiling all the way — to turn up my end of the connection while the others were still on site and could fix any mistakes.

After switching my uplink0 network interface to the faster network card, it also took a full reboot of my router for some reason, but then it recognized the SFP28 module without trouble and successfully established a 25 Gbit/s link! 🎉 🥳

I did a quick speed test to confirm and called it a night.

## Speed tests / benchmarks

Just like in the early days of Gigabit connections, my internet connection is now faster than the connection of many servers. It’s a luxury problem to be sure, but in case you’re curious how far a 25 Gbit/s connection gets you on the internet, I collected some speed test results in this section.

### Ookla speedtest.net

speedtest.net (run by Ookla) is the best way to measure fast connections that I’m aware of.

Here is my first 25 Gbit/s speedtest, which was run using the init7 speedtest server:

I also ran speedtests to all other servers that were listed for the broader Zürich area at the time, using the tamasboros/ookla-speedtest Docker image. As you can see, most speedtest servers are connected with a 10 Gbit/s port, and some (GGA Maur) even only with a 1 Gbit/s port:

| Speedtest server | Latency (ms) | Download (Mbit/s) | Upload (Mbit/s) |
|---|---|---|---|
| Init7 AG - Winterthur | 1.45 | 23530.27 | 23031.24 |
| fdcservers.net | 18.15 | 9386.29 | 1262.92 |
| GIB-Solutions AG - Schlieren | 6.64 | 9154.12 | 2207.68 |
| Monzoon Networks AG | 0.74 | 8874.85 | 6427.66 |
| Glattwerk AG | 0.92 | 8719.04 | 4008.28 |
| AltusHost B.V. | 0.80 | 8373.34 | 8518.90 |
| iWay AG - Zurich | 2.13 | 8337.56 | 8194.89 |
| Sunrise Communication AG | 9.04 | 8279.60 | 3109.34 |
| 31173 Services AB | 18.69 | 8279.75 | 1503.92 |
| Wingo | 4.25 | 6179.57 | 5248.36 |
| Netrics Zürich AG | 0.74 | 7910.78 | 8770.19 |
| Cloudflare - Zurich | 1.14 | 7410.97 | 2218.88 |
| Netprotect - Zurich | 0.87 | 7034.62 | 8948.01 |
| C41.ch - Zurich | 9.90 | 6792.60 | 690.33 |
| Goldenphone GmbH | 18.91 | 3116.32 | 659.23 |
| GGA Maur | 0.99 | 940.24 | 941.24 |

### Linux mirrors

For a few popular Linux distributions, I went through the mirror list and tried all servers in Switzerland and Germany. Only one or two would be able to deliver files at more than 1 Gigabit/s. Other mirror servers were either capped at 1 Gigabit/s, or wouldn’t even reach that (slow disks?).

Here are the fast ones:

• Debian: mirror1.infomaniak.com and mirror2.infomaniak.com
• Arch Linux: mirror.puzzle.ch
• Fedora Linux: mirrors.xtom.de
• Ubuntu Linux: mirror.netcologne.de and ubuntu.ch.altushost.com

### iperf3

Using iperf3 -P 2 -c speedtest.init7.net, iperf3 shows 23 Gbit/s:

[SUM]   0.00-10.00  sec  26.9 GBytes  23.1 Gbits/sec  597             sender
[SUM]   0.00-10.00  sec  26.9 GBytes  23.1 Gbits/sec                  receiver


It’s hard to find public iperf3 servers that are connected with a fast-enough port. I could only find one that claims to be connected via a 40 Gbit/s port, but it was unavailable when I wanted to test.

### Interested in a speed test?

Do you have a ≥ 10 Gbit/s line in Europe, too? Are you interested in a speed test? Reach out to me and we can set something up.

## Conclusion

What an exciting time to be an init7 customer! I still can’t quite believe that I now have a 25 Gbit/s connection in 2022, and it feels like I’m living 10 years in the future.

Thank you to Fredy, Pascal, Patrick, and all the other former and current init7 employees for showing how to run an amazing Internet Service Provider. Thank you for letting me peek behind the curtains, and keep up the good work! 💪

If you want to learn more, check out Pascal’s talk at DENOG:

## 2022-04-08

### RaumZeitLabor

#### Obatzda Wars - The Emmentaler stinks back

Dear Cheddar Knights,

this year’s annual assembly of the Rotwein-Rebell Alliance (the red wine rebel alliance) takes place on Saturday, 30 April 2022, on our home planet RZL under the motto “RoqueFort One”. Starting at 18:30, let us eat together against intergalactosic monotony! The Fontina Band (“Eat the same Tomme again… the same Tomme again”) will of course be part of the party, too.

Please register by email by 23 April 2022 and let us know whether you will bring Luke SkyVachard or CacioBacca along. We ask for payment of a flat cheese fee of at least 12 Idiazabal Credits per head.

May the Scamorza be with you
F-Leia-Rattie and the Ha’Niolos

P.S.: This is a public event. Please observe the hygiene concept!

## 2022-03-19

### sECuREs website

#### Smart Home components 🏠

I have tried a bunch of different Smart Home products over the last few years and figured I would give an overview of which ones I liked, which ones I disliked, and how I would go about selecting good Smart Home products to buy.

## Smart Lights

To me, the primary advantage of Smart Lights is the flexibility in where you place extra light switches, and the extra functions that become much easier with Smart Lights.

For example, I have added an extra light switch in the bed and next to the couch, without having to have an electrician tear up the walls to add more wiring. An “all-off” button is super handy at the end of the day or when watching a movie.

Other attractive use-cases include controlling lights based on time of the day, based on whether people are home, or based on a motion sensor.

I used the RGB color light bulb version of all of the below systems. In practice, we typically don’t change the color much, but it is nice to be able to adjust the color and brightness to something that fits the respective room. And, every once in a while, scenes that use color are fun!

### Moved away from: IKEA TRÅDFRI 👎

The first smart light system I used was IKEA TRÅDFRI. I figured as a system with a large user base, they would be inclined to improve it over time, and compatibility should be more likely than with other, smaller vendors.

Unfortunately the system is pretty much unchanged from when I first bought it many years ago.

You can easily find documentation about the API for using the TRÅDFRI gateway programmatically. When I looked for Go packages back in 2019, though, I did not find an attractive one, so I decided to use CoAP and DTLS myself.

The light switches are good in terms of features, and easy to install: you can just remove the old switch and glue the TRÅDFRI switch over the existing switch.

The downside of the light switches is that they are flimsy: because the switch is magnetically held in place in its case, it can easily fall on the floor when you bump against it.

Pairing the devices was always tricky for me. It got easier when I turned off all other ZigBee devices in my apartment before doing anything with IKEA devices.

At multiple points, the devices lost their pairing. It might have been when they ran out of battery.

The battery lifetime of the light switches was very poor: only about a year on average. They use the CR2032 form factor, which my charger does not support, so I couldn’t use rechargeables.

Swapping out the batteries and re-pairing the system every year or so quickly becomes tedious!

### Moved away from: Shelly Bulb 👎

Because I also bought some Shelly 1L smart relays, I figured I’d give the Shelly Bulb a try.

Instead of ZigBee, the Shelly Bulbs use WiFi. This makes them easy to get into your home network and does not require a separate gateway.

At 2 bulbs per room+hallway, and 2 buttons each, that sums up to having 16 extra devices in your WiFi network. This wasn’t a problem for me in practice, but depending on how stable your WiFi network is, it might be a concern.

Notably, this also means your lights can’t be controlled while your WiFi is unavailable.

In terms of physical light switches, you’ll need to use a separate product such as the Shelly Button. This is the weakest point of the system. The latency is noticeable, even when configuring a static IP address, which does make things better, but still not good. The Shelly Button is extremely simple, so dimming has to be emulated with double or triple-press actions.

Given that one typically interacts with this system multiple times a day via its switches, I think it makes sense to choose a system that has good switches.

On the plus side, the Shelly Button uses a rechargeable battery that can be charged from a USB power bank, which is a concept I really like.

### Philips Hue 👍

After the Shelly Bulb, I figured I’d try Philips Hue. It’s by far the most expensive system of the ones I have tried, but also by far the most polished and user-friendly.

People recommended the Feller Smart Light Control switches, which use energy harvesting (from you clicking them!) and hence don’t require a battery.

This makes it easy to place them anywhere, like next to the couch in the picture on the left.

Feller recommends extending existing installations by buying the next-larger mounting plate. Extending the box in the wall is not required, as no wires or in-wall space are needed. Drilling new holes for extra screws is required for stability, but that’s a lot more doable than extending the whole box. Here are some pictures before, during and after the installation:

### Shelly 1L 👍

The Shelly 1L is a very interesting device. It goes behind your existing device into the wall and makes it smart!

This allows you to make smart any existing lights that can’t easily be replaced by smart lights, for example a bathroom light built into the bathroom mirror cabinet.

You can also make existing light switches smart if you like the ones you already have and can’t exchange them.

Another use-case is to easily connect buttons or sensors into your network, for example door bells or door sensors.

The Shelly 1L is special in that this specific model can be installed when all you have is a live wire (i.e. wiring for a light switch).

One potential issue is that depending on the configuration and connected device’s power usage, the Shelly might emit a slight hum noise. So, don’t install one right next to your bed.

Another limitation is that while the Shelly works with both light switches (which change state) and light buttons (which generate an impulse), it can only distinguish between short and long press events when you use a light button. Newer light switches from Feller can be re-configured to function as a button, but if your model is too old you might need to replace the light switch with a button.

One weird issue I ran into was that after installing a new bathroom mirror cabinet, the relay of the connected Shelly 1L would no longer function correctly — the light just remained on, even when turning it off via the Shelly. I read on the Shelly forum that this could be caused by running the Shelly upside-down, and indeed, after turning it around, it started to work again!

## Smart Heating

Smart Heating systems are often advertised to save cost. I wanted to try it out, and was also interested in the temperature logging because my apartment is on the more humid side and I wanted some data to optimize the situation.

### HomeMatic 😐

I bought some HomeMatic temperature sensors and heating valve drives back in 2017. The hardware feels solid and was easy enough to install.

One massive downside of the system was the poor software quality of their Central Control Unit (CCU2). The web interface was super slow, looked very dated, and the whole thing kept running out of memory every 2 weeks or so. It was so bad that I re-implemented my own CCU in Go. I hear that by now, they have a new and better Control Unit version, though.

So far, one valve drive has failed with error code F1; I replaced it with a new one.

Turns out smart control of our heating does not seem to make any measurable difference. The rooms feel the same as before. No money is saved because the utility bill is divided equally among all tenants across the building (which seems to be standard in Switzerland), not billed for individual usage.

So, overall, I would not install smart heating valve drives again. The temperature sensors I still keep an eye on from time to time, but there are cheaper options if you only need temperature!

## Smart Lock

### Nuki 👍

During the pandemic, I was receiving packages at home and was relying on my door bell much more than usual, so I was looking for a way to make it smarter!

The first device I got was the Nuki Opener, a smart intercom system. It allows you to get notifications on your phone when the doorbell is rung, and to unlock the door from your phone.

I got this device because it was specifically marketed as compatible with the BTicino intercom system our house uses. Unfortunately, this turned out to be incorrect, so I ended up building a hardware-modified intercom unit that is connected to the Nuki Opener in analogue mode.

Once it actually works, it’s a convenient system, and having your doorbell generate desktop notifications with sound is just super useful when wearing headphones! Strongly recommended.

As you can see on the pictures, I’m powering the Nuki Opener via USB. It normally runs on batteries, but I want to minimize battery usage and swapping. A built-in rechargeable battery like in the Shelly devices would be a neat improvement to the Nuki Opener, so that the device could still work during power outages!

After I had the Nuki Opener, I also added a Nuki Smart Lock so that we can not only open the house front door, but also the apartment door itself in case one of us forgets their key.

The Nuki Smart Lock was easy to install and works great. It also shows with an elegant LED ring whether the door is currently locked or not, which I find handy.

## Motion Sensors

Not having to turn on lights myself is something I find convenient, in particular in the kitchen, but also in the bathroom. When carrying plates or glasses into the kitchen, it’s nice to have the lights turn on while my hands are full.

### Moved away from: Feller Motion Sensors 😐

First I tried Feller’s Motion Sensors, because they physically fit well into the existing Feller light switch installation:

But, their limitations made me move away from them quickly: while you can change one or two basic settings, you cannot, for example, disable the motion sensor after a certain time of day, or manually disable it for a certain time period.

Also, because the device is installed in a fixed position (determined by where your light switch is), it isn’t necessarily in the best place to spot all the motion you want to detect.

### Shelly Motion 👍

The Shelly Motion Sensor seems like a good motion sensor to me! It has a number of useful settings and can easily trigger any REST API endpoint or can be used via MQTT.

Like with the Shelly Button, this device has a built-in rechargeable battery that can be charged via USB. Depending on the location of the sensor, you can either attach a USB powerbank once a year, or remove the sensor from its fixture and charge it elsewhere.

The positioning of the Shelly Motion can either be easy (as it was in my kitchen) or tricky to get right (in my bathroom). I don’t know if other motion sensors are better in terms of range.

One thing to note is that the Shelly Motion only reports state changes (motion start or motion end), and no continuous events while motion is detected.

For my kitchen, my regelwerk code directly translates motion on/off into light on/off commands (to Philips Hue and Shelly 1L), with the exception that a long-press turns off all motion control for the next 10 minutes. The Shelly Motion reports the end of motion after 1 minute without motion, which works well for me for the kitchen.
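As a rough sketch, the kitchen rule could be modelled as a tiny state machine. This is not the actual regelwerk code (which talks MQTT); the class name, the method names and the injectable clock are assumptions made purely for illustration.

```python
import time
from typing import Callable, Optional

OVERRIDE_SECONDS = 10 * 60  # a long-press pauses motion control for 10 minutes


class KitchenRule:
    """Hypothetical sketch: translate motion events into light commands."""

    def __init__(self, set_light: Callable[[bool], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._set_light = set_light   # callback that would send the light command
        self._clock = clock           # injectable so the logic can be tested
        self._paused_until: Optional[float] = None

    def on_long_press(self) -> None:
        # Ignore motion events for the next 10 minutes.
        self._paused_until = self._clock() + OVERRIDE_SECONDS

    def on_motion(self, detected: bool) -> None:
        # The sensor only reports state changes, so each event maps to one command.
        if self._paused_until is not None and self._clock() < self._paused_until:
            return
        self._set_light(detected)
```

Because the sensor only reports motion start and motion end, no debouncing is needed in this sketch: every event that arrives outside the pause window becomes exactly one light command.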

For my bathroom, I don’t want the lights to immediately turn off when no motion is detected anymore, to err on the side of not turning off the light while people are still using the bathroom and are just not seen by the motion sensor. To implement that, I found that using the Shelly 1L’s timer functionality works best. So, in my configuration, motion on means lights on, and motion off means lights on for 10 minutes, then off. Turning off the light manually disables that logic.
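The delayed switch-off can be sketched in a few lines. In the setup described above the timer runs on the Shelly 1L itself; the simulation below only illustrates the intended behaviour. The class, its methods and the reading that a manual switch-off clears the pending timer are assumptions, not the actual configuration.

```python
from typing import Optional

HOLD_SECONDS = 10 * 60  # keep the light on for 10 minutes after motion ends


class BathroomRule:
    """Hypothetical sketch of the delayed-off behaviour; time is passed in explicitly."""

    def __init__(self) -> None:
        self.light_on = False
        self._off_at: Optional[float] = None  # pending switch-off time, if any

    def on_motion(self, detected: bool, now: float) -> None:
        self.light_on = True
        # Motion start cancels a pending switch-off; motion end schedules one.
        self._off_at = None if detected else now + HOLD_SECONDS

    def on_manual_off(self) -> None:
        # Assumption: a manual switch-off wins and clears the pending timer.
        self.light_on = False
        self._off_at = None

    def tick(self, now: float) -> None:
        # Called periodically; turns the light off once the hold time has passed.
        if self._off_at is not None and now >= self._off_at:
            self.light_on = False
            self._off_at = None
```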

Note that the Shelly Motion should really be mounted in the orientation recommended by the manual. When the motion sensor lies on its side (or is upside down), detection is much poorer.

## Smart Power Plug

A smart plug is an easy way to turn off a power-hungry device while you’re away, to make a lamp smart, or to power on a connected device like a kettle to boil water for making tea.

My current use-cases are saving power for the stereo sound system connected to my PC, and saving power by powering up the devices in my gokrazy Continuous Integration test environment on-demand only.

While there are tons of vendors selling smart plugs, the selection narrows considerably when you look for one with a Swiss power plug.

### HomeMatic 👎

The HomeMatic smart plug is expensive (55 CHF) and super bulky! As you can see, even if you connect it at the very end of a power strip, it still blocks the adjacent connector.

Worse: the way it’s built (bulky side pointing away from the earth pin), I can’t even insert it into 2 of the 3 power strips you see on the picture.

Somehow, even though it’s so bulky, the device feels flimsy at the same time. I’m never 100% sure if the plug is inserted fully and correctly, and it’s easy to accidentally turn off power when bumping against the smart plug with your foot.

Because it’s a HomeMatic device, you need a working Central Control Unit (CCU) to control it programmatically. Conceptually, I prefer smart plugs that can be used with a REST or MQTT API.

The only upside of this smart plug is that it can measure power. I occasionally use it for that.

### Sonoff 😐

The Sonoff S26 are much cheaper (≈12 USD when I bought mine) and come in a Swiss plug variant. Contrary to the HomeMatic ones, the Sonoff smart plugs are built “the right way around”, meaning I can plug them into many Swiss power strips. Unfortunately, they also block adjacent connectors, but at least not as many as the HomeMatic.

The Open Source firmware Tasmota supports the Sonoff S26, but flashing them is a painful experience. You can’t do it over the air; you need to access rather small serial console pins inside the device.

Once you have them flashed with Tasmota, the devices work great.

One feature they lack is power measurement.

I would love to find a smart plug with a Swiss plug that supports power measurement and is compatible with Tasmota (or has built-in MQTT support), but until that product comes along, the Sonoff S26 are what I’m going to use.

## Architecture as of March 2022

Here is an architecture diagram of the devices I’m currently using:

To tie these different systems together, I use a Raspberry Pi running gokrazy, which in turn runs my regelwerk program. regelwerk only talks to MQTT, so all the different devices are connected to MQTT using small adapter programs such as my hue2mqtt or shelly2mqtt.
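The adapter pattern described above can be illustrated with an in-memory stand-in for the MQTT broker. This is only a sketch: the `Bus` class, the topic names and the shape of the vendor event are hypothetical and not taken from regelwerk, hue2mqtt or shelly2mqtt.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class Bus:
    """In-memory stand-in for an MQTT broker (exact topic match only)."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        for handler in list(self._subs[topic]):
            handler(topic, payload)


def motion_adapter(bus: Bus, vendor_event: dict) -> None:
    # Adapter side: translate a vendor-specific event into a bus message.
    bus.publish("sensor/kitchen/motion", "on" if vendor_event["motion"] else "off")


def install_rule(bus: Bus) -> None:
    # Rule side: only sees bus topics, never the vendor protocol.
    def handle(topic: str, payload: str) -> None:
        bus.publish("light/kitchen/set", payload)

    bus.subscribe("sensor/kitchen/motion", handle)
```

The point of the design is that the rule engine never needs to know which vendor a device comes from; swapping a sensor only means swapping its adapter.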

A more off-the-shelf solution would be to use Node-RED, if you want to do a little programming, or Home Assistant if you want to do barely any programming.

## My strategy for selecting components

I don’t look for one vendor or one system that has components for everything. Instead, I choose the leading vendor in each domain. Compatibility between systems is generally poor, so I try to keep my compatibility requirements to a minimum.

To programmatically interact with the devices, the best bet is devices that are designed to be developer-friendly (e.g. Shelly devices support MQTT) or at least have an official API with modules in my favorite programming language (e.g. Philips Hue). In terms of API, I expect to talk to a gateway device in my local network. I tried talking e.g. Zigbee directly but found it inconvenient due to poor software support, sparse documentation and strange compatibility issues.

Direct device-to-device communication is nice from a reliability perspective, but on some battery-powered systems you pay for it with reduced battery runtime. For example, when using multiple light switches for the same room with IKEA TRÅDFRI, you pair one to the other, which also makes all signals go through it.

If possible, I select devices that have an open firmware available. Ideally, I can keep using the vendor’s firmware, but if the vendor unexpectedly goes out of business, it’s handy to have an alternative firmware available. Also, if the devices require a cloud service to function, using open firmware typically allows using them in your local network.

I have come to avoid WiFi where latency is important, e.g. between light switches and lights.

I stopped looking at the price too much and instead look at the user experience. Smart home is about comfort and convenience, and if a product doesn’t delight in daily usage, why bother with it? Targeting the high end of mid-range devices seems like the sweet spot to me. Avoid anything more expensive than that, though — established players often re-brand third-party solutions and you only pay for the company name, not quality.

## 2022-02-20

### michael-herbst.com

#### RWTH Julia workshop 2022

Last Thursday and Friday (17/18 February) I taught an introductory course to the Julia programming language. The course took place in virtual format and to my great surprise around 90 people from all over the world ended up joining. Luckily I had a small support team consisting of Gaspard Kemlin and Lambert Theissen (thanks!) who took care of some of the organisational aspects of running the Zoom session. Overall it was a lot of fun to spread the word about the Julia programming language to so many curious listeners, who asked interested and supportive questions.

Thanks to everyone who tuned in and thanks to everyone who gave constructive feedback at the end. I'm very much encouraged by the fact that all of you, unanimously, would recommend the workshop to your peers. In that sense: Please go spread the word as I'm already looking forward to the next occasion I'll have to teach about Julia!

## 2022-01-25

### michael-herbst.com

#### GdR nbody general meeting

About two weeks ago, from 10 till 13 Jan 2022, I was at the annual meeting of the French research group on many-body phenomena, the GDR nbody. Originally scheduled to take place in person in Toulouse, the meeting had to be switched to a virtual event on short notice due to Corona-related developments. Although I would have loved to return to Toulouse and see everyone in person, it was still an opportunity to catch up. In my talk I presented the adaptive damping approach, which Antoine Levitt and I recently developed; see the submitted article on arxiv.

## 2022-01-15

### sECuREs website

#### My 2022 high-end Linux PC 🐧

I finally managed to get my hands on some DDR5 RAM to complete my Intel i9-12900 high-end PC build! This article contains the exact component list if you’re interested in doing a similar build.

Usually, I try to stay on the latest Intel CPU generation when possible. But I decided to skip the i9-10900 (Comet Lake) and i9-11900 (Rocket Lake) series entirely, largely because they were still stuck on Intel’s 14nm manufacturing process and didn’t seem to offer much improvement.

The new i9-12900 (Alder Lake) delivered good benchmark results and is manufactured with the much newer Intel 7 process, so I was curious: would an upgrade be worth it?

## Components

| Price | Type | Article |
|---|---|---|
| 196 CHF | Case | Fractal Define 7 Solid (Midi Tower) |
| 89 CHF | Power Supply | Corsair RM750x 2018 (750 W) |
| 293 CHF | Mainboard | ASUS PRIME Z690-A (LGA1700, ATX) |
| 646 CHF | CPU | Intel Core i9-12900K |
| 113 CHF | CPU fan | Noctua NH-U12A |
| 30 CHF | Case fan | Noctua NF-A14 PWM (140 mm) |
| 770 CHF | RAM | Corsair Vengeance CMK32GX5M2A4800C40 (64 GB) |
| 408 CHF | Disk | WD Black SN850 (2 TB) |
| 605 CHF | GPU | GeForce RTX 2070 |
| 65 EUR | Network | Mellanox ConnectX-3 (10 Gbit/s) |

## Fan compatibility

The Noctua NH-U12A CPU fan required an adapter (“Noctua NM-i17xx-MP78 SecuFirm2 mounting kit”) to be compatible with the Intel LGA1700 socket. I requested the adapter on Noctua’s Website on November 5th, and it arrived November 26th.

## Fractal Define 7 case

Anytime you need to access a PC’s components, you’ll deal with its case. Especially for a self-built PC, the case you choose determines how easy it is to assemble and later modify your PC.

Over the years, I have come to value the following aspects of a PC case:

1. No extra effort should be required for the case to be as quiet as possible.
2. The case should not have any sharp corners (no danger of injury!).
3. The case should provide just enough space for easy access to your components.
4. The more support the case has to encourage clean cable routing, the better.
5. USB3 front panel headers should be included.

I have been using Fractal cases for the past few years and came to generally prefer them over other brands because of their good build quality.

Hence I’m happy to report that the Fractal Define 7 (their latest generation at the time of writing) ticks all of the above boxes!

The case and power supply work well together in terms of cable management. It was a joy to route the cables.

It’s very easy to open the case doors (they clip in place), or remove the front panel. This is definitely the best PC case I have seen so far in terms of quick and easy access.

Here’s how clean the inside looks. Most cables take a very short path to the back, where the case offers plenty of convenient cable guides:

You might also find this YouTube video review of the Fractal Define 7 interesting:

## Slow boot

When I first powered everything on, I waited for a while, but never saw any picture on my monitor. The PC eventually rebooted, multiple times in a row. I took that as a bad sign and turned it off to prevent further damage.

Turns out I should have just waited until it would eventually start up!

It took multiple minutes for the machine to eventually start. I’m not 100% sure what the cause is for that, but I heard in a Linus Tech Tips YouTube video that DDR5 requires time-consuming memory testing when powering up with a fresh memory configuration, so that seems plausible.

In any case, my advice is: be patient when waiting for this machine to start up.

## DDR5 availability as of late 2021

I originally ordered all components on November 5th 2021. It took a while for the mainboard to become available, but almost everything shipped on November 15th — except for the DDR5 RAM.

Until late December, I was not able to find any available DDR5 RAM in Switzerland.

The shortage is so pronounced that some YouTubers recommend going with DDR4 mainboards for now, which manufacturers are scrambling to introduce in their lineups. I did really want to squeeze out the last few extra percent in memory-intensive workloads, so I decided to wait.

## Copying the data

Where possible, I like only changing one thing at a time. In this case, I wanted to change the hardware, but keep using my Linux installation as-is.

To copy my Linux installation over, I plugged my old M.2 SSD into the new machine, and then started a live Linux environment, so that neither my old nor my new SSD were in use. My preferred live Linux is grml (current version: 2021.07), which I copied to a USB memory stick and booted the machine from it.

In the grml live Linux environment, I copied the full M.2 SSD contents from old to new:

    grml# dd \
      if=/dev/disk/by-id/nvme-Force_MP600_<TAB> \
      of=/dev/disk/by-id/nvme-WD_BLACK_SN850_2TB_<TAB> \
      bs=5M \
      status=progress
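Conceptually, the `dd` invocation above is a loop that reads fixed-size blocks from the source and writes them to the target. The following is a minimal sketch of that loop, not a replacement for `dd`; the function name and the progress callback are made up for illustration.

```python
from typing import BinaryIO, Callable, Optional

BLOCK_SIZE = 5 * 1024 * 1024  # dd's bs=5M means 5 MiB per read/write


def copy_blocks(src: BinaryIO, dst: BinaryIO, block_size: int = BLOCK_SIZE,
                progress: Optional[Callable[[int], None]] = None) -> int:
    """Copy src to dst block by block; return the number of bytes copied."""
    total = 0
    while True:
        block = src.read(block_size)
        if not block:          # end of input
            break
        dst.write(block)
        total += len(block)
        if progress is not None:
            progress(total)    # roughly the byte count that status=progress reports
    return total
```

Because the copy happens below the file system level, the partition table and all partitions end up on the new disk unchanged, which is why the installation can be booted as-is afterwards.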


For some reason, the transfer was super slow. Last time I transferred the contents of a Samsung 960 Pro to a Samsung 970 Pro, it took only 16 minutes. But this time, copying the Force MP600 to a WD Black SN850 took many hours!

Once the data was transferred, I unplugged the old M.2 SSD and booted the system.

The hostname remains the same, and the network addresses are tied to the MAC address of the network card that I moved to the new machine. So, I didn’t have to adjust anything in the new machine and could just boot into my usual environment.

## UEFI settings: enable XMP for 4800 MHz RAM

By default, the memory uses 4000 MHz instead of the 4800 MHz advertised on the box.

I figured it should be safe to try out the XMP option because it is shown as part of ASUS’s “EZ Mode” welcome page in the UEFI setup.

So far, I have not noticed any issues when running the system with XMP enabled.

Update February 2022: I have experienced weird crashes that seem to have gone away after disabling XMP. I’ll leave it disabled for now.

## UEFI settings: fan speed

The Fractal Define case comes with a built-in fan controller.

I recommend not using the Fractal fan controller, as you can’t control it from Linux!

Instead, I have plugged my fans into the mainboard directly.

In the UEFI setup, I have configured all fan speeds to use the “silent” profile.

## ASUS PRIME Z690-A: sensors and fan control

With Linux 5.15.11, some fan speeds and temperatures are displayed, but oddly enough it only shows 2 out of the 3 fans I have connected:

    % sudo sensors
    nct6798-isa-0290
    […]
    fan1:                        0 RPM  (min =    0 RPM)
    fan2:                      944 RPM  (min =    0 RPM)
    fan3:                        0 RPM  (min =    0 RPM)
    fan4:                      625 RPM  (min =    0 RPM)
    fan5:                        0 RPM  (min =    0 RPM)
    fan6:                        0 RPM  (min =    0 RPM)
    fan7:                        0 RPM  (min =    0 RPM)
    SYSTIN:                    +35.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
    CPUTIN:                    +40.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
    AUXTIN0:                  -128.0°C    sensor = thermistor
    AUXTIN1:                   +24.0°C    sensor = thermistor
    AUXTIN2:                   +28.0°C    sensor = thermistor
    AUXTIN3:                   +31.0°C    sensor = thermistor
    PECI Agent 0 Calibration:  +40.0°C
    […]


Unfortunately, writing to the /sys/class/hwmon/hwmon2/pwm2 file does not seem to change its value, so I don’t think one can control the fans via PWM from Linux (yet?).
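To reproduce this check, one could write a value and read it back, as in the sketch below. The helper name is made up. The `pwmN_enable` file (where `1` selects manual control) is part of the generic hwmon sysfs interface, but whether it exists and works on this particular board is an assumption.

```python
from pathlib import Path


def try_set_pwm(hwmon_dir: str, channel: int, value: int) -> bool:
    """Write a PWM duty value (0-255) and report whether it stuck."""
    if not 0 <= value <= 255:
        raise ValueError("PWM value must be between 0 and 255")
    base = Path(hwmon_dir)
    enable = base / f"pwm{channel}_enable"
    if enable.exists():
        enable.write_text("1\n")   # 1 = manual control in the hwmon sysfs interface
    pwm = base / f"pwm{channel}"
    pwm.write_text(f"{value}\n")
    # If the driver ignores the write, the value read back differs.
    return int(pwm.read_text().strip()) == value
```

Run as root, `try_set_pwm("/sys/class/hwmon/hwmon2", 2, 128)` would return `False` on a system that behaves as described above.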

I have set all fans to silent in the UEFI setup, which is sufficient to not notice any noise.

## Performance comparison: i9-9900K vs. i9-12900K

After cloning my old disk to the new disk, I took the opportunity to run a few time-intensive tasks from my day-to-day that I could remember.

On both machines, I configured the CPU governor to performance for stable results.
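The post does not say how the governor was set. One common way on Linux is to write to the cpufreq sysfs files, which the following sketch does for every CPU. The function name is hypothetical, and root privileges are needed on a real system.

```python
from pathlib import Path
from typing import Dict


def set_governor(governor: str, cpu_root: str = "/sys/devices/system/cpu") -> Dict[str, str]:
    """Write the governor for every CPU that has a cpufreq directory.

    Returns a mapping from CPU name (e.g. "cpu0") to the governor read back.
    """
    result = {}
    for path in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        path.write_text(governor + "\n")
        result[path.parent.parent.name] = path.read_text().strip()
    return result
```

Reading the value back makes it easy to spot CPUs where the write did not take effect.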

Keep in mind that I’m comparing two unique PC builds as they are (not under controlled and fair conditions), so the results might not necessarily be representative. For example, it seems like the SSD performance in the old machine was heavily degraded due to an incorrect TRIM configuration.

| name | old (i9-9900K) | new (i9-12900K) |
|---|---|---|
| build Go 1.18beta1 (src/make.bash) | ≈45s | ≈29s |
| gokrazy/rsync tests | ≈8s | ≈5s |
| gokrazy UEFI test | ≈9s | ≈8s |
| distri cryptimage (cold cache) | ≈143s | ≈18s |
| gokrazy Linux compilation | 215s | 109s |

As we can see, in all of my tests, the new PC achieves measurably better times! 🎉
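To put numbers on the improvement, the timings from the table can be turned into speedup factors (old time divided by new time). The snippet below is just that arithmetic; the dictionary layout and the function name are chosen for illustration.

```python
# Timings in seconds, copied from the table above: (old i9-9900K, new i9-12900K).
TIMINGS = {
    "build Go 1.18beta1": (45, 29),
    "gokrazy/rsync tests": (8, 5),
    "gokrazy UEFI test": (9, 8),
    "distri cryptimage (cold cache)": (143, 18),
    "gokrazy Linux compilation": (215, 109),
}


def speedups(timings):
    """Return old/new ratios, rounded to one decimal."""
    return {name: round(old / new, 1) for name, (old, new) in timings.items()}


for name, factor in speedups(TIMINGS).items():
    print(f"{name}: {factor}x faster")
```

Most tasks land between roughly 1.1x and 2x. The cryptimage result of about 7.9x is an outlier that likely reflects the degraded SSD in the old machine more than the CPU upgrade.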

## Conclusion

Not only in the benchmarks above, but also subjectively, the new machine feels fast!

Already in the first few days of usage, I have noticed that time-consuming tasks, such as tracking down a Linux kernel issue (which requires multiple Linux kernel builds), are a little less terrible thanks to the faster machine :)

The Fractal Define 7 case is great and will likely serve as a good base for upgrades over the next couple of years, just like its predecessor (but perhaps even longer).

As far as I can tell, the machine works well and is compatible with Linux.

## 2022-01-04

### RaumZeitLabor

#### Remote Season Kickoff for "RZL Käfertal, 68309", Season 2022

Dear fans of the hit show “RZL Käfertal, 68309”, happy New Year!

Hard to believe that the last season is already over and the next one is waiting in the wings.

Since the past season felt far too short and, for various reasons, was only moderately eventful, we can only hope that the series picks up full speed again in 2022 and returns to its old form.

For this reason, the Remote Season Kickoff takes place on Saturday, 8 January 2022: we will meet in Jitsi from 4 pm and talk about the upcoming storylines together.

What are you taking away from the last round of 2021: which plot twists were good, which were not? Do you have ideas for what could happen in the next (online) event episodes? With the new video equipment, nothing stands in the way of the talk episodes any more… Will there be a reunion with the popular protagonists Agenda Aktion or the Türöffner-Maus, or will entirely new ones be introduced? How will last year’s cliffhanger about the workshop continue?

I am looking forward to exchanging ideas on how we will shape the story of the 2022 season together.

## 2021-12-23

### michael-herbst.com

#### Outlook to 2022

A quick teaser to some workshops I will organise next year.

• 17/18 Feb 2022: Introduction to the Julia programming language (virtual).
In two half-day sessions I will provide a concise overview of the Julia programming language and offer some hands-on practice. The selection of exercises and small projects makes the course particularly well-suited for interdisciplinary researchers in the computational sciences, but it is free and open to everyone. Course website. Registration link.

• 20-24 Jun 2022: CECAM workshop: Error control in first-principles modelling (Lausanne, Switzerland).
In this workshop, which I organise jointly with Gábor Csányi, Geneviève Dusson and Youssef Marzouk, we plan to bring together mathematicians and simulation scientists to discuss error control and error estimation in first-principles simulations, an aspect which to date has seen too little attention in our opinion. We want to bring together experts on numerical analysis and uncertainty quantification on the one hand and researchers working on electronic-structure and molecular-dynamics methods on the other to identify promising directions of research to make progress in this topic. Website and registration.

• 29-31 Aug 2022: DFTK school: Numerical methods for density-functional-theory simulations (Paris, France).
Antoine Levitt, Eric Cancès and I will organise an interdisciplinary summer school next year, centred around our joint work on DFTK and numerical developments in density-functional theory (DFT). With the school we want to bridge the divide between simulation practice and fundamental research in electronic-structure methods: It is intended both for researchers with a background in mathematics and computer science interested in learning the numerics of DFT and for physicists or chemists interested in modern software development methodologies and the mathematical background of DFT. Course website. Registration link.

## 2021-12-13

### michael-herbst.com

#### GdR REST Discussion meeting on Machine Learning

Last week on 9th and 10th December 2021 I participated in the Discussion meeting on Machine Learning organised by the French research group REST, which is centred around theoretical spectroscopy in solids and molecules. While most participants joined remotely I was fortunately able to travel to École Polytechnique in Palaiseau (near Paris). This gave me the opportunity to interact with some of the speakers and local organisers. Since to date I have not yet taken a detailed look at applying machine learning in chemistry and materials science I took the chance to discuss with both practitioners and the other on-site speakers during the breaks and the social dinner. Overall this meeting has been extremely helpful and I feel I managed to get a good impression of the challenges and current research in this exciting topic. I am very grateful to the organisers Francesco Sottile and Jack Wetherell for the invitation and I already look forward to my next interaction with the GdR REST.

In my talk I gave an introduction to algorithmic differentiation (AD) approaches and their application in DFTK as well as density-functional theory simulations in general. I motivated our work both from data-driven approaches for the design of novel DFT functionals as well as the computation of properties, sensitivities and uncertainties. Summarised in one sentence the key advantage of getting a code algorithmically differentiable (AD-able) is to be able to automatically compute derivatives of arbitrary output quantities (band gaps, forces, ...) with respect to arbitrary input quantities (pseudo parameters, XC parameters, positions, temperature, ...) within an acceptable computational cost and without the need to code analytical gradients.

AD approaches are not new in the electronic-structure context. However, the successful existing AD-able codes are either centred around simplified settings (e.g. 1D systems) or Gaussian basis sets (thus primarily molecular systems). In contrast, our focus in DFTK is on solid-state systems. In particular for cases with vanishing band gaps (e.g. metals) this setting is more involved and one needs to be overall a bit more careful in the implementation. Another distinction from previous efforts is that our implementation in DFTK has not been written from scratch just for AD. Effectively the ability to make DFTK AD-able with relatively little effort is a side effect of our flexible design as well as our seamless integration with the composable Julia package ecosystem. To emphasise this let me mention that the largest part of the work I presented has been achieved in only 12 weeks by our excellent Google Summer of Code student Niklas Schmitz (Thanks very much Niklas!).

To give a practical demonstration I showed how to use forward-mode algorithmic differentiation to (a) compute polarisabilities, (b) the variation of the dipole moment with respect to changing parameters in the exchange functional and (c) a work-in-progress example using adjoint-mode differentiation. As usual my slides are attached below.

## 2021-12-05

### sECuREs website

#### Fixing the Logitech MX Ergo Trackball mouse buttons

The mouse I use daily for many hours is Logitech’s MX Ergo trackball and I generally consider it the best trackball one can currently buy.

Unfortunately, after only a year or two of usage, the trackball’s mouse buttons no longer function correctly. When clicking and dragging, they won’t hold down the selection reliably.

The mouse buttons first broke in my private trackball, and later also the ones in my work one!

After just buying a new one when the mouse buttons broke the first time, I figured this time I wanted to try and fix the trackball myself.

## Video recording

In this 27 minute video, you can look over my shoulder as I swap out the worn-out Omron mouse buttons with Kailh replacement mouse buttons:

The basic steps are:

1. Unscrew the outside Torx screws.
2. Unscrew the inside Phillips screws.
3. Remove the PCB from the case and fix it securely for desoldering.
4. Desolder the switch: heat up all 3 pads as simultaneously as possible (add more solder → more flux!), then gently push down on the pins to make the switch fall out.
5. Cleanly remove all remaining solder, then insert the replacement switch, double-check you aligned it well on the PCB, and solder it.
6. Put everything back together.

## Replacement switches: Kailh GM 8.0

The replacement mouse buttons I’m using are Kailh GM 8.0 from the Kailh Official Store on AliExpress, which are advertised as “ultra high life”. Even if their life span is also only a few years, I bought enough of them to probably replace them another 2 to 3 times per trackball.

The Kailh mouse buttons behave very similarly to the original Omron mouse buttons. The click is very satisfying now, and reminds me of a brand-new Logitech MX Ergo trackball. I wouldn’t call the Kailh ones better than the Omron ones, but maybe others notice a difference?

One interesting side note: I noticed that when wearing noise canceling headphones, it was very hard to tell the worn-out Omron mouse buttons from the Kailh mouse buttons. The difference really is mostly in the sound, not in the feel when pressing the button down!

## Why is the MX Ergo so unreliable?

There is a 1-hour video by Alex Kenis saying that Logitech switched from 5V to 3.3V logic voltages, and this violates the minimum electrical condition for the Omron D2FC-F, which causes oxidation.

Indeed, when I merely opened the switches and cleaned them up with a screwdriver, this seemed to help. But, opening everything up is so fiddly that one might as well solder in new switches altogether :)

## 2021-11-28

### sECuREs website

#### MacBook Air M1: the best laptop?

You most likely have heard that Apple switched from Intel CPUs to their own, ARM-based CPUs.

Various early reviews touted the new MacBooks, among the first devices with the ARM-based M1 CPU, as the best computer ever. This got me curious: after years of not using any Macs, would an M1 Mac blow my mind?

In this article, I share my thoughts about the MacBook Air M1, after a year of occasional usage.

## Energy efficiency

The M1 CPU is remarkably energy-efficient. This has two notable effects:

1. The device does not have a fan, and stays absolutely quiet. This is pretty magical, and I now notice my ThinkPad’s fan immediately.
2. The battery lasts many hours, even with demanding use-cases like video conferencing.

When it comes to energy efficiency, Apple sets the bar. All other laptops should be fanless, too! And the battery life really is incredible: taking notes in Google Docs (via WiFi) while at a conference for many hours left me with well over 80% of battery at the end of the day!

I briefly lent the computer to someone and got it back with a VPN client installed. The battery life was considerably shortened by that VPN client and recovered once I uninstalled it. So if you’re not seeing great battery life, maybe a single program is ruining your experience.

The fast wakeup feature that was heavily stressed during the initial introduction (to some ridicule) is actually pretty nice! I now notice having to wait for my ThinkPad to wake up.

Battery life during standby is great, too. Anecdotally, when leaving my ThinkPad lying around, it never survives until I plug it in again. The MacBook survives every single time.

Now, given that Apple controls the entire machine, does that mean they now offer features that other computers cannot offer yet?

My personal bar for this question is whether a computer can be used with my bandwidth-hungry 8K monitor, and the disappointing news is that the MacBook Air M1 cannot drive the 8K monitor with its 7680x4320 pixels resolution (at 60 Hz, using 2 DisplayPort links), not even with an external USB-C dock.

Maybe future hardware generations add support for 8K displays, but for my day-to-day, Apple’s complete control doesn’t improve anything.
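To make the bandwidth requirement concrete, here is a rough back-of-the-envelope sketch. It assumes 24 bits per pixel, ignores blanking intervals and compression, and assumes each link is a 4-lane DisplayPort 1.4 (HBR3) connection with about 25.92 Gbit/s of payload after 8b/10b encoding. None of these figures come from the article itself, and the function names are made up for illustration.

```python
import math

# Assumption: DisplayPort 1.4 HBR3, 4 lanes x 8.1 Gbit/s = 32.4 Gbit/s raw,
# of which 8/10 remains as payload after 8b/10b line encoding.
HBR3_PAYLOAD_GBIT = 32.4 * 8 / 10


def uncompressed_gbit_per_s(width, height, refresh_hz, bits_per_pixel=24):
    """Raw pixel data rate in Gbit/s, ignoring blanking and compression."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def links_needed(required_gbit, per_link_gbit=HBR3_PAYLOAD_GBIT):
    """Number of links needed to carry the given data rate."""
    return math.ceil(required_gbit / per_link_gbit)


required = uncompressed_gbit_per_s(7680, 4320, 60)
print(f"{required:.2f} Gbit/s -> {links_needed(required)} link(s)")
```

Under these assumptions, the pixel data alone comes to roughly 48 Gbit/s, which is more than a single such link can carry and would explain why the monitor is driven over 2 DisplayPort links.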

## Built-in peripherals

The screen is great! Everything looks sharp, colors are vibrant and brightness is good.

As usual, the touchpad (which Apple calls “trackpad”) is great, much better than any touchpad I have ever used on a PC laptop. Apple trackpads have had this advantage for as long as I have known them, and I don’t know why PC touchpads don’t seem to get any better? 🤔

Apple brought back their scissor mechanism keyboards, which is a very welcome change. I have witnessed so so many problems with the old butterfly mechanism keyboards.

This first MacBook Air M1 model has no MagSafe. Apple added MagSafe in the MacBook Pro M1 in late 2021. I hope they’ll eventually expand MagSafe to all notebooks.

## Peripherals: not enough ports

Staying in peripheral-land, let me first state that this MacBook’s 2 USB-C ports are not enough!

When working on the go, after plugging in power, I can plug in a wired ethernet adapter (wireless can be spotty), but then won’t have any ports left for my ergonomic keyboard and mouse.

For video conferencing, I can plug in power (to ensure I won’t run out of battery), connect a table microphone, but won’t have any ports left for a decent webcam. This is particularly annoying because this MacBook’s built-in webcam is really bad, and the main reason why reviewers don’t give the MacBook a perfect score (example review on YouTube).

So, in practice, you need to carry a USB-C dock, or at least a USB hub, with your laptop when you anticipate possibly needing any peripherals. #donglelife

## Not enough RAM for local software development

Hardware-wise, the biggest pain point for software developers is the small amount of RAM: both the MacBook Air M1 and the MacBook Pro M1 (13") can be configured with up to 16 GB of RAM. Only the newer MacBook Pro M1 14" or 16" (introduced late 2021) support more RAM.

To be clear, 16 GB RAM is enough to do software development in general, but it can quickly become limiting when you deal with larger programs or data sets.

In my ThinkPad, I have 64 GB of RAM, which allows for a lot more VMs, large index data structures, or just plenty of page cache. With the ThinkPad, I don’t have to worry about RAM.

Of course, there are strategies around this. Maybe your projects are large enough to warrant maintaining a remote build cluster, and you can run your test jobs in a staging environment. The MacBook makes for a fine thin client — provided your internet connection is fast and stable.

## Operating System: macOS

I am talking about Operating Systems at a very high level in this section. Many use-cases will work fine, regardless of the Operating System one uses. I can typically get by with a browser and a terminal program.

So, this section isn’t a nuanced or fair review or critique of macOS or anything like that, just a collection of a few random things I found notable while playing with this device :)

My favorite way to install macOS is Internet Recovery. You can install a blank disk in your Mac and start the macOS installer via the internet! The Mac will even remember your WiFi password. The closest thing I know in the PC world is netboot.xyz, and that needs to be installed in your local network first.

Similarly, Apple’s integration when using multiple devices seems pretty good. For example, the Mac will offer to switch to your iPhone’s mobile connection when it loses network connectivity.

But, just like in all other operating systems, there is plenty in macOS to improve.

For example, software updates on the Mac still take 30 minutes (!) or so, which is entirely unacceptable for such a fast device! In particular, Apple seems to be (partially?) using immutable file system snapshots to distribute their software, so I don’t know why distri can install and update so much faster.

Speaking of Operating System shortcomings, I have observed how APFS (the Apple File System) can get into a state in which it cannot be repaired, which I found pretty concerning! Automated and frequent backups of all on-device data are definitely a must.

Slow software updates are annoying, and having little confidence in the file system makes me uneasy, but what’s really a dealbreaker is that my preferred keyboard layout does not work well on macOS: see Appendix A: NEO keyboard layout.

## Linux? 🐧

So given my preference for Linux, could I just use Linux instead?

Unfortunately, while Asahi Linux is making great progress in bringing Linux to the M1 Macs, it seems like it’ll still be many months before I can install a Linux distribution and expect it to just work on the M1 Mac.

Until then, check out the Asahi Linux Progress Report blog posts!

## Intel to M1 architecture transition

Apple developed the Rosetta 2 dynamic binary translator which transparently handles non-M1 programs, and so far it seems to work fine! All the things I tried just worked, and architecture never seemed to play a role during my usage.

## Conclusion

The MacBook Air M1 is indeed impressive! It’s light, silent, fast and the battery life is amazing. If these points are the most important to you in a laptop, and you’re already in the Mac ecosystem, I imagine you’ll be very happy with this laptop.

But is the M1 really so mind-blowing that you should switch to it no matter what? No. As a long-time Linux user who is primarily developing software, I prefer my ThinkPad X1 Extreme with its plentiful peripheral connections and lots of RAM.

I know it’s not an entirely fair comparison: I should probably compare the ThinkPad to the newer MacBook Pro models (not MacBook Air). But I’m not a professional laptop reviewer, I can only speak about these 2 laptops that I found interesting enough to personally try.

## Appendix A: NEO keyboard layout

The macOS implementation of the NEO keyboard layout has a number of significant incompatibilities/limitations: its layer 3 does not work correctly. Layer 3 contains many important common characters, such as / (Mod3 + i, i.e. Caps Lock + i) or ? (Mod3 + s).

I installed the current neo.keylayout file (2019-08-16) as described on the NEO download page.

In order to make / and ? work in Google Docs, I had to enable the additional Karabiner rule “Prevent all layer 3 keys from being treated as option key shortcut” (see also: this GitHub issue)

I encountered the following issues, ordered by severity:

Issue 1: I cannot use Emacs at all! I installed the emacsformacosx.com version (also tried homebrew), but cannot enter keys such as / or ?. Emacs interprets these as M-u instead.

The Karabiner rule “Prevent all layer 3 keys from being treated as option key shortcut” that fixed this issue in Google Docs does not help for Emacs. Removing it from Karabiner changes behavior, but Emacs still recognizes M-i instead of /, so it’s broken with or without the rule.

Issue 2: In the Terminal app, I cannot enable the “Use Option as Meta key” keyboard option, otherwise all layer 3 keys function as meta shortcuts (M-i) instead of key symbols (/).

I commonly use the Meta key to jump around word-wise: Alt+b / Alt+f on a PC. Since I can’t use Option + b / Option + f on a Mac, I need to use Option + arrow keys instead, which works.

Since the Option key does not work as Meta key, I need to press (and release!) the Escape key instead. This is pretty inconvenient in Emacs in a terminal.

Issue 3: In Gmail in Chrome, the search keyboard shortcut (/) is not recognized.

I reported this problem upstream, but there seems to be no solution.

I’m not sure why these programs don’t work well with NEO. I tried BBEdit for comparison, and it had no trouble with (macOS-level) shortcuts such as command + / and option + command + /.

On Linux, the NEO layout works so much better. I’m really not in the mood to continuously fight with my operating system over keyboard input and shortcuts.

## 2021-11-20

### RaumZeitLabor

#### Habemus hygiene concept

Almost a year after moving in, we have decided to open the new RZL to visitors.

Unfortunately not with a rousing housewarming party, but we will make up for that as soon as we can. Promise!

You can find our current hygiene concept, which we will adapt as needed, here.

In short: 2G, proof required, check-in via CWA, self-test before visiting, and wearing a mask is recommended.

Please do not come if you feel ill or know of a possible contact with a COVID-positive person! Protect yourselves and us!

Additional rules may apply at events, so it is best to check the Events section of the website right before your visit, or contact the board if anything is unclear.

If you want to come to the RZL for the first time, Tuesday evening with the “Offene RaumZeitLaborierung” is once again the recommended choice. To avoid finding the door locked, you should still let us know briefly in advance if possible.

## 2021-11-03

### michael-herbst.com

#### Surrogate models for quantum spin systems based on reduced order modeling

The simulation of quantum spin models is an actively researched field. Albeit rather basic, these many-body systems are inherently strongly correlated and as such feature a rich variety of phenomena, including involved patterns of ordering / disordering, topological order or varieties of phase changes. Furthermore, these models often provide a good approximation to the low-temperature regime of real physical systems, justifying their detailed study. One approach is to consider parametrised quantum spin models as a low-complexity proxy for real systems and use them to understand which parameter values (e.g. which spin coupling strengths) lead to interesting behaviours. From this one can deduce inversely how novel materials ought to be designed in order to probe and study these behaviours experimentally.

In a recent work my mentor Benjamin Stamm and I teamed up with Stefan Wessel (RWTH physics department) and Matteo Rizzi (Universität Köln, Forschungszentrum Jülich) to work on cheap surrogate models for accelerating the study of such parametrised quantum spin models. Our key assumption is that the Hamiltonian of these models as well as the deduced quantities of interest (e.g. the structure factor) can be decomposed affinely in the parameters. For many standard models this is indeed the case. Exploiting the affine structure of the Hamiltonian, our approach constructs a reduced-basis surrogate, which effectively represents the full problem in a basis of the exact solutions at a carefully chosen set of parameter values. As we demonstrate for two examples (a chain of Rydberg atoms as well as a sheet of coupled triangles), relatively small reduced bases, which are orders of magnitude smaller than the dimensionality of the Hilbert space, accumulate sufficient information to reproduce key quantities of interest over the full parameter domain to an absolute error of 10⁻⁴ or less.
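The idea of the reduced-basis surrogate can be written down in a few lines. The following is only a schematic sketch in generic notation: the symbols $\theta_q$, $H_q$, $B$, $h$ and $\varphi$ are assumptions chosen here for illustration and need not match the notation of the paper.

```latex
\begin{align}
  % Affine decomposition of the parametrised Hamiltonian (assumed notation)
  H(\mu) &= \sum_{q=1}^{Q} \theta_q(\mu)\, H_q \\
  % Reduced basis: n orthonormalised ground-state snapshots, with n \ll N
  B &= \operatorname{orth}\bigl[\Psi(\mu_1), \ldots, \Psi(\mu_n)\bigr]
      \in \mathbb{C}^{N \times n} \\
  % Reduced Hamiltonian: the small matrices B^\dagger H_q B are computed once
  h(\mu) &= B^\dagger H(\mu)\, B
          = \sum_{q=1}^{Q} \theta_q(\mu)\, B^\dagger H_q B \\
  % Surrogate ground state from the small n x n eigenproblem
  h(\mu)\, \varphi(\mu) &= \lambda(\mu)\, \varphi(\mu),
  \qquad \Psi_{\mathrm{rb}}(\mu) = B\, \varphi(\mu)
\end{align}
```

The point of the affine structure is that the projected matrices $B^\dagger H_q B$ do not depend on $\mu$. Once they are stored, evaluating the surrogate at a new parameter value only involves $n \times n$ matrices rather than the full Hilbert space dimension $N$.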

For me this was the first time working with quantum spin models. Even more so, I enjoyed this interdisciplinary collaboration and diving into a new subject in the discussions we had. While working on this paper we identified a number of possibilities for future work. In fact, a number of the problems typically encountered when numerically modelling quantum spin models (e.g. due to highly degenerate ground states or issues with the iterative eigensolvers) are closely related to the challenges of modelling difficult quantum-chemical systems.

The full abstract of our paper reads

We present a methodology to investigate phase-diagrams of quantum models based on the principle of the reduced basis method (RBM). The RBM is built from a few ground-state snapshots, i.e., lowest eigenvectors of the full system Hamiltonian computed at well-chosen points in the parameter space of interest. We put forward a greedy-strategy to assemble such small-dimensional basis, i.e., to select where to spend the numerical effort needed for the snapshots. Once the RBM is assembled, physical observables required for mapping out the phase-diagram (e.g., structure factors) can be computed for any parameter value with a modest computational complexity, considerably lower than the one associated to the underlying Hilbert space dimension. We benchmark the method in two test cases, a chain of excited Rydberg atoms and a geometrically frustrated antiferromagnetic two-dimensional lattice model, and illustrate the accuracy of the approach. In particular, we find that the ground-state manifold can be approximated to sufficient accuracy with a moderate number of basis functions, which increases very mildly when the number of microscopic constituents grows --- in stark contrast to the exponential growth of the Hilbert space needed to describe each of the few snapshots. A combination of the presented RBM approach with other numerical techniques circumventing even the latter big cost, e.g., Tensor Network methods, is a tantalising outlook of this work.

## 2021-11-02

### michael-herbst.com

#### Quantum Chemistry Common Driver and Databases (QCDB) and Quantum Chemistry Engine (QCEngine): Automation and Interoperability among Computational Chemistry Programs

As part of my previous work on the adcc code for computational spectroscopy based on the algebraic-diagrammatic construction (ADC), we also integrated the package with QCEngine. This package aims at integrating different quantum-chemistry codes under a common interface for end users, which is an effort I fully support. Recently the design and structure of QCEngine and the related QCDB packages have been summarised in a publication. Its full abstract reads:

Community efforts in the computational molecular sciences (CMS) are evolving toward modular, open, and interoperable interfaces that work with existing community codes to provide more functionality and composability than could be achieved with a single program. The Quantum Chemistry Common Driver and Databases (QCDB) project provides such capability through an application programming interface (API) that facilitates interoperability across multiple quantum chemistry software packages. In tandem with the Molecular Sciences Software Institute and their Quantum Chemistry Archive ecosystem, the unique functionalities of several CMS programs are integrated, including CFOUR, GAMESS, NWChem, OpenMM, Psi4, Qcore, TeraChem, and Turbomole, to provide common computational functions, i.e., energy, gradient, and Hessian computations as well as molecular properties such as atomic charges and vibrational frequency analysis. Both standard users and power users benefit from adopting these APIs as they lower the language barrier of input styles and enable a standard layout of variables and data. These designs allow end-to-end interoperable programming of complex computations and provide best practices options by default.

## 2021-10-14

### michael-herbst.com

#### A robust and efficient line search for self-consistent field iterations

In an ongoing effort with Antoine Levitt, our aim is to develop reliable density-functional theory (DFT) methods for computational materials design. Recently we looked into a strategy to automatically select the damping parameter for the self-consistent field (SCF) iterations. Our adaptive damping approach is based on a theoretically sound quadratic model for the DFT energy, which is used to fix the step size (damping) adaptively along the search directions suggested by an underlying algorithm (such as Pulay mixing, Kerker mixing, etc.). Our algorithm is fully automatic, i.e. an a priori damping selection is no longer required. In our work we successfully test our method on a range of challenging systems including supercells, transition-metal alloys and metallic surfaces. Overall our study shows that adaptive damping provides superior robustness over the traditional fixed-damping approach.

As I have reported in previous blog articles, and as we also discussed in our previous publication on black-box mixing strategies for inhomogeneous systems, the main motivation of our work is to design numerical methods that are parameter-free and automatically adapt to the simulated material. In modern simulation scenarios, where millions of DFT calculations are required in order to generate training data or screen over large design spaces, robustness and automation are the key requirements. Often it is in fact not the computational time of the individual calculations that limits overall throughput. Rather, it is the human factor, i.e. the human time required to set up, check and verify computations.

Clearly, at the level of millions of calculations, computational parameters can no longer be selected manually. Instead, elaborate heuristics are employed to select basis set size, k-point sampling, SCF algorithm or the damping parameter. In case a calculation fails, heuristics are also employed for automatic restart. However, this approach is far from perfect, and even an optimistic 1% failure rate easily equals thousands of calculations that require human attention. With our work (both the previous paper as well as this one) we want to replace heuristic approaches to parameter selection by algorithms that employ a mixture of mathematical and physical insight to automatically adapt to the simulation at hand. As we demonstrate in this work, such algorithms might be associated with an increased effort compared to the best possible parameter setting, but they also make calculations overall more robust. Therefore one (a) saves on the repeated effort to find a suitable parameter set by trial and error and (b) reduces the fraction of calculations that need to be considered by a human. Overall, the maximally attainable throughput can therefore be expected to increase with such a robust scheme, despite the fact that an individual calculation might be more costly.

In this work in particular we considered the question of choosing the damping parameter. Our adaptive damping approach is based on constructing an approximate quadratic model for the DFT energy and using this model within a line search procedure. Since this procedure is associated with an additional cost, we only employ it in case the proposed SCF step would increase either the DFT energy or the SCF residual. Notably, our approach introduces no changes to the SCF in case each SCF step proposed by the mixing procedure is already perfect (i.e. energy or residual decreasing). Therefore adaptive damping can be considered a safeguard, which only comes into play if the proposed steps are noisy or erroneous. Adaptive damping is by construction orthogonal to any existing mixing and convergence acceleration technique for DFT methods, and in our work we demonstrate that it integrates readily into an Anderson-accelerated SCF for various challenging systems. Overall we managed to increase performance and robustness at only a minor extra cost. The full abstract of our paper reads

We propose a novel adaptive damping algorithm for the self-consistent field (SCF) iterations of Kohn-Sham density-functional theory, using a backtracking line search to automatically adjust the damping in each SCF step. This line search is based on a theoretically sound, accurate and inexpensive model for the energy as a function of the damping parameter. In contrast to usual SCF schemes, the resulting algorithm is fully automatic and does not require the user to select a damping. We successfully apply it to a wide range of challenging systems, including elongated supercells, surfaces and transition-metal alloys.
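To illustrate the general idea of a backtracking line search driven by a quadratic model, here is a minimal toy sketch. It is not the algorithm from the paper: it uses a textbook quadratic interpolation through the energy and slope at zero damping and the energy at the trial damping, it only checks the energy (not the residual), and it runs on a scalar toy function instead of a DFT energy. All function and parameter names are hypothetical.

```python
def quadratic_step(e0, slope, alpha_trial, e_trial):
    """Minimiser of the quadratic through E(0)=e0, E'(0)=slope, E(alpha_trial)=e_trial.

    Returns None if the fitted quadratic has no minimum (non-positive curvature).
    """
    curvature = 2.0 * (e_trial - e0 - slope * alpha_trial) / alpha_trial**2
    if curvature <= 0.0:
        return None
    return -slope / curvature


def adaptive_damping(energy, x, direction, slope,
                     alpha_trial=1.0, alpha_min=1e-3, max_backtracks=10):
    """Return a damping alpha for the update x + alpha * direction.

    The proposed step is accepted unchanged if it lowers the energy.
    Only otherwise is the damping reduced using the quadratic model.
    """
    e0 = energy(x)
    alpha = alpha_trial
    for _ in range(max_backtracks):
        e_trial = energy(x + alpha * direction)
        if e_trial < e0:
            return alpha  # step is fine, no safeguard needed
        alpha_model = quadratic_step(e0, slope, alpha, e_trial)
        if alpha_model is None:
            alpha_model = 0.5 * alpha  # fall back to plain halving
        # Always shrink by at least a factor of two, but never below alpha_min
        alpha = max(alpha_min, min(alpha_model, 0.5 * alpha))
    return alpha


def toy_energy(x):
    """Toy stand-in for the energy: a simple parabola with its minimum at 0."""
    return x * x


# An overshooting step from x = 1: the full step lands at x = -3 (energy 9 > 1).
# Slope along the step at alpha = 0 is E'(x) * direction = 2 * 1 * (-4) = -8.
damped = adaptive_damping(toy_energy, x=1.0, direction=-4.0, slope=-8.0)

# A well-behaved step from x = 1: the full step lands exactly at the minimum.
undamped = adaptive_damping(toy_energy, x=1.0, direction=-1.0, slope=-2.0)

print(damped, undamped)
```

In the first call the full step increases the energy, so the model is used to shrink the damping. In the second call the full step already decreases the energy and is returned untouched, mirroring the safeguard behaviour described above.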

## 2021-08-28

### michael-herbst.com

#### Q-Chem 5 paper

About two years ago I integrated my open-source ctx library into the Q-Chem quantum-chemistry software suite. Quickly ctx became part of the core stack for managing computational results inside Q-Chem. In particular inside the ccman and adcman modules, which are responsible for most of the coupled-cluster and algebraic-diagrammatic construction methods available in Q-Chem, ctx is widely used.

In a recently published paper by all the Q-Chem authors, the developments inside the Q-Chem package leading up to major version 5 of the software are now summarised. The full abstract reads

This article summarizes technical advances contained in the fifth major release of the Q-Chem quantum chemistry program package, covering developments since 2015. A comprehensive library of exchange-correlation functionals, along with a suite of correlated many-body methods, continues to be a hallmark of the Q-Chem software. The many-body methods include novel variants of both coupled-cluster and configuration-interaction approaches along with methods based on the algebraic diagrammatic construction and variational reduced density-matrix methods. Methods highlighted in Q-Chem 5 include a suite of tools for modeling core-level spectroscopy, methods for describing metastable resonances, methods for computing vibronic spectra, the nuclear–electronic orbital method, and several different energy decomposition analysis techniques. High-performance capabilities including multithreaded parallelism and support for calculations on graphics processing units are described. Q-Chem boasts a community of well over 100 active academic developers, and the continuing evolution of the software is supported by an "open teamware" model and an increasingly modular design.

## 2021-08-28

### sECuREs website

#### Silent HP Z440 workstation: replacing noisy fans

Since March 2020, I have been using my work computer at home: an HP Z440 workstation.

When I originally took the machine home, I immediately noticed that it’s quite a bit louder than my other PCs, but only now did I finally decide to investigate what I could do about it.

## Finding all the fans

I first identified all fans, both by opening the chassis and looking around, and by looking at the HP Z440 Maintenance and Service Guide, which contains this description:

Specifically, I identified the following fans:

• “1 Fan”, a 92mm rear fan, sucking air out of the back of the chassis.
• “5 Memory fans”, two 60mm fans in a custom HP plastic enclosure that are positioned directly above the DIMM slots to the left and right of the CPU.
• “6 CPU Heat sink”, a 92mm fan on top of a heat sink.
• “11 Rear System Fan”, a 92mm front (!) fan, pulling air into the front of the chassis.
• My aftermarket nVidia GeForce GPU has 3 fans on a massive heat sink.
• The power supply has a fan, too, which I will not touch.

## Memory fans

The Z440 comes with a custom HP plastic enclosure that is put over the CPU cooler, fastened with two clips at opposite ends, and positions two small 60mm fans above the DIMM banks.

This memory fan plastic enclosure is a pain to find anywhere. It looks like HP is no longer producing it.

The enclosure plugs into the mainboard with a custom connector that is directly wired up to the fans, meaning it’s a pain to replace the fans.

Luckily, while shopping around for an enclosure I could modify, I realized that memory fans are only required when installing more than 4 DIMM modules!

My machine “only” has 64 GB of RAM, in 4 DIMM modules, and I don’t intend to upgrade anytime soon, so I just unplugged the whole memory fan enclosure and removed it from the chassis.

The UEFI firmware does not complain about the memory fans missing (contrary to the rear fan!), and this simple change alone makes a noticeable difference in noise levels.

## GPU fans

nVidia GPUs can be run at different “PowerMizer” performance levels:

Many years ago, I ran into lag when using Chrome that went away as soon as I switched my nVidia GPU’s Preferred Mode to “Prefer Maximum Performance” instead of “Auto” or “Adaptive mode”.

It turns out that nowadays, that is no longer a problem, so running at Prefer Maximum Performance is no longer necessary.

Worse, pinning the GPU at the highest Performance Level means that it produces more heat, resulting in the fans having to spin up more often, and run for longer durations.

But, even after switching to Auto, resulting in Adaptive mode being chosen, I noticed that my GPU was stuck at a higher PowerMizer level than I thought it should be.

An easy fix is to limit the GPU to a certain PowerMizer level, and ideally not the lowest level (level 0). For me, one level after that (level 1) seems to result in no slow-down during my typical usage.

I followed this blog post to limit my GPU to PowerMizer level 1, i.e. I added /etc/modprobe.d/nvidia-power-save.conf with the following contents:

```
options nvidia NVreg_RegistryDwords="OverrideMaxPerf=0x2"
```

…followed by a rebuild of my initramfs (update-initramfs -u) and a reboot.

This way, the fans don’t typically need to spin up as the GPU stays below its temperature limit.
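Since a typo in the modprobe file is easy to make, a small sanity check can help before rebooting. The following sketch parses an `options nvidia …` line and returns the registry entries it contains. The helper name is made up, and the assumption that multiple entries are separated by semicolons is mine, not something stated in this article.

```python
def parse_registry_dwords(options_line):
    """Extract the NVreg_RegistryDwords entries from a modprobe options line.

    Assumption: entries are 'Key=Value' pairs separated by semicolons,
    with values written as integers (e.g. hexadecimal like 0x2).
    """
    marker = 'NVreg_RegistryDwords="'
    start = options_line.index(marker) + len(marker)
    end = options_line.index('"', start)

    entries = {}
    for pair in options_line[start:end].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        entries[key.strip()] = int(value.strip(), 0)  # base 0 accepts 0x prefixes
    return entries


line = 'options nvidia NVreg_RegistryDwords="OverrideMaxPerf=0x2"'
print(parse_registry_dwords(line))
```

If the line is malformed (for example a missing closing quote), `str.index` raises a `ValueError`, which is the desired outcome for a pre-reboot check.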

## Rear and front fans

With the memory fans and GPU fans out of the way, two easy to check fans remain: the rear fan and front fan. These are 92mm in size, the model number is Foxconn PVA092G12S.

I unplugged both of them to see what effect these fans have on the noise level, and the difference was significant!

Unfortunately, unplugging isn’t enough: the UEFI firmware complains on boot when the rear fan is not connected, requiring you to press Enter to boot. Also, the machine seems to get a few degrees Celsius hotter inside without the front and rear fans, so I don’t want to run the machine without these fans for an extended period of time.

I ordered two Noctua NF-A9x14 PWM fans (for about 25 CHF each) to replace the stock front and rear fans.

Unfortunately, HP uses a custom 4-pin fan connector on its Z440 mainboard! Luckily, modifying the connector of the Noctua Low-Noise Adapter cable to fit on the custom 4-pin connector is as simple as using a knife to remove the connector’s guard rails:

## CPU fan

For the CPU fan, HP again chose to use a custom (6-pin) connector.

On the web, I read that the Z440 CPU fan is quite efficient and not worth replacing. This matches my experience, so I kept the standard Z440 CPU cooler.

## Conclusion

I was quite happy to discover that I could just unplug the memory fans, and configure my GPU to make less noise. Together with replacing the front/rear fans with Noctua ones, the machine is much quieter now than before!

One downside of workstation-class hardware is that manufacturers (at least HP) like to build custom parts and solutions. Using their own fan connectors instead of standard connectors is such a pain! I’ll be sure to stick to standard PC hardware :)

## 2021-08-04

### michael-herbst.com

#### JuliaCon BoF discussion session: Building a Chemistry and Materials Science Ecosystem

The second event I co-organised at this year's JuliaCon (see this article for the other) was a Birds of a Feather (BoF) discussion session titled Building a Chemistry and Materials Science Ecosystem in Julia. In this session Rachel Kurchin and I wanted to gather the various stakeholders working on Julia codes for chemistry and materials simulations, discuss possible overlaps and plan future joint efforts.

This was the first time a meeting dedicated to this scientific field was held within the Julia community, so we were quite curious about who would turn up. In the end we had a pretty mixed crowd consisting of Julia users tackling research problems in chemistry and materials as well as plenty of maintainers of various Julia packages related to the field, and some veteran Julia users joined the discussion, too. This mix of people provoked a rather rich and lively debate about the perspectives of Julia in this field, and the 90 minutes given to us passed almost in an instant.

A central discussion point within the session was the need for joint interfaces shared amongst the key packages of the ecosystem, both to leverage Julia's unique composability between the various packages and to enhance interoperability and provide a good user experience. As many pointed out during the session, a good first step is the design of an interface for representing the structure of the chemical system or the material to be studied. In particular, this would allow us to develop unified approaches to share data between packages, set up calculations and plainly compare different approaches. Additionally, annoying aspects such as file parsing, data export, plotting or other post-processing could then be implemented once using the general interface and used by everyone in the Julia community. Naturally a time slot of 90 minutes is just about sufficient to get the discussion started and scratch the surface, so the session has not yet yielded anything conclusive. However, following the conference the debate has definitely intensified amongst participants, and I would not be surprised if some progress is made.

In case you are interested in participating in these developments or plainly want to get in touch with Julia users and developers from chemistry, molecular or materials science, here are a number of relevant resources: