Planet NoName e.V.

17. April 2019

Julia lightning talk

Yesterday I gave a short lightning talk at the Inria Devmeetup about Julia. My intention was to show a few of the design aspects of this rather recent programming language and, motivated by my experiences in my current postdoc project in the MATHERIALS team, to hint at interesting ongoing developments. For this I chose to show some (hopefully) representative code examples in a Jupyter notebook, interactively running the Julia code using an IJulia kernel.

To demonstrate Julia's multiple dispatch paradigm, I sketched how functionality (like the mysquare function in my example) can easily be implemented such that the same code can be combined with multiple different computational backends. Such backends include distributed array storage for large chunks of data, or arrays of static size, where the size information may be used at compile time to speed up the generated code for small problems. GPU backends are possible as well, but I did not go into this in my presentation. With respect to speed I showed some timings from a heat equation example (courtesy of Antoine Levitt) comparing a Julia, a Python and a C implementation. Last but not least I hinted at the interoperability with Python packages and Fortran code and showed a plotting example, where I used Zygote to automatically compute and show the gradient of a function using adjoint-mode automatic differentiation.

The notebook I used for the presentation is attached below and can also be found on github. For a rendered version of the notebook you can open it in nbviewer.

Link Licence
Julia: A numerical programming language (Notebook lightning talk) GNU GPL v3
View notebook in nbviewer GNU GPL v3

by Michael F. Herbst at 17. April 2019 09:00:00

12. April 2019

Coulomb Sturmian basis functions in electronic structure theory

After my first Coulomb Sturmian-related talk at the LCT last November, I was approached by Julien Toulouse and asked whether I could give another talk providing a little more insight into Coulomb Sturmians and their potential.

A few weeks ago, on 29th March, it was thus about time for a second Coulomb Sturmian seminar at LCT. In this recent talk I focused more on stressing the differences between Coulomb Sturmians and Gaussian basis sets and, motivated by our convergence results at the HF level, provided some hints as to what could be expected from Coulomb Sturmians in the future. Naturally this last part turned out to be rather far-fetched and speculative, and little if any bullet-proof evidence for the mentioned aspects can be given at the moment.

I really enjoyed giving this talk in the rather informal setting that arose, where many stimulating questions came up in the discussion, with plenty of opportunity for further research. As usual the slides are attached below.

Link Licence
Coulomb Sturmian basis functions in electronic structure theory (Slides seminar talk) Creative Commons License

by Michael F. Herbst at 12. April 2019 09:00:00

20. March 2019

Insantity Industries

simple iwd-based wifi

The iNet Wireless Daemon (or iwd for short) is the new superior tool for managing wireless devices on Linux. The reasons why, and other details of iwd, shall be left to others.

This post shall give a brief introduction to a simple, functional iwd-based setup. Make sure you have disabled all alternative services that could interfere with this setup.

Network setup


  • Linux >= 4.20 (at least for things like eduroam or other EAP-wifis)
  • iwd
  • systemd-networkd
  • NetworkManager/Connman/other-GUI-Interface (optional, potentially alternatively to systemd-networkd)


Start iwd. On systemd-based systems this is most likely just a simple systemctl enable iwd && systemctl start iwd. One can verify that iwd runs by issuing iwctl. If it can connect to iwd, you'll get an iwctl-shell.


Create a file /etc/systemd/network/ containing

[Match]
Name=<name of your wifi device>

[Network]
DHCP=yes


and enable systemd-networkd. Now you have iwd bringing your wifi up and systemd-networkd getting you an IP via DHCP on that interface quickly afterwards.

At some point iwd intends to implement DHCP as well, but as of writing this, this is not yet the case and needs to be done by e.g. systemd-networkd.

Remarks on GUIs

If you use NetworkManager, you have to enable the iwd-backend for NetworkManager to use it. In addition to that, double check if you have the right NetworkManager-Version for your iwd version. As iwd has still not reached version 1.0 as of time of writing, the API can still be subject to change if it turns out that things need to be changed to prevent headaches in the future.

Working with it

You can now connect to simple PSK-wifi-networks in the iwctl-shell:

[iwctl] station <devicename> get-networks
[iwctl] station <devicename> connect network-name

iwd will ask you for the password, memorize it for later connections and autoconnect the next time the network appears.

file-based config

If you have more complex wifi-setups, you can place a configuration file in /var/lib/iwd. The files must be named networkname.protocol. You can find the protocol listed in the output of get-networks in the iwctl-shell. To fill the file, take a look at the network configuration settings in the iwd documentation.

Example: eduroam

To get eduroam running (at least for institutions using TTLS as the EAP-Method), create the file /var/lib/iwd/eduroam.8021x containing


For the University Heidelberg for example, <certificate.pem> should be /etc/ssl/certs/Deutsche_Telekom_Root_CA_2.pem and <anonymous-identity> should be There are institutions using MSCHAPv2 (which has been broken by now) for the phase2-authentication, those institutions will likely require a different configuration for eduroam according to their specifications.
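As a rough illustration of the file-based configuration described above, the following Python sketch renders an EAP-TTLS profile and the file name iwd expects for it. The key names follow the iwd network configuration documentation as I understand it; the choice of Tunneled-PAP as phase 2 method, the helper names and the <username>/<password> placeholders are assumptions of this sketch, so check your institution's specification before using anything like it. The sketch also ignores iwd's special encoding of network names containing unusual characters.

```python
from pathlib import Path


def iwd_profile_name(ssid: str, protocol: str) -> str:
    """File name in the form <networkname>.<protocol>, e.g. eduroam.8021x."""
    return f"{ssid}.{protocol}"


def eduroam_ttls_profile(ca_cert: str, anonymous_identity: str,
                         username: str, password: str) -> str:
    """Render an EAP-TTLS profile; Tunneled-PAP as phase 2 is an assumption."""
    lines = [
        "[Security]",
        "EAP-Method=TTLS",
        f"EAP-Identity={anonymous_identity}",
        f"EAP-TTLS-CACert={ca_cert}",
        "EAP-TTLS-Phase2-Method=Tunneled-PAP",
        f"EAP-TTLS-Phase2-Identity={username}",
        f"EAP-TTLS-Phase2-Password={password}",
    ]
    return "\n".join(lines) + "\n"


# Placeholders are kept as placeholders; fill in your own values.
profile = eduroam_ttls_profile(
    ca_cert="/etc/ssl/certs/Deutsche_Telekom_Root_CA_2.pem",
    anonymous_identity="<anonymous-identity>",
    username="<username>",
    password="<password>",
)
target = Path("/var/lib/iwd") / iwd_profile_name("eduroam", "8021x")
print(target)
print(profile, end="")
```

The rendered text would then be written to the printed path as root, readable only by root since it contains a password.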

Final remarks

You might want to take a look at the previous post to deal with race conditions that might be introduced due to iwd being significantly faster than other wifi-solutions on Linux.

by Jonas Große Sundrup at 20. March 2019 23:00:00

17. March 2019

ctx: Key-value datastructure for organised hierarchical storage

As an aside from my plane-wave DFT project at CERMICS, I spent some time in the last few weeks polishing up the ctx context library and publishing it via github. The ctx library provides one approach to tackling a common challenge in larger scientific simulation codes, namely how to organise simulation data.

Often each substep of a larger simulation code by itself already requires a large amount of input data in order to produce its results. To increase the flexibility and modularity of the code base, the data/parameters required for one part of the procedure are best hidden from the respective others. Naturally, complete separation cannot be achieved, since the steps do need to exchange some results or input. The question is therefore where and how to find a good balance between accessing data from other simulation steps and shielding them.

Typically, simulation procedures are inherently hierarchical. E.g. the computation of a particular property of an electronic structure might be achieved by solving a linear-response equation, which in turn requires solving a linear system of equations, e.g. by a contraction-based iterative algorithm, which in turn requires the computation of matrix-vector products. In such a hierarchy of algorithms data only needs to flow up or down, but not sideways. This is to say that, e.g., computing a second property via another linear-response equation might again use a similar tree of methods, but does not necessarily need access to all individual details of the first tree. This immediately leads to the approach of a tree-like, hierarchical storage for data and parameters, where a simulation substep may only work on its own subtree, but not on any of its parents or siblings.

This is where the library ctx comes in. It offers a C++ implementation of a tree-like string-to-value mapping, called CtxMap, which allows storing data of arbitrary type. In line with the above approach, a key in a CtxMap is a path-like string such as /this/is/a/path/to/a/value. Functionality such as views into subtrees or C++ iterators over ranges of keys greatly facilitates navigating and accessing the data in the CtxMap from different parts of the code. Since all key objects are stored as std::shared_ptr objects, memory safety as well as good integration into the C++ standard library are assured.
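To make the idea of path-like keys and subtree views a bit more concrete, here is a toy sketch in Python. It is not the ctx API (ctx is a C++ library and its actual interface is documented on the project page); the class name PathMap and all key names are made up for illustration.

```python
class PathMap:
    """Toy tree-like string-to-value map with path-like keys (not the ctx API)."""

    def __init__(self, storage=None, prefix="/"):
        # The storage dict is shared between a map and all views derived from it.
        self._storage = {} if storage is None else storage
        self._prefix = prefix  # always ends in "/"

    def _full(self, key):
        # Translate a key relative to this view into an absolute path.
        return self._prefix + key.strip("/")

    def __setitem__(self, key, value):
        self._storage[self._full(key)] = value

    def __getitem__(self, key):
        return self._storage[self._full(key)]

    def view(self, subpath):
        """Return a view restricted to one subtree; it shares the storage."""
        return PathMap(self._storage, self._full(subpath) + "/")

    def keys(self):
        """Keys below this view's root, relative to that root."""
        return sorted("/" + k[len(self._prefix):]
                      for k in self._storage if k.startswith(self._prefix))


root = PathMap()
root["/response/solver/tolerance"] = 1e-6
root["/response/solver/max_iter"] = 100
root["/scf/n_bands"] = 8

solver = root.view("/response/solver")  # the substep only sees its own subtree
solver["residual"] = 3.2e-7             # written back into the shared tree
print(solver.keys())
print(root["/response/solver/residual"])
```

A substep that is handed only the solver view can read and write below /response/solver, but it has no handle on /scf or on its parent, which is the kind of shielding described above.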

Originally I started working on ctx when I wanted to connect our molsturm package to the adcman code, also developed at the Dreuw group in Heidelberg. In the design of ctx I took strong inspiration from both the PamMap and the GenMap data structures I have been playing with, as well as the libctx library by E. Epifanovsky et al. The latter code had similar goals of providing hierarchical storage of data in mind and was used for this purpose inside the Q-Chem quantum chemistry code. In this respect I am very happy that ctx has now become the successor of libctx inside Q-Chem, providing ctx with an application in the context of a larger code base.

For further details and the ctx source code, see the ctx project page on github. ctx can be cited using DOI 10.5281/zenodo.2590706.

by Michael F. Herbst at 17. March 2019 23:00:00

11. March 2019

Insantity Industries

Racefree iwd

The iNet Wireless Daemon (or iwd for short) is the new superior tool for managing wireless devices on Linux. The reasons why, and other details of iwd, shall be left to others.

iwd has, at the time of writing, two major issues:

  • requiring a (at the time of writing) comparatively recent Linux version (>= 4.20) due to upstream bugfixes in its crypto subsystem
  • being so fast that it might bring up the wireless interface before it can be renamed by udev

While the first issue will resolve itself over time, the second one is a little harder to tackle.

The Problem

When a Linux system boots up, all the interfaces get initialized. They are named by Linux in order of showing up, typically eth0, eth1, etc. for wired cards and wlan0, wlan1, etc. for wireless cards. Which card gets which identifier depends solely on the order in which Linux notices them, and it isn't even guaranteed that wireless cards will be named wlan*; on some systems they also end up with an eth*-identifier. Therefore every piece of software doing things with interfaces based on their names is at risk of picking the wrong network card at some point.

To tackle this problem, systemd introduced persistent naming. Now your network cards are no longer named eth0, eth1 and wlan0, but enp2s5 or something like that, generated from properties of the network card. Much less beautiful, but persistent and therefore very desirable, fixing the race condition described above – however, doing so by introducing another race condition.

The reason for this is that this rename can only happen when the network interface is down. iwd, however, being much faster than other alternatives (namely wpa_supplicant and everything that's built on it) might already bring the wireless interface up before udev had the chance to rename the interface, which is then no longer possible, leaving it with the (race condition afflicted) kernel name.

The Solution

To solve this issue, we can delay starting iwd until the rename has happened, removing the second race condition from the mix. To do so we first need to find the name of our wireless interface via ip link, yielding something like

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT ...
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: wlp3s2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN ...
    link/ether 8c:7b:83:7d:04:82 brd ff:ff:ff:ff:ff:ff
3: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode ...
    link/ether e4:2d:9a:bb:e4:f4 brd ff:ff:ff:ff:ff:ff

The wireless device here is named wlp3s2, the wired one is enp4s0. In systemd, both cards get a device-unit upon appearance, for the wireless device this is sys-subsystem-net-devices-wlp3s2.device. They can be found in the unit list one gets from systemctl list-units.

To ensure iwd is only started after the wireless card was renamed to wlp3s2, we just need to make the unit starting it wait for said device-unit. This is most easily accomplished by issuing a systemctl edit iwd.service and entering:


systemd will then add those options dynamically to the loaded iwd.service-file when loading it, making it wait for the device wlp3s2 and start only after that unit is done setting up, by which point the rename has already been performed.

This, of course, requires manual intervention and only works for a single wireless network card, but it should suffice for most setups and, while not removing the root of the race condition, resolves its symptoms with iwd.
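A drop-in matching this description could be generated along the following lines. This is a sketch under assumptions: the use of Requires= together with After= is my reading of "wait for the device and start after that unit", and the helper names are made up. Interface names containing characters other than letters and digits would need systemd-escape and are rejected here.

```python
def device_unit(interface: str) -> str:
    """systemd device unit for a network interface with a plain alphanumeric name."""
    if not interface.isalnum():
        raise ValueError("names with special characters need systemd-escape")
    return f"sys-subsystem-net-devices-{interface}.device"


def iwd_dropin(interface: str) -> str:
    """Drop-in text making a service depend on, and start after, the device unit."""
    unit = device_unit(interface)
    return f"[Unit]\nRequires={unit}\nAfter={unit}\n"


# Text one could enter in the editor opened by `systemctl edit iwd.service`.
print(iwd_dropin("wlp3s2"), end="")
```

For the interface from the ip link output above this prints a [Unit] section referring to sys-subsystem-net-devices-wlp3s2.device in both directives.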

by Jonas Große Sundrup at 11. March 2019 23:00:00

10. March 2019

sECuREs Webseite

Winding down my Debian involvement

This post is hard to write, both in the emotional sense but also in the “I would have written a shorter letter, but I didn’t have the time” sense. Hence, please assume the best of intentions when reading it—it is not my intention to make anyone feel bad about their contributions, but rather to provide some insight into why my frustration level ultimately exceeded the threshold.

Debian has been in my life for well over 10 years at this point.

A few weeks ago, I visited some old friends at the Zürich Debian meetup after a multi-year period of absence. On my bike ride home, it occurred to me that the topics of our discussions had remarkable overlap with my last visit. We had a discussion about the merits of systemd, which took a detour to respect in open source communities, returned to processes in Debian and eventually culminated in democracies and their theoretical/practical failings. Admittedly, that last one might be a Swiss thing.

I say this not to knock on the Debian meetup, but because it prompted me to reflect on what feelings Debian is invoking lately and whether it’s still a good fit for me.

So I’m finally making a decision that I should have made a long time ago: I am winding down my involvement in Debian to a minimum.

What does this mean?

Over the coming weeks, I will:

  • transition packages to be team-maintained where it makes sense
  • remove myself from the Uploaders field on packages with other maintainers
  • orphan packages where I am the sole maintainer

I will try to keep up best-effort maintenance of the service and the service, but any help would be much appreciated.

For all intents and purposes, please treat me as permanently on vacation. I will try to be around for administrative issues (e.g. permission transfers) and questions addressed directly to me, provided they are easy enough to answer.


When I joined Debian, I was still studying, i.e. I had luxurious amounts of spare time. Now, over 5 years of full time work later, my day job taught me a lot, both about what works in large software engineering projects and how I personally like my computer systems. I am very conscious of how I spend the little spare time that I have these days.

The following sections each deal with what I consider a major pain point, in no particular order. Some of them influence each other—for example, if changes worked better, we could have a chance at transitioning packages to be more easily machine readable.

Change process in Debian

The last few years, my current team at work conducted various smaller and larger refactorings across the entire code base (touching thousands of projects), so we have learnt a lot of valuable lessons about how to effectively do these changes. It irks me that Debian works almost the opposite way in every regard. I appreciate that every organization is different, but I think a lot of my points do actually apply to Debian.

In Debian, packages are nudged in the right direction by a document called the Debian Policy, or its programmatic embodiment, lintian.

While it is great to have a lint tool (for quick, local/offline feedback), it is even better to not require a lint tool at all. The team conducting the change (e.g. the C++ team introduces a new hardening flag for all packages) should be able to do their work transparent to me.

Instead, currently, all packages become lint-unclean, all maintainers need to read up on what the new thing is, how it might break, whether/how it affects them, manually run some tests, and finally decide to opt in. This causes a lot of overhead and manually executed mechanical changes across packages.

Notably, the cost of each change is distributed onto the package maintainers in the Debian model. At work, we have found that the opposite works better: if the team behind the change is put in power to do the change for as many users as possible, they can be significantly more efficient at it, which reduces the total cost and time a lot. Of course, exceptions (e.g. a large project abusing a language feature) should still be taken care of by the respective owners, but the important bit is that the default should be the other way around.

Debian is lacking tooling for large changes: it is hard to programmatically deal with packages and repositories (see the section below). The closest to “sending out a change for review” is to open a bug report with an attached patch. I thought the workflow for accepting a change from a bug report was too complicated and started mergebot, but only Guido ever signaled interest in the project.

Culturally, reviews and reactions are slow. There are no deadlines. I literally sometimes get emails notifying me that a patch I sent out a few years ago (!!) is now merged. This turns projects from a small number of weeks into many years, which is a huge demotivator for me.

Interestingly enough, you can see artifacts of the slow online activity manifest itself in the offline culture as well: I don’t want to be discussing systemd’s merits 10 years after I first heard about it.

Lastly, changes can easily be slowed down significantly by holdouts who refuse to collaborate. My canonical example for this is rsync, whose maintainer refused my patches to make the package use debhelper purely out of personal preference.

Granting so much personal freedom to individual maintainers prevents us as a project from raising the abstraction level for building Debian packages, which in turn makes tooling harder.

What would things look like in a better world?

  1. As a project, we should strive towards more unification. Uniformity still does not rule out experimentation, it just changes the trade-off from easier experimentation and harder automation to harder experimentation and easier automation.
  2. Our culture needs to shift from “this package is my domain, how dare you touch it” to a shared sense of ownership, where anyone in the project can easily contribute (reviewed) changes without necessarily even involving individual maintainers.

To learn more about what successful large changes can look like, I recommend my colleague Hyrum Wright’s talk “Large-Scale Changes at Google: Lessons Learned From 5 Yrs of Mass Migrations”.

Fragmented workflow and infrastructure

Debian generally seems to prefer decentralized approaches over centralized ones. For example, individual packages are maintained in separate repositories (as opposed to in one repository), each repository can use any SCM (git and svn are common ones) or no SCM at all, and each repository can be hosted on a different site. Of course, what you do in such a repository also varies subtly from team to team, and even within teams.

In practice, non-standard hosting options are used rarely enough to not justify their cost, but frequently enough to be a huge pain when trying to automate changes to packages. Instead of using GitLab’s API to create a merge request, you have to design an entirely different, more complex system, which deals with intermittently (or permanently!) unreachable repositories and abstracts away differences in patch delivery (bug reports, merge requests, pull requests, email, …).

Wildly diverging workflows are not just a temporary problem either. I participated in long discussions about different git workflows during DebConf 13, and gather that there were similar discussions in the meantime.

Personally, I cannot keep enough details of the different workflows in my head. Every time I touch a package that works differently than mine, it frustrates me immensely to re-learn aspects of my day-to-day.

After noticing workflow fragmentation in the Go packaging team (which I started), I tried fixing this with the workflow changes proposal, but did not succeed in implementing it. The lack of effective automation and slow pace of changes in the surrounding tooling despite my willingness to contribute time and energy killed any motivation I had.

Old infrastructure: package uploads

When you want to make a package available in Debian, you upload GPG-signed files via anonymous FTP. There are several batch jobs (the queue daemon, unchecked, dinstall, possibly others) which run on fixed schedules (e.g. dinstall runs at 01:52 UTC, 07:52 UTC, 13:52 UTC and 19:52 UTC).

Depending on timing, I estimated that you might wait for over 7 hours (!!) before your package is actually installable.
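To get a feeling for the schedule-induced part of that wait, the following Python sketch computes the time until the next dinstall run for a given upload time, using only the four run times quoted above. It does not model the queue daemon, unchecked, the run time of dinstall itself or mirror propagation, which would account for the rest of the estimate; the function name is made up for this example.

```python
from datetime import datetime, time, timedelta, timezone

# dinstall start times in UTC, as quoted in the text above.
DINSTALL_RUNS_UTC = [time(1, 52), time(7, 52), time(13, 52), time(19, 52)]


def next_dinstall(now: datetime) -> datetime:
    """Return the first scheduled dinstall start strictly after `now`."""
    now = now.astimezone(timezone.utc)
    for day_offset in (0, 1):
        day = now.date() + timedelta(days=day_offset)
        for run in DINSTALL_RUNS_UTC:
            candidate = datetime.combine(day, run, tzinfo=timezone.utc)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: the following day always has a later run")


upload = datetime(2019, 3, 10, 7, 53, tzinfo=timezone.utc)  # just missed 07:52
wait = next_dinstall(upload) - upload
print(wait)  # 5:59:00
```

An upload that just misses a run therefore waits almost six hours for the next dinstall to even start.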

What’s worse for me is that feedback to your upload is asynchronous. I like to do one thing, be done with it, move to the next thing. The current setup requires a many-minute wait and costly task switch for no good technical reason. You might think a few minutes aren’t a big deal, but when all the time I can spend on Debian per day is measured in minutes, this makes a huge difference in perceived productivity and fun.

The last communication I can find about speeding up this process is ganneff’s post from 2008.

What would things look like in a better world?

  1. Anonymous FTP would be replaced by a web service which ingests my package and returns an authoritative accept or reject decision in its response.
  2. For accepted packages, there would be a status page displaying the build status and when the package will be available via the mirror network.
  3. Packages should be available within a few minutes after the build completed.

Old infrastructure: bug tracker

I dread interacting with the Debian bug tracker. debbugs is a piece of software (from 1994) which is only used by Debian and the GNU project these days.

Debbugs processes emails, which is to say it is asynchronous and cumbersome to deal with. Despite running on the fastest machines we have available in Debian (or so I was told when the subject last came up), its web interface loads very slowly.

Notably, the web interface is read-only. Setting up a working email setup for reportbug(1) or manually dealing with attachments is a rather big hurdle.

For reasons I don’t understand, every interaction with debbugs results in many different email threads.

Aside from the technical implementation, I also can never remember the different ways that Debian uses pseudo-packages for bugs and processes. I need them rarely enough to not establish a mental model of how they are set up, or working memory of how they are used, but frequently enough to be annoyed by this.

What would things look like in a better world?

  1. Debian would switch from a custom bug tracker to a (any) well-established one.
  2. Debian would offer automation around processes. It is great to have a paper-trail and artifacts of the process in the form of a bug report, but the primary interface should be more convenient (e.g. a web form).

Old infrastructure: mailing list archives

It baffles me that in 2019, we still don’t have a conveniently browsable threaded archive of mailing list discussions. Email and threading is more widely used in Debian than anywhere else, so this is somewhat ironic. Gmane used to paper over this issue, but Gmane’s availability over the last few years has been spotty, to say the least (it is down as I write this).

I tried to contribute a threaded list archive, but our listmasters didn’t seem to care or want to support the project.

Debian is hard to machine-read

While it is obviously possible to deal with Debian packages programmatically, the experience is far from pleasant. Everything seems slow and cumbersome. I have picked just 3 quick examples to illustrate my point.

debiman needs help from piuparts in analyzing the alternatives mechanism of each package to display the manpages of e.g. psql(1). This is because maintainer scripts modify the alternatives database by calling shell scripts. Without actually installing a package, you cannot know which changes it does to the alternatives database.

pk4 needs to maintain its own cache to look up package metadata based on the package name. Other tools parse the apt database from scratch on every invocation. A proper database format, or at least a binary interchange format, would go a long way.

Debian Code Search wants to ingest new packages as quickly as possible. There used to be a fedmsg instance for Debian, but it no longer seems to exist. It is unclear where to get notifications from for new packages, and where best to fetch those packages.

Complicated build stack

See my “Debian package build tools” post. It really bugs me that the sprawl of tools is not seen as a problem by others.

Developer experience pretty painful

Most of the points discussed so far deal with the experience in developing Debian, but as I recently described in my post “Debugging experience in Debian”, the experience when developing using Debian leaves a lot to be desired, too.

I have more ideas

At this point, the article is getting pretty long, and hopefully you got a rough idea of my motivation.

While I described a number of specific shortcomings above, the final nail in the coffin is actually the lack of a positive outlook. I have more ideas that seem really compelling to me, but, based on how my previous projects have been going, I don’t think I can make any of these ideas happen within the Debian project.

I intend to publish a few more posts about specific ideas for improving operating systems here. Stay tuned.

Lastly, I hope this post inspires someone, ideally a group of people, to improve the developer experience within Debian.

10. March 2019 20:43:00

15. February 2019

sECuREs Webseite

Debugging experience in Debian

Recently, a user reported that they don’t see window titles in i3 when running i3 on a Raspberry Pi with Debian.

I copied the latest Raspberry Pi Debian image onto an SD card, booted it, and was able to reproduce the issue.

Conceptually, at this point, I should be able to install and start gdb, set a break point and step through the code.

Enabling debug symbols in Debian

Debian, by default, strips debug symbols when building packages to conserve disk space and network bandwidth. The motivation is very reasonable: most users will never need the debug symbols.

Unfortunately, obtaining debug symbols when you do need them is unreasonably hard.

We begin by configuring an additional apt repository which contains automatically generated debug packages:

raspi# cat >>/etc/apt/sources.list.d/debug.list <<'EOT'
deb buster-debug main contrib non-free
EOT
raspi# apt update

Notably, not all Debian packages have debug packages. As the DebugPackage Debian Wiki page explains, debhelper/9.20151219 started generating debug packages (ending in -dbgsym) automatically. Packages which have not been updated might come with their own debug packages (ending in -dbg) or might not preserve debug symbols at all!

Now that we can install debug packages, how do we know which ones we need?

Finding debug symbol packages in Debian

For debugging i3, we obviously need at least the i3-dbgsym package, but i3 uses a number of other libraries through whose code we may need to step.

The debian-goodies package ships a tool called find-dbgsym-packages which prints the required packages to debug an executable, core dump or running process:

raspi# apt install debian-goodies
raspi# apt install $(find-dbgsym-packages $(which i3))

Now we should have symbol names and line number information available in gdb. But for effectively stepping through the program, access to the source code is required.

Obtaining source code in Debian

Naively, one would assume that apt source should be sufficient for obtaining the source code of any Debian package. However, apt source defaults to the package candidate version, not the version you have installed on your system.

I have addressed this issue with the pk4 tool, which defaults to the installed version.
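Without pk4, one way to get matching sources is to ask dpkg for the installed source package and version and pass both to apt source explicitly. The Python sketch below illustrates this; it is not how pk4 works internally, the function names are made up, and the package name and version in the example are illustrative values rather than output from a real system.

```python
import subprocess


def installed_source(binary_package):
    """Ask dpkg for the source package name and version of an installed package."""
    out = subprocess.run(
        ["dpkg-query", "-W",
         r"-f=${source:Package} ${source:Version}\n", binary_package],
        check=True, capture_output=True, text=True,
    ).stdout
    # With multiple architectures installed there is one line per instance.
    name, version = out.splitlines()[0].split()
    return name, version


def apt_source_command(source_package, version):
    """Build an `apt source` call pinned to an explicit version."""
    return ["apt", "source", f"{source_package}={version}"]


# Illustrative values; on a Debian system use installed_source("i3-wm") instead.
print(" ".join(apt_source_command("i3-wm", "4.16-1")))
```

Pinning the version this way only works if the corresponding deb-src entry still carries that exact version.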

Before we can extract any sources, we need to configure yet another apt repository:

raspi# cat >>/etc/apt/sources.list.d/source.list <<'EOT'
deb-src buster main contrib non-free
EOT
raspi# apt update

Regardless of whether you use apt source or pk4, one remaining problem is the directory mismatch: the debug symbols contain a certain path, and that path is typically not where you extracted your sources to. While debugging, you will need to tell gdb about the location of the sources. This is tricky when you debug a call across different source packages:

(gdb) pwd
Working directory /usr/src/i3.
(gdb) list main
229     * the main loop. */
230     ev_unref(main_loop);
231   }
232 }
234 int main(int argc, char *argv[]) {
235  /* Keep a symbol pointing to the I3_VERSION string constant so that
236   * we have it in gdb backtraces. */
237  static const char *_i3_version __attribute__((used)) = I3_VERSION;
238  char *override_configpath = NULL;
(gdb) list xcb_connect
484	../../src/xcb_util.c: No such file or directory.

See Specifying Source Directories in the gdb manual for the dir command which allows you to add multiple directories to the source path. This is pretty tedious, though, and does not work for all programs.

Positive example: Fedora

While Fedora conceptually shares all the same steps, the experience on Fedora is so much better: when you run gdb /usr/bin/i3, it will tell you what the next step is:

# gdb /usr/bin/i3
Reading symbols from /usr/bin/i3...(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install i3-4.16-1.fc28.x86_64

Watch what happens when we run the suggested command:

# dnf debuginfo-install i3-4.16-1.fc28.x86_64
enabling updates-debuginfo repository
enabling fedora-debuginfo repository
  i3-debuginfo.x86_64 4.16-1.fc28
  i3-debugsource.x86_64 4.16-1.fc28

A single command understood our intent, enabled the required repositories and installed the required packages, both for debug symbols and source code (stored in e.g. /usr/src/debug/i3-4.16-1.fc28.x86_64). Unfortunately, gdb doesn’t seem to locate the sources, which seems like a bug to me.

One downside of Fedora’s approach is that gdb will only print all required dependencies once you actually run the program, so you may need to run multiple dnf commands.

In an ideal world

Ideally, none of the manual steps described above would be necessary. It seems absurd to me that so much knowledge is required to efficiently debug programs in Debian. Case in point: I only learnt about find-dbgsym-packages a few days ago when talking to one of its contributors.

Installing gdb should be all that a user needs to do. Debug symbols and sources can be transparently provided through a lazy-loading FUSE file system. If our build/packaging infrastructure assured predictable paths and automated debug symbol extraction, we could have transparent, quick and reliable debugging of all programs within Debian.

NixOS’s dwarffs is an implementation of this idea.


While I agree with the removal of debug symbols as a general optimization, I think every Linux distribution should strive to provide an entirely transparent debugging experience: you should not even have to know that debug symbols are not present by default. Debian really falls short in this regard.

Getting Debian to a fully transparent debugging experience requires a lot of technical work and a lot of social convincing. In my experience, programmatically working with the Debian archive and packages is tricky, and ensuring that all packages in a Debian release have debug packages (let alone predictable paths) seems entirely unachievable due to the fragmentation of packaging infrastructure and holdouts blocking any progress.

My go-to example is rsync’s debian/rules, which intentionally (!) still has not adopted debhelper. It is not a surprise that there are no debug symbols for rsync in Debian.

15. February 2019 12:11:01

11. February 2019

Insanity Industries

Self-hosting Threema Safe

What is Threema Safe?

Threema Safe is the Threema messenger's solution for backing up your encryption ID and other data, so that they are not lost when you decide to smash your phone against a nearby anvil or the phone itself dies unexpectedly. From this backup you can restore the aforementioned data into a new Threema installation.

Threema Safe can also be self-hosted.

How does it work

Threema Safe regularly backs things up into a WebDAV directory of your choice. By default, the Threema server is configured for this, but any WebDAV server, from a standard web server to a Nextcloud, can be configured as a backup target.

Server-side setup

As a standard webserver can be used as a backup target, we will show the necessary configuration for an nginx to act as a valid backup server for Threema Safe.

The main configuration file of the server, /etc/nginx.conf, reads

user www-data;
worker_processes 4;
pid /run/;

http {

    # standard stuff
    include /etc/nginx/mime.types;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    ssl_dhparam /path/to/dhparams;

    server {

        server_name some.arbitrary.domain;

        listen 443 ssl default_server;      # ipv4
        listen [::]:443 ssl default_server; # ipv6

        location / {
            root    /path/to/webdav/contents;
            client_body_temp_path /tmp;
            dav_methods     PUT DELETE MKCOL COPY MOVE;

            create_full_put_path  on;
            dav_access    user:rw;
            autoindex    on;

            # if desired, a htpasswd-file can be provided to require
            # authentication in front of the nginx
            auth_basic "you shall not pass";
            auth_basic_user_file /path/to/htpasswd;
        }
    }
}

In the root of the WebDAV folder, create a folder (optional) that contains two things (not optional):

  • a file config
  • a directory backups that must be writable (and likely readable for restore)

The config file reads:

{
   "maxBackupBytes": 524288,
   "retentionDays": 180
}

This concludes the server-side setup. To verify that WebDAV works correctly, open it in your browser and check that it responds as expected. The resulting directory layout looks like this:

# tree /path/to/webdav/contents
└── threema_safe
    ├── backups
    │   └── b5bb9d8014a0f9b1d[...]12f4850b878ae4944c   (only after first backup)
    └── config

Client side setup

In Threema, go to the menu and select "My backups". Activate Threema Safe and tap "Expert Settings", where you can enter your custom server. In our example, this would be https://some.arbitrary.domain/threema_safe. If you configured your nginx to require authentication, specify your access credentials. Then tap "Backup now" and you should be good to go!

by Jonas Große Sundrup at 11. February 2019 23:00:00

05. February 2019

sECuREs Webseite

Looking for a new Raspberry Pi image maintainer

This is taken care of: Gunnar Wolf has taken on maintenance of the Raspberry Pi image. Thank you!

(Cross-posting this message I sent to pkg-raspi-maintainers for broader visibility.)

I started building Raspberry Pi images because I thought there should be an easy, official way to install Debian on the Raspberry Pi.

I still believe that, but I’m not actually using Debian on any of my Raspberry Pis anymore¹, so my personal motivation to do any work on the images is gone.

On top of that, I realize that my commitments exceed my spare time capacity, so I need to get rid of responsibilities.

Therefore, I’m looking for someone to take up maintainership of the Raspberry Pi images. Numerous people have reached out to me with thank you notes and questions, so I think the user interest is there. Also, I’ll be happy to answer any questions that you might have and that I can easily answer. Please reply here (or in private) if you’re interested.

If I can’t find someone within the next 7 days, I’ll put up an announcement message in the raspi3-image-spec README, wiki page, and my blog posts, stating that the image is unmaintained and looking for a new maintainer.

Thanks for your understanding,

① just in case you’re curious, I’m now running cross-compiled Go programs directly under a Linux kernel and minimal userland, see

05. February 2019 08:42:13

TurboPFor: an analysis


I have recently been looking into speeding up Debian Code Search. As a quick reminder, search engines answer queries by consulting an inverted index: a map from term to documents containing that term (called a “posting list”). See the Debian Code Search Bachelor Thesis (PDF) for a lot more details.

Currently, Debian Code Search does not store positional information in its index, i.e. the index can only reveal that a certain trigram is present in a document, not where or how often.
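To make the limitation concrete, here is a minimal sketch of a trigram index without positions. The names buildIndex and candidates are hypothetical and not taken from the Debian Code Search code base. Note how the second document is reported as a candidate even though it does not contain the query string, which is why each candidate still has to be read:

```go
package main

import (
	"fmt"
	"sort"
)

// buildIndex maps every trigram to its posting list: the sorted IDs of
// the documents that contain the trigram. No positions are recorded.
func buildIndex(docs []string) map[string][]uint32 {
	index := make(map[string][]uint32)
	for id, content := range docs {
		seen := make(map[string]bool)
		for i := 0; i+3 <= len(content); i++ {
			tri := content[i : i+3]
			if !seen[tri] {
				seen[tri] = true
				index[tri] = append(index[tri], uint32(id))
			}
		}
	}
	return index
}

// candidates returns the documents that contain every trigram of query.
// They are only possibly-matching: without positions, each candidate
// still has to be read to check that the trigrams are adjacent.
func candidates(index map[string][]uint32, query string) []uint32 {
	hits := make(map[uint32]int)
	trigrams := 0
	for i := 0; i+3 <= len(query); i++ {
		trigrams++
		for _, id := range index[query[i:i+3]] {
			hits[id]++
		}
	}
	var result []uint32
	for id, n := range hits {
		if n == trigrams {
			result = append(result, id)
		}
	}
	sort.Slice(result, func(a, b int) bool { return result[a] < result[b] })
	return result
}

func main() {
	docs := []string{"abcd", "abc bcd", "xyz"}
	index := buildIndex(docs)
	// Document 1 contains both trigrams of "abcd", but not the string itself.
	fmt.Println(candidates(index, "abcd"))
}
```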

From analyzing Debian Code Search queries, I knew that identifier queries (70%) massively outnumber regular expression queries (30%). When processing identifier queries, storing positional information in the index enables a significant optimization: instead of identifying the possibly-matching documents and having to read them all, we can determine matches from querying the index alone, no document reads required.

This moves the bottleneck: having to read all possibly-matching documents requires a lot of expensive random I/O, whereas having to decode long posting lists requires a lot of cheap sequential I/O.

Of course, storing positions comes with a downside: the index is larger, and a larger index takes more time to decode when querying.

Hence, I have been looking at various posting list compression/decoding techniques, to figure out whether we could switch to a technique which would retain (or improve upon!) current performance despite much longer posting lists and produce a small enough index to fit on our current hardware.


I started looking into this space because of Daniel Lemire’s Stream VByte post. As usual, Daniel’s work is well presented, easily digestible and accompanied by not just one, but multiple implementations.

I also looked for scientific papers to learn about the state of the art and classes of different approaches in general. The best I could find is Compression, SIMD, and Postings Lists. If you don’t have access to the paper, I hear that Sci-Hub is helpful.

The paper is from 2014, and doesn’t include all algorithms. If you know of a better paper, please let me know and I’ll include it here.

Eventually, I stumbled upon an algorithm/implementation called TurboPFor, which the rest of the article tries to shine some light on.


If you’re wondering: PFor stands for Patched Frame Of Reference and describes a family of algorithms. The principle is explained e.g. in SIMD Compression and the Intersection of Sorted Integers (PDF).

The TurboPFor project’s README file claims that TurboPFor256 compresses with a rate of 5.04 bits per integer, and can decode with 9400 MB/s on a single thread of an Intel i7-6700 CPU.

For Debian Code Search, we use 32-bit unsigned integers (uint32), which TurboPFor will compress into as few bits as required.

Dividing Debian Code Search’s file sizes by the total number of integers, I get similar values, at least for the docid index section:

  • 5.49 bits per integer for the docid index section
  • 11.09 bits per integer for the positions index section

I can confirm the order of magnitude of the decoding speed, too. My benchmark calls TurboPFor from Go via cgo, which introduces some overhead. To exclude disk speed as a factor, data comes from the page cache. The benchmark sequentially decodes all posting lists in the specified index, using as many threads as the machine has cores¹:

  • ≈1400 MB/s on a 1.1 GiB docid index section
  • ≈4126 MB/s on a 15.0 GiB position index section

I think the numbers differ because the position index section contains larger integers (requiring more bits). I repeated both benchmarks, capped to 1 GiB, and decoding speeds still differed, so it is not just the size of the index.

Compared to Stream VByte, a TurboPFor256 index comes in at just over half the size, while still reaching 83% of Stream VByte’s decoding speed. This seems like a good trade-off for my use-case, so I decided to have a closer look at how TurboPFor works.

① See cmd/gp4-verify/verify.go run on an Intel i9-9900K.


To confirm my understanding of the details of the format, I implemented a pure-Go TurboPFor256 decoder. Note that it is intentionally not optimized as its main goal is to use simple code to teach the TurboPFor256 on-disk format.

If you’re looking to use TurboPFor from Go, I recommend using cgo. cgo’s function call overhead is about 51ns as of Go 1.8, which will easily be offset by TurboPFor’s carefully optimized, vectorized (SSE/AVX) code.

With that caveat out of the way, you can find my teaching implementation at

I verified that it produces the same results as TurboPFor’s p4ndec256v32 function for all posting lists in the Debian Code Search index.

On-disk format

Note that TurboPFor does not fully define an on-disk format on its own. When encoding, it turns a list of integers into a byte stream:

size_t p4nenc256v32(uint32_t *in, size_t n, unsigned char *out);

When decoding, it decodes the byte stream into an array of integers, but needs to know the number of integers in advance:

size_t p4ndec256v32(unsigned char *in, size_t n, uint32_t *out);

Hence, you’ll need to keep track of the number of integers and length of the generated byte streams separately. When I talk about on-disk format, I’m referring to the byte stream which TurboPFor returns.

The TurboPFor256 format uses blocks of 256 integers each, followed by a trailing block — if required — which can contain fewer than 256 integers:

SIMD bitpacking is used for all blocks but the trailing block (which uses regular bitpacking). This is not merely an implementation detail for decoding: the on-disk structure is different for blocks which can be SIMD-decoded.

Each block starts with a 2 bit header, specifying the type of the block:

Each block type is explained in more detail in the following sections.

Note that none of the block types store the number of elements: you will always need to know how many integers you need to decode. Also, you need to know in advance how many bytes you need to feed to TurboPFor, so you will need some sort of container format.

Further, TurboPFor automatically chooses the best block type for each block.

Constant block

A constant block (all integers of the block have the same value) consists of a single value of a specified bit width ≤ 32. This value will be stored in each output element for the block. E.g., after calling decode(input, 3, output) with input being the constant block depicted below, output is {0xB8912636, 0xB8912636, 0xB8912636}.

The example shows the maximum number of bytes (5). Smaller integers will use fewer bytes: e.g. an integer which can be represented in 3 bits will only use 2 bytes.
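The byte counts above can be reproduced with a small sketch. It assumes that the block header (block type and bit width) occupies exactly one byte, which is inferred from the two sizes quoted above and not confirmed elsewhere; the function names are hypothetical.

```go
package main

import "fmt"

// constantBlockSize returns the on-disk size of a constant block:
// one assumed header byte plus the value, rounded up to whole bytes.
func constantBlockSize(bitWidth int) int {
	return 1 + (bitWidth+7)/8
}

// decodeConstant stores the single value in each of the n output elements.
func decodeConstant(value uint32, n int) []uint32 {
	out := make([]uint32, n)
	for i := range out {
		out[i] = value
	}
	return out
}

func main() {
	fmt.Println(constantBlockSize(32), constantBlockSize(3))
	fmt.Printf("%#x\n", decodeConstant(0xB8912636, 3))
}
```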

Bitpacking block

A bitpacking block specifies a bit width ≤ 32, followed by a stream of bits. Each value starts at the Least Significant Bit (LSB), i.e. the 3-bit values 0 (000b) and 5 (101b) are encoded as 101000b.
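A minimal, unoptimized sketch of LSB-first unpacking follows. It only covers the bit stream, not the block header, and assumes that bits are consumed from the least significant bit of each byte, with bytes in stream order; unpack is a hypothetical name.

```go
package main

import "fmt"

// unpack reads n values of the given bit width from data. The first
// value occupies the least significant bits of the first byte, and
// values may straddle byte boundaries.
func unpack(data []byte, width uint, n int) []uint32 {
	out := make([]uint32, n)
	var bitPos uint
	for i := range out {
		var v uint32
		for b := uint(0); b < width; b++ {
			pos := bitPos + b
			bit := (data[pos/8] >> (pos % 8)) & 1
			v |= uint32(bit) << b
		}
		out[i] = v
		bitPos += width
	}
	return out
}

func main() {
	// 101000b holds the 3-bit values 0 and 5.
	fmt.Println(unpack([]byte{0x28}, 3, 2))
}
```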

Bitpacking with exceptions (bitmap) block

The constant and bitpacking block types work well for integers which don’t exceed a certain width, e.g. for a series of integers of width ≤ 5 bits.

For a series of integers where only a few values exceed an otherwise common width (say, two values require 7 bits, the rest requires 5 bits), it makes sense to cut the integers into two parts: value and exception.

In the example below, decoding the third integer out2 (000b) requires combination with exception ex0 (10110b), resulting in 10110000b.

The number of exceptions can be determined by summing the 1 bits in the bitmap using the popcount instruction.
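The following sketch shows how the two parts could be recombined once they have been unpacked. It assumes that bit i of the bitmap (counting from the least significant bit) refers to the i-th integer of the block and that exceptions are stored in block order; patch and numExceptions are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
)

// numExceptions sums the 1 bits of the bitmap (popcount).
func numExceptions(bitmap []uint64) int {
	n := 0
	for _, word := range bitmap {
		n += bits.OnesCount64(word)
	}
	return n
}

// patch combines the low bits of each value with its exception, if the
// bitmap marks one. The exception supplies the bits above baseWidth.
func patch(base []uint32, baseWidth uint, bitmap []uint64, exceptions []uint32) []uint32 {
	out := make([]uint32, len(base))
	next := 0
	for i, v := range base {
		if (bitmap[i/64]>>(uint(i)%64))&1 == 1 {
			v |= exceptions[next] << baseWidth
			next++
		}
		out[i] = v
	}
	return out
}

func main() {
	base := []uint32{1, 2, 0}       // 3-bit values; the third is out2 (000b)
	bitmap := []uint64{1 << 2}      // only the third integer has an exception
	exceptions := []uint32{0x16}    // ex0 (10110b)
	fmt.Println(numExceptions(bitmap))
	fmt.Printf("%b\n", patch(base, 3, bitmap, exceptions))
}
```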

Bitpacking with exceptions (variable byte)

When the exceptions are not uniform enough, it makes sense to switch from bitpacking to a variable byte encoding:

Decoding: variable byte

The variable byte encoding used by the TurboPFor format is similar to the one used by SQLite, which is described, alongside other common variable byte encodings, at

Instead of using individual bits for dispatching, this format classifies the first byte (b[0]) into ranges:

  • [0—176]: the value is b[0]
  • [177—240]: a 14 bit value is in b[0] (6 high bits) and b[1] (8 low bits)
  • [241—248]: a 19 bit value is in b[0] (3 high bits), b[1] and b[2] (16 low bits)
  • [249—255]: a 32 bit value is in b[1], b[2], b[3] and possibly b[4]

Here is the space usage of different values:

  • [0—176] are stored in 1 byte (as-is)
  • [177—16560] are stored in 2 bytes, with the highest 6 bits added to 177
  • [16561—540848] are stored in 3 bytes, with the highest 3 bits added to 241
  • [540849—16777215] are stored in 4 bytes, with 0 added to 249
  • [16777216—4294967295] are stored in 5 bytes, with 1 added to 249

An overflow marker will be used to signal that encoding the values would be less space-efficient than simply copying them (e.g. if all values require 5 bytes).

This format is very space-efficient: it packs 0-176 into a single byte, as opposed to 0-127 (most others). At the same time, it can be decoded very quickly, as only the first byte needs to be compared to decode a value (similar to PrefixVarint).
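Here is a minimal decoding sketch derived from the ranges listed above. The additive offsets (177 and 16561) follow from the value ranges; the little-endian order of the payload bytes is an assumption on my part, and the overflow marker is simply rejected. decodeVarByte is a hypothetical name, not a TurboPFor function.

```go
package main

import (
	"errors"
	"fmt"
)

var errShort = errors.New("input too short")

// decodeVarByte decodes one value and reports how many bytes it used.
// Only the first byte is inspected to select the case.
func decodeVarByte(b []byte) (value uint32, n int, err error) {
	if len(b) == 0 {
		return 0, 0, errShort
	}
	b0 := uint32(b[0])
	switch {
	case b0 <= 176: // the value is b[0]
		return b0, 1, nil
	case b0 <= 240: // 6 high bits in b[0], 8 low bits in b[1]
		if len(b) < 2 {
			return 0, 0, errShort
		}
		return 177 + ((b0-177)<<8 | uint32(b[1])), 2, nil
	case b0 <= 248: // 3 high bits in b[0], 16 low bits in b[1], b[2]
		if len(b) < 3 {
			return 0, 0, errShort
		}
		return 16561 + ((b0-241)<<16 | uint32(b[2])<<8 | uint32(b[1])), 3, nil
	case b0 <= 250: // 3 (b0 = 249) or 4 (b0 = 250) payload bytes
		size := 3 + int(b0-249)
		if len(b) < 1+size {
			return 0, 0, errShort
		}
		var v uint32
		for i := 0; i < size; i++ {
			v |= uint32(b[1+i]) << (8 * uint(i))
		}
		return v, 1 + size, nil
	default:
		return 0, 0, errors.New("unsupported first byte (such as an overflow marker)")
	}
}

func main() {
	for _, in := range [][]byte{{176}, {240, 0xff}, {248, 0xff, 0xff}, {250, 0, 0, 0, 1}} {
		v, n, err := decodeVarByte(in)
		fmt.Println(v, n, err)
	}
}
```

The boundary values used in main correspond to the upper ends of the ranges in the space usage list.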

Decoding: bitpacking

Regular bitpacking

In regular (non-SIMD) bitpacking, integers are stored on disk one after the other, padded to a full byte, as a byte is the smallest addressable unit when reading data from disk. For example, if you bitpack only one 3 bit int, you will end up with 5 bits of padding.
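The padding arithmetic can be written down as a short sketch (packedSize is a hypothetical helper, assuming the padding is applied once at the end of the packed run):

```go
package main

import "fmt"

// packedSize returns how many bytes n integers of the given bit width
// occupy when packed back to back, and how many padding bits are needed
// to fill up the last byte.
func packedSize(n, width int) (numBytes, paddingBits int) {
	bitsUsed := n * width
	numBytes = (bitsUsed + 7) / 8
	return numBytes, numBytes*8 - bitsUsed
}

func main() {
	fmt.Println(packedSize(1, 3)) // a single 3 bit int
	fmt.Println(packedSize(8, 3)) // eight 3 bit ints fill whole bytes
}
```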

SIMD bitpacking (256v32)

SIMD bitpacking works like regular bitpacking, but processes 8 uint32 little-endian values at the same time, leveraging the AVX instruction set. The following illustration shows the order in which 3-bit integers are decoded from disk:

In Practice

For a Debian Code Search index, 85% of posting lists are short enough to only consist of a trailing block, i.e. no SIMD instructions can be used for decoding.

The distribution of block types looks as follows:

  • 72% bitpacking with exceptions (bitmap)
  • 19% bitpacking with exceptions (variable byte)
  • 5% constant
  • 4% bitpacking

Constant blocks are mostly used for posting lists with just one entry.


The TurboPFor on-disk format is very flexible: with its 4 different kinds of blocks, chances are high that a very efficient encoding will be used for most integer series.

Of course, the flip side of covering so many cases is complexity: the format and implementation take quite a bit of time to understand — hopefully this article helps a little! For environments where the C TurboPFor implementation cannot be used, smaller algorithms might be simpler to implement.

That said, if you can use the TurboPFor implementation, you will benefit from a highly optimized SIMD code base, which will most likely be an improvement over what you’re currently using.

05. February 2019 08:18:07

04. February 2019

sECuREs Webseite


I have heard a number of times that sbuild is too hard to get started with, and hence people don’t use it.

To lower the hurdles to using and contributing to Debian, I wanted to make sbuild easier to set up.

sbuild ≥ 0.74.0 provides a Debian package called sbuild-debian-developer-setup. Once installed, run the sbuild-debian-developer-setup(1) command to create a chroot suitable for building packages for Debian unstable.

On a system without any sbuild/schroot bits installed, a transcript of the full setup looks like this:

% sudo apt install -t unstable sbuild-debian-developer-setup
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libsbuild-perl sbuild schroot
Suggested packages:
  deborphan btrfs-tools aufs-tools | unionfs-fuse qemu-user-static
Recommended packages:
  exim4 | mail-transport-agent autopkgtest
The following NEW packages will be installed:
  libsbuild-perl sbuild sbuild-debian-developer-setup schroot
0 upgraded, 4 newly installed, 0 to remove and 1454 not upgraded.
Need to get 1.106 kB of archives.
After this operation, 3.556 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://localhost:3142/ unstable/main amd64 libsbuild-perl all 0.74.0-1 [129 kB]
Get:2 http://localhost:3142/ unstable/main amd64 sbuild all 0.74.0-1 [142 kB]
Get:3 http://localhost:3142/ testing/main amd64 schroot amd64 1.6.10-4 [772 kB]
Get:4 http://localhost:3142/ unstable/main amd64 sbuild-debian-developer-setup all 0.74.0-1 [62,6 kB]
Fetched 1.106 kB in 0s (5.036 kB/s)
Selecting previously unselected package libsbuild-perl.
(Reading database ... 276684 files and directories currently installed.)
Preparing to unpack .../libsbuild-perl_0.74.0-1_all.deb ...
Unpacking libsbuild-perl (0.74.0-1) ...
Selecting previously unselected package sbuild.
Preparing to unpack .../sbuild_0.74.0-1_all.deb ...
Unpacking sbuild (0.74.0-1) ...
Selecting previously unselected package schroot.
Preparing to unpack .../schroot_1.6.10-4_amd64.deb ...
Unpacking schroot (1.6.10-4) ...
Selecting previously unselected package sbuild-debian-developer-setup.
Preparing to unpack .../sbuild-debian-developer-setup_0.74.0-1_all.deb ...
Unpacking sbuild-debian-developer-setup (0.74.0-1) ...
Processing triggers for systemd (236-1) ...
Setting up schroot (1.6.10-4) ...
Created symlink /etc/systemd/system/ → /lib/systemd/system/schroot.service.
Setting up libsbuild-perl (0.74.0-1) ...
Processing triggers for man-db ( ...
Setting up sbuild (0.74.0-1) ...
Setting up sbuild-debian-developer-setup (0.74.0-1) ...
Processing triggers for systemd (236-1) ...

% sudo sbuild-debian-developer-setup
The user `michael' is already a member of `sbuild'.
I: SUITE: unstable
I: TARGET: /srv/chroot/unstable-amd64-sbuild
I: MIRROR: http://localhost:3142/
I: Running debootstrap --arch=amd64 --variant=buildd --verbose --include=fakeroot,build-essential,eatmydata --components=main --resolve-deps unstable /srv/chroot/unstable-amd64-sbuild http://localhost:3142/
I: Retrieving InRelease 
I: Checking Release signature
I: Valid Release signature (key id 126C0D24BD8A2942CC7DF8AC7638D0442B90D010)
I: Retrieving Packages 
I: Validating Packages 
I: Found packages in base already in required: apt 
I: Resolving dependencies of required packages...
I: Successfully set up unstable chroot.
I: Run "sbuild-adduser" to add new sbuild users.
ln -s /usr/share/doc/sbuild/examples/sbuild-update-all /etc/cron.daily/sbuild-debian-developer-setup-update-all
Now run `newgrp sbuild', or log out and log in again.

% newgrp sbuild

% sbuild -d unstable hello
sbuild (Debian sbuild) 0.74.0 (14 Mar 2018) on x1

| hello (amd64)                                Mon, 19 Mar 2018 07:46:14 +0000 |

Package: hello
Distribution: unstable
Machine Architecture: amd64
Host Architecture: amd64
Build Architecture: amd64
Build Type: binary

I hope you’ll find this useful.

04. February 2019 18:08:45

dput usability changes

dput-ng ≥ 1.16 contains two usability changes which make uploading easier:

  1. When no arguments are specified, dput-ng auto-selects the most recent .changes file (with confirmation).
  2. Instead of erroring out when detecting an unsigned .changes file, debsign(1) is invoked to sign the .changes file before proceeding.

With these changes, after building a package, you just need to type dput (in the correct directory of course) to sign and upload it.

04. February 2019 18:08:45

pristine-tar considered harmful

If you want to follow along at home, clone this repository:

% GBP_CONF_FILES=:debian/gbp.conf gbp clone

Now, in the golang-github-go-macaron-inject directory, I’m aware of three ways to obtain an orig tarball (please correct me if there are more):

  1. Run gbp buildpackage, creating an orig tarball from git (upstream/0.0_git20160627.0.d8a0b86)
    The resulting sha1sum is d085a04b7b35856be24f8cc4a9a6d9799cdb59b4.
  2. Run pristine-tar checkout
    The resulting sha1sum is d51575c0b00db5fe2bbf8eea65bc7c4f767ee954.
  3. Run origtargz
    The resulting sha1sum is d51575c0b00db5fe2bbf8eea65bc7c4f767ee954.

Have a look at the archive’s golang-github-go-macaron-inject_0.0~git20160627.0.d8a0b86-2.dsc, however: the file entry for the orig tarball reads:

f5d5941c7b77e8941498910b64542f3db6daa3c2 7688 golang-github-go-macaron-inject_0.0~git20160627.0.d8a0b86.orig.tar.xz

So, why did we get a different tarball? Let’s go through the methods:

  1. The uploader must not have used gbp buildpackage to create their tarball. Perhaps they imported from a tarball created by dh-make-golang, or created manually, and then left that tarball in place (which is a perfectly fine, normal workflow).
  2. I’m not entirely sure why pristine-tar resulted in a different tarball than what’s in the archive. I think the most likely theory is that the uploader had to go back and modify the tarball, but forgot to update (or made a mistake while updating) the pristine-tar branch.
  3. origtargz, when it detects pristine-tar data, uses pristine-tar, hence the same tarball as ②.

Had we not used pristine-tar for this repository at all, origtargz would have pulled the correct tarball from the archive.

The above anecdote illustrates the fragility of the pristine-tar approach. In my experience from the pkg-go team, when the pristine-tar branch doesn’t contain outright incorrect data, it is often outdated. Even when everything is working correctly, a number of packagers are disgruntled about the extra work/mental complexity.

In the pkg-go team, we have (independently of this specific anecdote) collectively decided to have the upstream branch track the upstream remote’s master (or similar) branch directly, and get rid of pristine-tar in our repositories. This should result in methods ① and ③ working correctly.

In conclusion, my recommendation for any repository is: don’t bother with pristine-tar. Instead, configure origtargz as a git-buildpackage postclone hook in your ~/.gbp.conf to always work with archive orig tarballs:

[clone]
# Ensure the correct orig tarball is present.
postclone = origtargz

[buildpackage]
# Pick up the orig tarballs created by the origtargz postclone hook.
tarball-dir = ..

04. February 2019 18:08:45

Debian buster on the Raspberry Pi 3 (update)

I previously wrote about my Debian buster preview image for the Raspberry Pi 3.

Now, I’m publishing an updated version, containing the following changes:

  • WiFi works out of the box. Use e.g. ip link set dev wlan0 up, and iwlist wlan0 scan.
  • Kernel boot messages are now displayed on an attached monitor (if any), not just on the serial console.
  • Root file system resizing will now not touch the partition table if the user modified it.
  • The image is now compressed using xz, reducing its size to 170M.

As before, the image is built with vmdb2, the successor to vmdebootstrap. The input files are available at

Note that Bluetooth is still untested (see wiki:RaspberryPi3 for details).

Given that Bluetooth is the only known issue, I’d like to work towards getting this image built and provided on official Debian infrastructure. If you know how to make this happen, please send me an email. Thanks!

As a preview version (i.e. unofficial, unsupported, etc.) until that’s done, I built and uploaded the resulting image. Find it at

To install the image, insert the SD card into your computer (I’m assuming it’s available as /dev/sdb) and copy the image onto it:

$ wget
$ xzcat 2018-01-08-raspberry-pi-3-buster-PREVIEW.img.xz | dd of=/dev/sdb bs=64k oflag=dsync status=progress

If resolving client-supplied DHCP hostnames works in your network, you should be able to log into the Raspberry Pi 3 using SSH after booting it:

$ ssh root@rpi3
# Password is “raspberry”

04. February 2019 18:08:45

26. January 2019

sECuREs Webseite

Network setup for our retro computing event RGB2Rv18

Our computer association NoName e.V. organizes a retro computing event called RGB2R every year, located in Heidelberg, Germany. This year’s version is called RGB2Rv18.

This article describes the network setup I created for this year’s event. If you haven’t read it, the article about last year’s RGB2Rv17 network is also available.


As a reminder, the venue’s DSL connection tops out at a megabit or two, so we used my parents’ 400 Mbit/s cable internet line, like last year.

One difference from last year is that we switched from the tp-link CPE510 devices to a pair of Ubiquiti airFiber24. The airFibers are specified to reach 1.4 Gbit/s. In practice, we reached approximately 700 Mbit/s displayed capacity (at a signal strength of ≈-60 dBm) and 422 Mbit/s end-to-end download speed, limited by the cable uplink.

Notably, using a single pair of radios removes a bunch of complexity from the network setup as we no longer need to load-balance over two separate uplinks.

Like last year, the edge router for the event venue was a PC Engines apu2c4. For the Local Area Network (LAN) within the venue, we provided a few switches and WiFi using Ubiquiti Networks access points.

WiFi setup

It turns out that the 24 GHz-based airFiber radios are much harder to align than the 5 GHz-based tp-links we used last year. With the tp-link devices, we were able to easily obtain a link, and do maybe 45 minutes of fine tuning to achieve maximum bandwidth.

With the airFiber radios mounted in the same location, we were unable to establish a link even once in about 1.5 hours of trying. We think this was due to trees/branches being in the way, so we decided to scout the property for a better radio location with as much of a direct line of sight as possible.

We eventually found a better location on the other side of the house and managed to establish a link. It still took us an hour or so of fine tuning to move the link from weak (≈-80 dBm) to okay (≈-60 dBm).

After the first night, in which it rained for a while, the radios had lost their link. We think that this might be due to the humidity, and managed to restore the link after another 30 minutes of re-adjustment.

It also rained the second night, but this time, the link stayed up. During rain, signal strength dropped from ≈-60 dBm to ≈-72 dBm, but that still resulted in ≈500 Mbit/s of WiFi capacity, sufficient to max out our uplink.

For next year, it would be great to use an antenna alignment tool of sorts to cut down on setup time. Alternatively, we could switch to more forgiving radios which also handle 500 Mbps+. Let me know if you have any suggestions!


In May this year, I wrote router7, a pure-Go small home internet router. Mostly out of curiosity, we gave it a shot, and I’m happy to announce that router7 ran the event without any trouble.

In preparation, I implemented TCP MSS clamping and included the WireGuard kernel module.

I largely followed the router7 installation instructions. To be specific, here is the Makefile I used for creating the router7 image:

# without dhcp6,
# as the cable uplink does not provide IPv6:
PKGS := \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \

ifndef DIR
	@echo variable DIR unset
	GOARCH=amd64 gokr-packer \, \ \ \
		-overwrite_boot=${DIR}/boot.img \
		-overwrite_root=${DIR}/root.img \
		-overwrite_mbr=${DIR}/mbr.img \
		-serial_console=ttyS0,115200n8 \
		-hostname=rgb2router \

After preparing an interfaces.json configuration file and a breakglass SSH hostkey, I used rtr7-recover to net-install the image onto the apu2c4. For subsequent updates, I used rtr7-safe-update.

The Go packages under are wrappers which run software I installed to the permanent partition mounted at /perm. See gokrazy: Prototyping for more details.

Tunnel setup

Last year, we used a Foo-over-UDP tunnel after noticing that we didn’t get enough bandwidth with OpenVPN. This year, after hearing much about it, we successfully used WireGuard.

I found WireGuard to be more performant than OpenVPN, and easier to set up than either OpenVPN or Foo-over-UDP.

The one wrinkle is that its wire protocol is not yet frozen, and its kernel module is not yet included in Linux.

Traffic shaping

With asymmetric internet connections, such as the 400/20 cable connection we’re using, it’s necessary to shape traffic such that the upstream is never entirely saturated, otherwise the TCP ACK packets won’t reach their destination in time to saturate the downstream.

While the FritzBox might already provide traffic shaping, we wanted to voluntarily restrict our upstream usage to leave some headroom for my parents.

rgb2router# tc qdisc replace dev uplink0 root tbf \
  rate 16mbit \
  latency 50ms \
  burst 4000

The specified latency value is a best guess, and the burst value is derived from the kernel internal timer frequency (CONFIG_HZ) (!), packet size and rate as per

Tip: remember to temporarily disable shaping when you’re doing bandwidth tests ;-).
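To illustrate how the timer frequency, packet size and rate interact, here is a small sketch of the rule of thumb from the tc-tbf(8) man page: the bucket must hold at least the tokens that arrive during one timer tick (rate divided by HZ). Taking the larger of that value and one full-size packet is my own simplification, and the HZ and MTU values in main are example inputs, not necessarily those of the router.

```go
package main

import "fmt"

// minBurstBytes returns a lower bound for the tbf burst parameter: the
// bytes that arrive per timer tick at the given rate, but never less
// than one full-size packet (mtu).
func minBurstBytes(rateBitsPerSec, hz, mtu int) int {
	perTick := rateBitsPerSec / 8 / hz
	if perTick < mtu {
		return mtu
	}
	return perTick
}

func main() {
	const rate = 16000000 // 16mbit, as in the tc command above
	for _, hz := range []int{100, 250, 1000} {
		fmt.Printf("HZ=%d: burst >= %d bytes\n", hz, minBurstBytes(rate, hz, 1500))
	}
}
```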


  • We peaked at 59 active DHCP leases, which is very similar to the “about 60” last year.

  • DNS traffic peaked at about 25 queries/second, while mostly remaining at less than 5 queries/second.

  • We were able to obtain peaks of nearly 200 Mbit/s of download traffic and transferred over 200 GB of data, twice as much as last year.

26. January 2019 20:47:38

kinX: USB Hub

This post is part of a series of posts about the kinX project.


The Kinesis Advantage comes with a built-in 2-port USB hub. That hub uses a proprietary connector to interface with a PS/2 keyboard controller, so it cannot be used with a USB keyboard controller. As the built-in hub is the natural place to connect the Logitech unified receiver dongle, not being able to use the hub is mildly annoying.

The kinX MK66F keyboard controller presently needs two USB cables: one connected to the USBFS port to supply the PCB with power and receive firmware updates (via the Teensy bootloader chip), and one connected to the USBHS port for the actual keyboard device.

Lastly, even if the original built-in USB hub had internal ports (instead of a PS/2 converter), it only supports USB 1.1, nullifying any latency improvements.

Hence, I decided to build a drop-in replacement USB 2.0 hub with 2 external USB ports and 2 internal USB ports, using the same proprietary connector as the original, so that the original keyboard USB cable could be re-used.

Design phase

Unfortunately, I could not find an open hardware USB 2.0 hub design on the internet, so I started researching various USB hub chips. I quickly discarded the idea of using USB 3 due to its much stricter requirements.

In the end, I decided to go with the Cypress HX2VL series because of their superior documentation: I found a detailed data sheet, an evaluation board, the associated schematics, design checklist/guidelines, and even the evaluation board’s bill of materials.

This is what the finished build of my design looks like:


After completing my first build, I tested a few USB devices with my hub. The Logitech unified receiver dongle and the YubiKey worked fine. However, my external hard drive and my USB memory stick did not work. In the syslog, I would see:

kernel: usb 1-14.4.4: rejected 1 configuration due to insufficient available bus power

This is because the USB specification limits bus-powered hubs to 100mA per port. While high power usage does not come as a surprise for the external hard disk, it turns out that even my USB memory stick requires 200mA. This was a surprise, because that stick works on other, commercial bus-powered USB hubs.

A closer look reveals that all 3 commercial USB hubs I have tested claim to be self-powered (i.e. using an external power supply), even though they are not. This way, the kernel’s power limitation is circumvented, and up to 500mA can be used per port. In practice, the host port only supplies 500mA, so the user must be careful not to plug in devices which require more than 500mA in aggregate.

I changed the SELFPWR configuration pin to have my hub claim it was self-powered, too, and that made all USB devices I tested work fine.
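The arithmetic behind that rejection can be sketched in a few lines of Go. This only illustrates the per-port budget described above and is not the kernel's actual code; the function and constant names are made up for this example.

```go
package main

import "fmt"

// Per-port current budgets in mA, as described in the text above.
const (
	busPoweredPortMA  = 100
	selfPoweredPortMA = 500
)

// portBudgetMA returns how much current a hub may hand out per port,
// depending on whether it reports itself as self-powered.
func portBudgetMA(selfPowered bool) int {
	if selfPowered {
		return selfPoweredPortMA
	}
	return busPoweredPortMA
}

// configAccepted reports whether a device configuration requesting
// deviceMA fits into the per-port budget of the hub.
func configAccepted(deviceMA int, hubSelfPowered bool) bool {
	return deviceMA <= portBudgetMA(hubSelfPowered)
}

func main() {
	// The 200mA memory stick behind a hub that reports bus power.
	fmt.Println(configAccepted(200, false))
	// The same stick once the hub claims to be self-powered.
	fmt.Println(configAccepted(200, true))
}
```

Note that this check is per port only; the 500mA that the host port supplies in aggregate is not modelled here.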

EEPROM programming

When debugging the power issue, I originally thought the Maximum Power setting in the hub’s USB device descriptor needed to be raised. This turned out to not be correct: the Maximum Power refers to the power which the hub uses for its own circuitry, not the power it passes through to connected devices.

Nevertheless, it’s a nice touch to modify the device descriptor to put in a custom vendor name, product name and serial number: that way, the device shows up with a recognizable name in your syslog or lsusb(8) output, and udev rules can be used to apply settings based on the serial number.

To modify the device descriptor, an EEPROM (electrically erasable programmable read-only memory) needs to be added to the design, from which the HX2VL will read configuration.

The HX2VL allows field-programming of the connected EEPROM, i.e. writing to it via the USB hub chip. I found the Windows-only tool hard to set up on a modern Windows installation, so I wondered whether I could build a tool that is simpler to use.

Under the covers, the tool merely sends commands with the vendor-specific request code 14 via USB, specifying an index of the two-byte word to read/write. This can be replicated in a few lines of Go:

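dev, _ := usb.OpenDeviceWithVIDPID(0x04b4, 0x6570)
eepromRequest := 14
wIndex := 0 // [0, 63] for 128 bytes of EEPROM
// Vendor-specific IN control transfer (assumes a gousb-style Device.Control).
dev.Control(gousb.ControlIn|gousb.ControlVendor,
  eepromRequest, 0, wIndex, make([]byte, 2))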

The EEPROM contents are well-described in the HX2VL data sheet, so the rest is easy.
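As a rough sketch of what "the rest" might look like, the following Go program reads all 64 two-byte words into one 128-byte image. The `wordReader` interface and the `fakeEEPROM` type are hypothetical stand-ins for the vendor-specific USB request, so that the loop can run without hardware; the order of the two bytes within a word is passed through unchanged and would need to be checked against the data sheet.

```go
package main

import (
	"errors"
	"fmt"
)

// eepromWords is the number of two-byte words (128 bytes in total).
const eepromWords = 64

// wordReader is a hypothetical abstraction over the vendor-specific
// request: it returns the two bytes stored at the given word index.
type wordReader interface {
	ReadWord(index uint16) ([2]byte, error)
}

// dumpEEPROM reads every word in order and concatenates the results.
func dumpEEPROM(r wordReader) ([]byte, error) {
	out := make([]byte, 0, 2*eepromWords)
	for i := uint16(0); i < eepromWords; i++ {
		w, err := r.ReadWord(i)
		if err != nil {
			return nil, fmt.Errorf("reading word %d: %w", i, err)
		}
		out = append(out, w[0], w[1])
	}
	return out, nil
}

// fakeEEPROM stands in for real hardware in this example.
type fakeEEPROM [2 * eepromWords]byte

func (f *fakeEEPROM) ReadWord(index uint16) ([2]byte, error) {
	if index >= eepromWords {
		return [2]byte{}, errors.New("word index out of range")
	}
	return [2]byte{f[2*index], f[2*index+1]}, nil
}

func main() {
	var f fakeEEPROM
	for i := range f {
		f[i] = byte(i)
	}
	data, err := dumpEEPROM(&f)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data), data[:4])
}
```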

See for the tool.

Lessons learnt

  • If possible, design the PCB in such a way that components you think you don’t need (e.g. the EEPROM) can optionally be soldered on. This would have saved me a PCB design/fabrication cycle.

  • Get the evaluation board to figure out the configuration you need (e.g. self-powered vs. bus-powered).

Next up

The last post introduces the processing latency measurement firmware for the FRDM-K66F development board and draws a conclusion.

26. January 2019 20:47:38


All we hear is … Radio Regenbogen

Attention! Attention! This is the Käfertal transmitter!

Listen up, dear friends of heavenly radio broadcasts!

Fans of analogue amplitude modulation, pay attention!

Have you, too, been wondering since 1980 whether video really killed the radio star?

Whether you can also listen to broadcasts ("Rundfunk", literally "round radio") on square radios?

Then the next RZL trip to Radio Regenbogen is exactly right for you!

On Thursday, 28 February 2019, from 4 pm, we will take a tour of the studio in Mannheim. We will have the opportunity to visit the music editorial department, the broadcast studios and the audio engineering, and to ask all the questions we are burning to ask.

As this excursion is limited to 15 places, please register by e-mail by 14 February 2019. As always, (not-yet) members are welcome to take part as well.

Greetings from

the Flederradio

P.S.: And please don't forget to ground the antenna!

by flederrattie at 26. January 2019 00:00:00

08. January 2019

PostDoc position at CERMICS / Sorbonne Université

Last week I started my new position as a postdoctoral fellow at the MATHERIALS team of the Centre d'Enseignement et de Recherche en Mathématiques et Calcul Scientifique (CERMICS) laboratory and Inria Paris with an additional affiliation to the Institut des Sciences du Calcul et des Données at Sorbonne Université. As can already be guessed from the names of the aforementioned labs, each covers research topics in an interdisciplinary research field. At CERMICS, for example, applied mathematicians are working on topics related to the molecular sciences, optimisation or finance.

Along this line my new direction of research stays interdisciplinary as well: Using the programming language julia we would like to provide a highly modular and flexible plane-wave density-functional theory code for the modelling of solid-state materials. The main aim of the project is to allow for novel mathematical approaches in this context to be readily implemented and tested on a scale going beyond simple toy problems. In order to achieve this, the code needs to stay simple and accessible, but still perform decently. In our opinion this challenge can be tackled rather well using julia and its rapidly growing number of modules. As mentioned previously I like the modern concepts of the julia language very much and thus I am excited about this opportunity to try it for coding a complete framework for modelling a scientific problem.

Right now I know rather little about the topic of modelling solid-state materials, but fortunately I am not alone on this project. Rather I will be working jointly with Antoine Levitt, Eric Cancès, Laura Grigori and Eleonora Luppi from either CERMICS, Inria or Sorbonne Université on this project.

by Michael F. Herbst at 08. January 2019 23:00:00

03. January 2019


Crêpiphanie feast – the start-of-year kick-off motivational Holy-Three-Waffles breakfast

Dear RaumZeitLabor folks,

another 12 eventful months have passed.

To plan the new year 2019 a little and to ease the post-Congress blues, we want to celebrate a Crêpiphanie feast together on Epiphany. From the true gifts of the Three Wise Men (milk, eggs and flour, of course; what else?) we will bake thin pancakes and thick waffles, and along the way we can talk about our good hackerspace resolutions for the new year. What was your best holiday experience at 35C3, and what motivation did you take home? Which activities, projects or parties are you planning to realise at the RZL? Do you have wishes or ideas for excursions? Who is going to the CCCamp, and are there perhaps already ideas for an assembly?

Will you join the pancake baking on Sunday, 6 January 2019, from 11 am? Please send a short reply by e-mail by Saturday noon, so that enough sweet and savoury toppings are available. We ask for a contribution of 5 to 10 euros to the pancake fund.

Blessed greetings from your flederwaffel

by flederrattie at 03. January 2019 00:00:00

28. December 2018

SelfConsistentField.jl: A julia toolbox for coding self-consistent field algorithms

Over the past half a year I became interested in the julia programming language. It is a rather recent language (version 1.0.0 was released only last August), which promises to provide a modern approach to scientific computing. From my experience so far, I like julia a lot. All aspects of the code, be it parallelisation, vectorisation, macros, generic code and so on, are consistently designed and fit together very well. In contrast to e.g. python, it is thus not required to employ a specialised external package with its associated way of doing the computation in parallel for a large-scale problem. In julia parallelisation is usually automatic or can be added with little manual effort. Whilst julia code might not necessarily beat C++ code with respect to performance in all cases, it certainly is written much faster.

To test julia in a larger setting than just a toy project, I decided to work towards a julia toolbox for coding and experimenting with self-consistent field algorithms. Over the summer Tobias Wackenhut participated in the development during his internship with us in Heidelberg. After about two solid months of coding, mostly from his end, a first version with support for solving Hartree-Fock problems has now been implemented.

For further details, the project code and some examples, see the SelfConsistentField.jl project page on github.

by Michael F. Herbst at 28. December 2018 23:00:00

23. December 2018

Design of the molsturm quantum-chemistry framework

A few weeks ago, on 23rd November, I was invited to speak at the mathematics and chemistry seminar (GdT mathématiques et chimie) at Sorbonne Université in Paris. This seminar is a joint meeting of theoretical chemists and mathematicians from the LJLL (Laboratoire Jacques-Louis Lions) and the LCT (Laboratoire de Chimie Théorique) of Sorbonne Université and the CERMICS laboratory of Inria Paris and ENPC (Ecole des Ponts Paris Tech). Such a mixed audience provided a good opportunity to present our recent work on the molsturm quantum chemistry framework and our recent Coulomb-Sturmian convergence results. Both the feedback and the discussion accompanying the talk were fruitful and some ideas on how to provide a more mathematical backbone for our empirical observations emerged.

My travels to Paris were supported by the DAAD, which I gratefully acknowledge. The slides of my talk are attached below:

Link Licence
Slides Design of the molsturm quantum-chemistry framework Creative Commons License

by Michael F. Herbst at 23. December 2018 17:00:00

15. November 2018

Quantum chemistry with Coulomb Sturmians: Construction and convergence of Coulomb Sturmian basis sets at Hartree-Fock level

This week we submitted a paper related to our ongoing work regarding the use of Coulomb Sturmian basis functions in electronic structure theory. Given the recent progress made with regard to the evaluation of molecular integrals based on these exponential-type functions, molecular calculations based on Coulomb Sturmians are within reach. This is a very promising prospect due to the ability of the Coulomb Sturmians to represent the features of the wave function both at the nucleus and at large distances.

Our work takes a first look at the convergence of Coulomb Sturmian discretisations in electronic structure theory, expanding on ideas suggested in my PhD thesis. We suggest a simple construction scheme for Coulomb Sturmian basis sets and discuss its properties. A key aspect is to connect the basis set parameters to physical features of the wave function or other chemically intuitive quantities such as the exponents obtained by Clementi and Raimondi, i.e. the effective nuclear charge. The abstract of the paper reads

The first discussion of basis sets consisting of exponentially decaying Coulomb Sturmian functions for modelling electronic structures is presented. The proposed basis set construction selects Coulomb Sturmian functions using separate upper limits to their principal, angular momentum and magnetic quantum numbers. Their common Coulomb Sturmian exponent is taken as a fourth parameter. The convergence properties of such basis sets are investigated for second and third row atoms at the Hartree-Fock level. Thereby important relations between the values of the basis set parameters and the physical properties of the electronic structure are recognised. For example, an unusually large limit for the angular momentum quantum number in unrestricted Hartree-Fock calculations can be linked to the breaking of spherical symmetry in such cases. Furthermore, a connection between the optimal, i.e. minimum-energy, Coulomb Sturmian exponent and the average Slater exponents values obtained by Clementi and Raimondi (E. Clementi and D. L. Raimondi, J. Chem. Phys. 38, 2686 (1963)) is made. These features of Coulomb Sturmian basis sets emphasise their ability to correctly reproduce the physical features of Hartree-Fock wave functions.

by Michael F. Herbst at 15. November 2018 10:00:00

18. October 2018

Mero’s Blog

Using roughtime as a "cryptographic notary"

tl;dr: Roughtime can be (ab)used for Trusted Timestamping. I wrote a simple tool as a PoC

Recently, Cloudflare announced that they are now running a roughtime server. Roughtime is a cryptographically secured time-synchronization protocol - think NTP with signatures. For an actual description of how it works, I recommend reading the Cloudflare blog post. But at its very core (oversimplification ahead), the user chooses an arbitrary (usually randomly generated) nonce and the server signs it, plus the current time.

One thing roughtime adds on top of this, is the ability to build a chain of requests. This is achieved by taking a hash of a response, combining it with a randomly generated "blind" and using the combined hash as a nonce to the next request. The intended use-case of this is that if a server provides the wrong time or otherwise misbehaves, you can obtain cryptographic proof of that fact by getting a timestamped signature of its response from a different server. By storing the initial nonce, generated blinds and responses, the entire chain can be validated later.

When I saw Cloudflare's announcement, my first thought was that it should be possible to use a roughtime server as a Time Stamping Authority. The goal is to obtain a cryptographic proof that you owned a particular document at the current point in time - for example to ensure you can prove original authorship without publishing the document itself.

The simplest way to achieve this using roughtime is to use the SHA512 hash of the file as an initial nonce. That way, the roughtime server signs that hash together with the current time with their private key. By using the roughtime chain protocol, you can get that proof corroborated by multiple servers.

You can also think of extending this, to get stronger properties. Using the hash of the file as a nonce only proves that the file existed before that specific point in time. It also doesn't actually prove that you had the file, but only the hash. This can be remediated though. If we run a regular roughtime request, the resulting response is unpredictable (to us) and signs the current time. Thus, if we use a hash of that response as a prefix "salt" of the file itself, the resulting hash proves that we knew the file after that chain ran. We can then use that hash as a nonce for another roughtime chain and get a proof that we had the file at a specific point (or rather, a small interval) in time. Furthermore, we can opt to use the file-hash not as the nonce itself, but as a blind. The advantage is that the blind is never transmitted over the network, so the actual proof is only available to us (if we use it as a nonce, an eavesdropper could intercept the proof). I illustrated these options in a recent talk I gave on the subject.

These ideas are mostly academic. I'm not sure how useful these properties are in practice. Nevertheless, the idea intrigued me enough to implement it in a simple tool. It's in a pretty rough, proof-of-concept like shape and I don't know if I will ever progress it from there. It also comes with a client implementation of the roughtime protocol in Go - initially I was not aware that there already was a Go implementation, but that also is not go-gettable. Either way, it was fun to implement it myself :)

18. October 2018 23:22:40

02. October 2018


Ontbyte – Dutch late breakfast at the RZL

(Image: Ontbyte art)

Vla'ntastic news for all late-breakfasters and frikandel fans!

On Sunday, 14 October 2018, it is Specu'loos time! From 11:30 am we will meet for a shared Ontbyte at the RaumZeitLabor.

This time, under the motto "Programmeren en Degusteren", we want to pay homage to our Dutch neighbours with all kinds of culinary specialities.

Whether you want to start the day with hagelslag or rather have fun with pindakaas on your morning to-do list: there will be something for everyone!

Want to eat with us? No problem. So that we know how many gehaktballen we need to roll, we would appreciate a reply by e-mail by 7 October 2018. We ask for a contribution of 10 euros per person to the Kaas-Kassa.

Eet smakelijk! Your Frau Antje

by flederrattie at 02. October 2018 00:00:00

30. September 2018

Simulating chemistry: Enabling novel approaches for modelling the electronic structure of molecules

Last week Wednesday I was invited to present my research in front of a bunch of young researchers from the Heidelberg Laureate Forum visiting our institute. Whilst I would have loved to attend the full week of the forum, I am happy that in this way I was able to contribute a little to this meeting and participate in discussions with other participants.

Instead of going too much into the details of my work, I decided to provide a general overview of modern quantum-chemical approaches towards modelling chemical properties and reactions. Only at the very end of my talk did I briefly touch on the work concerning our recent molsturm paper. As usual my slides are attached below:

Link Licence
Slides Simulating chemistry Creative Commons License

by Michael F. Herbst at 30. September 2018 16:00:00

18. September 2018

Franco-German Workshop 2018: Open problems related to the algebraic-diagrammatic construction scheme

In some of my recent projects, e.g. the PE-ADC paper, I got a little more into method development surrounding the algebraic-diagrammatic construction scheme (ADC). Even though quite a few people from the Dreuw group in Heidelberg are involved with ADC development and we have regular talks on ADC, I did not yet have the chance to really take a deeper look at this method myself.

Last week I was invited to the Franco-German Workshop on mathematical aspects in computational chemistry 2018, where a selection of chemists and mathematicians from France and Germany and a few other places got together to talk about their recent advances, but most importantly to discuss. From the ADC side I had already collected a few potential angles for improvement in recent months and this occasion thus seemed ideal to both brush up on my knowledge of ADC and at the same time present on ADC.

In the spirit of such a rather hands-on-type workshop, I only quickly glanced over the current status of ADC as an excited-states method. Afterwards I introduced common numerical procedures and pointed at currently open problems and challenges. From the discussions directly after the lecture as well as the remaining days of the workshop, a few interesting ideas emerged and I am already looking forward to trying them out. As per usual the slides of my talk are attached below.

Link Licence
Slides Challenges and open problems related to the algebraic diagrammatic construction scheme Creative Commons License

by Michael F. Herbst at 18. September 2018 16:00:00


Agenda-Aktion 2018 – We solder an electronic toy

In cooperation with the city of Mannheim, we again offered soldering workshops for children this year. On five dates in total, 12 children aged 8 to 12 each got to wield the soldering iron. (Image: hard-working children soldering)

This time the kit was an electronic die. This year the technology consultancy INNOQ sponsored the kits. With only a few components, the die was very easy to build. The circuit is based on an ATTiny85, displays nicely uniformly distributed numbers and, thanks to its deep-sleep mode, does not even need an on/off switch. As always, the PCB layout, source code and 3D models are available on our GitHub: the die on GitHub. (Image: the finished die kit)

All the children got to try out their self-built die right away in a round of Mensch ärgere Dich nicht. (Image: board game with electronic die)

by Pfeyffer at 18. September 2018 00:00:00

09. September 2018

MRMCD18: Pitfalls for performance: Latencies to keep in mind

This weekend I attended the MRMCD18 in Darmstadt, which is one of the many chaos events where a few hundred people interested in computer science, hacking, making, politics and society gather for a weekend of talks, coding and discussion. Having participated in this event regularly in the past few years, I was very happy that this year I was given the chance to present a lecture.

I chose to talk about lazy matrices and contraction-based algorithms. Even though this topic is discussed briefly in my PhD thesis, I usually keep it rather short in my talks. Similarly the molsturm design paper was only brief with respect to this subject. In this talk I therefore took the chance to expand on contraction-based methods a bit further, tackling the matter by taking a look at latency numbers in modern hardware and from there drawing conclusions with respect to numerical linear algebra.

Thanks to the great C3VOC team the talk was recorded live and can be watched on or downloaded below, along with my slides.

Note: Unfortunately the batteries of the microphone gave up in the middle of my talk, around 25:20. However, if you seek the video forward to minute 26:17, I switch to a handheld and audio is back.

Link Licence
Recording Pitfalls for performance: Latencies to keep in mind Creative Commons License
Slides Pitfalls for performance Creative Commons License

by Michael F. Herbst at 09. September 2018 18:00:00

05. September 2018

Mero’s Blog

Scrapping contracts

tl;dr: I describe a way to simplify the generics design. The ideas are not particularly novel and have been expressed to various degrees by other people as well. I hope to provide a more complete view of the design though.

Recently a Problem Overview and Draft Design for generics in Go have dropped. Since then, predictably, there has been a bunch of chatter on the intertubez about it. This is a summary of my thoughts, so far, on the subject - after a bunch of discussions on Twitter and Reddit.

Note: The design is called "Contracts", but I will refer to it as "the design doc" here. When I say "contracts", I will refer to the specific part of the design to express constraints.

Contracts vs. Interfaces

First, there is a common observation of overlap between generics and interfaces. To untangle that, we can say that when we use "generics", what we mean is constrained parametric polymorphism. Go already allows polymorphism by using interfaces. This design doc adds two things: One, a way to add type-parameters to functions and types. And two, a syntax to constrain those type-parameters to a subset that allows specific operations, via contracts.

The latter is where the overlap lies: Interfaces already allow you to constrain arguments to types that allow certain operations. In a way, what contracts add to this, is that those operations can not only be method calls, but also allow (and constrain) builtin operators and functions to be used and to allow or disallow certain composite types (though that mainly affects map).

Contracts allow that by the way they are specified: You write a function body (including arguments, whose notational type becomes the type-variable of the contract) containing all the statements/expressions you wish to be able to do. When instantiating a generic type/function with a given set of type-arguments, the compiler will try to substitute the corresponding type-variable in the contract body and allow the instantiation, if that body type-checks.

The cost of contracts

After talking a bit through some examples, I feel that contracts optimize for the wrong thing. The analogy I came up with is vocabulary vs. grammar.

The contracts design is appealing to a good degree, because it uses familiar syntax: You don't have to learn any new syntax or language to express your contract. Just write natural Go code and have that express your constraints for you. I call this the "grammar" of constraints: The structure that you use to input them.

On the other hand, for the user of Go, the relevant question is what constraints are possible to express and how to express them. They might be interested in deduplicating values in their algorithm, which requires equality-operations. Or they might want to do comparisons (e.g. Max), which requires >. I call this the vocabulary: What is the correct way to express the set of constraints that my algorithm needs?

The issue now is that while the grammar of constraints might be obvious, it is not always clear which semantic constraints it actually generates. A simple example is map-keys. The design doc uses the contract

contract comparable (t T) {
  t == t
}

to specify types that are valid map keys. But to a beginner, it is not immediately obvious what comparisons have to do with maps. An alternative would be

contract mapkey (t T) {
  var _ map[T]bool
}

But which is better? Similarly, these two contracts

contract mult (t T) {
  t = t * t
}

contract add (t T) {
  t = t + t
}

seem very similar, but they are, in theory at least, fundamentally different. Not only because add allows string, while mult doesn't. But also, because technically any type that supports * also supports - and /. And then there's

contract div (t T) {
  t = t % t
}

which creates another completely different set of types and allowed operators.

A third example is

contract stringlike (t T) {
  append([]byte(nil), t...)
}

This allows any type with underlying type string or []byte, but nothing else. And again, technically that would imply allowing index-operations and len. But does the compiler understand that?

Lastly, it's not really clear how len, cap, make or range would work. For example, all these contracts are superficially valid:

contract rangeable (t T) {
  for x := range t {
  }
}

contract lengthed (t T) {
  var _ int = len(t)
}

contract capped (t T) {
  var _ int = cap(t)
}

contract makeable (t T) {
  t = make(T)
}

contract makeable2 (t T) {
  t = make(T, 0)
}

But in all these cases, they allow some subset of channel, map, slice and array types, with vastly different interpretations of these operations, depending on the kind of type used - to the degree, that code using them would usually be nonsensical. Disallowing these, however, opens questions about the claim of familiar Go syntax, as we now have to make decisions what sort of expressions and statements we do or don't allow in a contract.

This is why I say contracts optimize for grammar, instead of vocabulary. The programmer is interested in the vocabulary - what does the contract actually mean and what contract should they use? But the vocabulary is obscured by the grammar - because we use Go syntax, to understand a given contract we need to know a bunch of things about what the compiler is and is not able to infer from it.

This is why I don't really buy the argument of not wanting to learn a bunch of new syntax or new identifiers for constraints: You still have to learn that vocabulary, but you express it in an obscure and unnatural grammar. I hope to show that we can introduce the power of generics while also using familiar grammar and with minimal addition of vocabulary.

Scrapping contracts

Now, I'm not the first person to suggest this, but I think we should consider scrapping contracts from the design. We can still retain type-parameters and we can still have constraints, but we express them via interfaces instead. I should point out, that - for now - I'm intentionally optimizing for simplicity of the design, at the cost of some boilerplate and some loss of power. I will later try and provide some alternatives to compensate for that in part. But there is still likely going to remain a net cost in expressiveness. Personally, I think that tradeoff is worth exploring.

The new design would retain type-parameters and most of their syntax. The difference is that type-parameters are a full argument list. The type of an argument has to be an interface type. It can be elided, in which case it defaults to the type of the following type-parameter. The last type-parameter defaults to interface{}. As a bonus, this allows providing multiple sets of constraints on one declaration:

func Map(type A, B) (s []A, f func(A) B) []B {
  var out []B
  for _, a := range s {
    out = append(out, f(a))
  }
  return out
}

func Stringify(type A fmt.Stringer) (s []A) []string {
  // Because of the signature of fmt.Stringer.String, we can infer all the
  // type-arguments here. Note that A does not *have* to be passed boxed in an
  // interface. A.String is still a valid method-expression for any fmt.Stringer.
  return Map(s, A.String)
}

We still want to be able to express multiple, interdependent parameters, which we can, via parametric interfaces:

type Graph(type Node, Edge) interface {
  Nodes(Edge) []Node
  Edges(Node) []Edge
}

func ShortestPath(type Node, Edge) (g Graph(Node, Edge), from, to Node) []Edge {
  // …
}

// Undirected Graph as an adjacency list. This could be further parameterized,
// to allow for user-defined payloads.
type AdjacencyList [][]int

func (g AdjacencyList) Nodes(edge [2]int) []int {
  return edge[:]
}

func (g AdjacencyList) Edges(node int) [][2]int {
  var out [][2]int
  for _, v := range g[node] {
    out = append(out, [2]int{node, v})
    if v != node {
      out = append(out, [2]int{v, node})
    }
  }
  return out
}

func main() {
  g := AdjacencyList{…}
  // Types could be inferred here, as the names of methods are unique, so we can
  // look at the methods Nodes and Edges of AdjacencyList to infer the
  // type-arguments.
  path := ShortestPath(g, 0, len(g)-1)
}

The last example is relevant to the difference in power between contracts and interfaces: Usage of operators. We can still express the concept, but this is where the increased boilerplate comes in:

func Max(type T)(a, b T, less func(T, T) bool) T {
  if less(a, b) {
    return b
  }
  return a
}

func main() {
  fmt.Println(Max(a, b, func(a, b int) bool { return a < b }))
}

I will try to show some ways to get rid of that boilerplate later. For now, let's just treat it as a necessary evil of this idea. Though it should be mentioned, that while this is more cumbersome, it's still just as typesafe as contracts (as opposed to, say, a reflect-based generic Max).
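For comparison, the same function can be written and run in the syntax that was later released with Go 1.18. This is a sketch in the released syntax, not in the draft syntax used in this post; it shows that the comparison has to be passed in explicitly when the type parameter is otherwise unconstrained.

```go
package main

import "fmt"

// Max returns the larger of a and b according to less. T carries no
// constraint, so the comparison is supplied by the caller.
func Max[T any](a, b T, less func(T, T) bool) T {
	if less(a, b) {
		return b
	}
	return a
}

func main() {
	x, y := 3, 7
	fmt.Println(Max(x, y, func(a, b int) bool { return a < b })) // prints 7
}
```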

So, scrapping contracts leaves us with more boilerplate, but just the same set of concepts we can express - though we do have to pass in any builtin operations we want to perform as extra functions (or express them in an interface). In exchange, we get

  • Only one way to specify constraints.
  • A simpler spec (we don't need to add a new concept, contracts, to the language) and a saved (pseudo-)keyword.
  • A simpler compiler: We don't need to add a solver to deduce constraints from a given contract. The constraint-checker already exists.
  • Still a well-known, though less powerful, language to express constraints, with interfaces.
  • Simple syntax (same as normal arglists) for having multiple sets of constraints in one declaration.
  • Trivially good error messages. Types passed in need only be checked for consistency and interface satisfaction - the latter is already implemented, including good error messages.

Getting rid of boilerplate

I see two main ways to get rid of boilerplate: Adding methods to builtin types, or what I call pseudo-interfaces.

Methods on builtin types

An obvious idea is to not use operators in generic code, but instead use method-call syntax. That is, we'd do something akin to

func Max(type T Ordered) (a, b T) T {
  if a.Less(b) {
    return b
  }
  return a
}

To actually reduce the boilerplate, we'd predefine methods for all the operators on the builtin types. That would allow us to call Max with int, for example.

Unfortunately, I can see a bunch of roadblocks to make this work. Methods are not promoted to derived types, so you couldn't use Max with e.g. time.Duration, which has underlying type int64, but is not the same type. We'd probably want those methods to be "special" in that they automatically get promoted to any type whose underlying type is predeclared. That introduces compatibility issues of clashing Method/Field names.

At the end, to express that Less has to take the same argument as the receiver type, Ordered might look something like this:

type Ordered(T) interface {
  Less(T) bool
}

func Max(type T Ordered(T)) (a, b T) T {
  if a.Less(b) {
    return b
  }
  return a
}

// In the universe block:

// Implements Ordered(int).
func (a int) Less(b int) bool {
  return a < b
}

It's not clear, though, whether a self-referential parameter like T Ordered(T) should be allowed. This would also technically allow a custom type to implement Ordered(int). While that probably won't be very useful (the majority of use cases will require T Ordered(T)), it's not excluded.
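For comparison, the same shape can be written in the generics that Go 1.18 later shipped, where a self-referential constraint is accepted. This is only a sketch: since builtin types still cannot have methods, Int is a hypothetical wrapper type, and Threshold is a made-up type illustrating the "Ordered(int) on a custom type" case.

```go
package main

import "fmt"

// Ordered is parameterized so that Less can name its argument type.
type Ordered[T any] interface {
	Less(T) bool
}

// Max requires that T can be compared with another T.
func Max[T Ordered[T]](a, b T) T {
	if a.Less(b) {
		return b
	}
	return a
}

// Int is a hypothetical wrapper; the builtin int cannot carry methods.
type Int int

func (a Int) Less(b Int) bool { return a < b }

// Threshold implements Ordered[int] without being int itself.
// It does not satisfy Ordered[Threshold], so Max rejects it.
type Threshold int

func (t Threshold) Less(b int) bool { return int(t) < b }

// Compile-time check that Threshold really implements Ordered[int].
var _ Ordered[int] = Threshold(0)

func main() {
	fmt.Println(Max(Int(3), Int(7))) // 7
	fmt.Println(Threshold(5).Less(9)) // true
}
```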


Pseudo-interfaces

Unfortunately, I didn't have a lot of time over the last couple of days, so I got beaten to the punch on this. Matt Sherman described the idea first and called the concept "typeclasses". I will stick with pseudo-interface, because it fits better into the general concept of this description.

The idea is to introduce a set of types into the language that can be used like interfaces (including embedding), but instead of providing methods, provide operators. There is a limited set of base types that need to be provided:

pseudo-interface | Allowed operators
comparable       | ==, !=
ordered          | <, <=, >, >=
boolean          | ||, &&, !
bitwise          | ^, %, &, &^, <<, >>
arith            | +, -, *, /
concat           | +
complex          | real(z), imag(z)
nilable          | v == nil

and a set of derived pseudo-interfaces:

pseudo-interface | definition
num              | interface { comparable; ordered; arith }
integral         | interface { num; bitwise }
stringy          | interface { comparable; ordered; concat; len() int }
iface            | interface { comparable; nilable }

The pseudo-interfaces would be declared in the universe block, as predeclared identifiers. This makes them backwards-compatible (as opposed to methods on builtin types), because any existing identifier would just shadow these (akin to how you can have a variable with name string).

Bitshift operators are currently restricted when used with constants that overflow the width of an integral type. For generic code, this restriction would be lifted (as the size is not statically known); instead, the behavior would be the same as if the right operand were a uint variable with the given value.
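The difference between the two behaviors can already be seen in non-generic Go. A small sketch (the helper name shl is made up): a constant shift that overflows is rejected at compile time, while the same shift with a uint variable as right operand silently discards the overflowing bits.

```go
package main

import "fmt"

// shl shifts x left by n bits. Because n is a variable, no constant
// overflow check applies; bits shifted past the width of uint8 are dropped.
func shl(x uint8, n uint) uint8 {
	return x << n
}

func main() {
	// The constant expression below does not compile
	// (the constant 1024 overflows uint8):
	//   const c = uint8(1) << 10

	fmt.Println(shl(1, 3))  // 8
	fmt.Println(shl(1, 10)) // 0
}
```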

This would allow us to write

func Max(type T ordered) (a, b T) T {
  if a < b {
    return b
  }
  return a
}
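
As a point of comparison, something close to the ordered pseudo-interface can be approximated with the constraint interfaces that Go 1.18 later shipped. This is a sketch, not the design proposed here: the type list below is a deliberately incomplete assumption, and the ~ notation is what lets derived types such as time.Duration through.

```go
package main

import (
	"fmt"
	"time"
)

// ordered is a hand-written stand-in for the "ordered" pseudo-interface.
// The list of types is illustrative and deliberately incomplete.
type ordered interface {
	~int | ~int64 | ~float64 | ~string
}

// Max can use < directly, because every type in ordered supports it.
func Max[T ordered](a, b T) T {
	if a < b {
		return b
	}
	return a
}

func main() {
	fmt.Println(Max(3, 7))                     // 7
	fmt.Println(Max("a", "b"))                 // b
	fmt.Println(Max(time.Second, time.Minute)) // 1m0s
}
```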

Notably, the list of pseudo-interfaces doesn't include anything related to channel-, slice- or map-operations (or other composite types). The idea is to instead use a type literal directly:

func Keys(type K, V) (m map[K]V) []K {
  var out []K
  for k := range m {
    out = append(out, k)
  }
  return out
}

As every type supporting, e.g., map operations needs to have underlying type map[K]V, it is assignable to that type and can be passed to Keys as is. That is, this is completely legal:

func main() {
  type MyMap map[string]int
  var m = MyMap{
    "foo": 23,
    "bar": 42,
  }
  fmt.Println(Keys(m))
}

This also solves another problem with contracts: The ambiguity of len, cap and range. As the actual kind of the value is not only known during compilation of the generic function, but even obvious from the code, there is no question about the intended semantics.
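The type-literal approach carries over almost unchanged to the generics that Go 1.18 later shipped. A runnable sketch in that syntax (the only additions are the comparable constraint that map keys require and a sort to make the output deterministic):

```go
package main

import (
	"fmt"
	"sort"
)

// Keys takes a plain map type literal; no constraint describing
// "map-like" types is needed. K must be comparable to be a map key.
func Keys[K comparable, V any](m map[K]V) []K {
	var out []K
	for k := range m {
		out = append(out, k)
	}
	return out
}

// MyMap has underlying type map[string]int, so it can be passed to Keys.
type MyMap map[string]int

func main() {
	m := MyMap{
		"foo": 23,
		"bar": 42,
	}
	ks := Keys(m)
	sort.Strings(ks) // map iteration order is random
	fmt.Println(ks)  // [bar foo]
}
```

Inside Keys, range and append unambiguously mean the map and slice operations, since the parameter is spelled out as map[K]V.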

Should Go ever grow operator overloading via operator methods, the pseudo-interfaces could be changed into actual interfaces containing the necessary methods. Of course, that implies that operator overloading would retain the properties of the existing operators, e.g. that having == implies having !=, or that having - implies having +. Personally, I consider that a good thing: it limits the abuse of operator overloading for nonsensical operations (say, << for writing to an io.Writer).

I'm not trying to advocate for operator overloading, but think it's worth mentioning that this design leaves the door open to that.

But performance

A possible criticism of either of these approaches is that operators perform better than dynamic dispatch to a method. I believe (vigorous handwaving ahead) that this is no different in the existing contracts proposal. If generic code is compiled generically, it still needs to employ some means of dynamic dispatch for operators. If, on the other hand, it is compiled once per instantiation, then the compiler would also be able to devirtualize the interfaces and then inline the method definitions.


I've previously said that I'm "meh" on the design doc, which is the strongest form of endorsement a generics proposal could ever get from me. After some discussion, I'm more and more convinced that while contracts seem conceptually simple, they create a plethora of implementation and usage questions. I'm not sure that the supposed advantage of contracts, a well-known syntax, holds up to scrutiny when it comes to mapping a contract to the constraints actually derived from it, or to writing contracts. There are also many open questions regarding contracts, a bunch of them related to the ambiguity of Go expressions. As a result, I'm starting to feel more negative towards them: they look like an elegant idea, but in practice they have a lot of weird corners.

This design is similar (AIUI) to the type functions proposal, so I assume there are good reasons the Go team does not want this. The difference is mainly the absence of operator methods in favor of pseudo-interfaces or explicit method calls. This design also handwaves a couple of important implementation questions - the justification for that is that these questions (e.g. type inference and code generation) should be able to be taken from the design doc with minimal changes. It's entirely possible that I am overlooking something, though.

05. September 2018 04:00:00

14. July 2018


Spiral galaxies, stars, amazement: a report on our astronomy tour

Last week we made our way up to the Königstuhl in Heidelberg. Besides the Haus der Astronomie and the Max-Planck-Institut für Astronomie, the hilltop is home to the Landessternwarte Königstuhl, which is part of Germany's largest university institution for astronomical research and teaching.

The first part of our astro tour took us to a telescope more than a hundred years old, housed in one of the domes of the Landessternwarte, which opened in 1898. After we had turned the telescope towards the sun by hand, the pyromaniacs among us were allowed to experiment with paper and light focused through lenses.

In the dome projection auditorium of the HdA we travelled virtually through the universe with the help of the digital planetarium system, paid Mark Watney a visit on Mars and had all our questions about astronomy and research answered.

The last part of our tour led us to the Max-Planck-Institut für Astronomie. There we could marvel at the »Sternenraum« (star room), a 3D model of a piece of the universe built by two master's students. Represented by more than 100 LEDs, stars and galaxies hang in the room on fine wires. To conclude the tour, and as a comparison to the Landessternwarte telescope, we got to admire a more modern 70 cm reflecting telescope in the east dome of the MPIA.

Our thanks for this exciting afternoon go to Stefan, our tour guide, and to the press and public relations offices of the HdA and the MPIA.

Astrotour impressions

by flederrattie at 14. July 2018 00:00:00