Planet NoName e.V.

08. January 2019

PostDoc position at CERMICS / Sorbonne Université

Last week I started my new position as a postdoctoral fellow in the MATHERIALS team of the Centre d'Enseignement et de Recherche en Mathématiques et Calcul Scientifique (CERMICS) laboratory and Inria Paris, with an additional affiliation to the Institut des Sciences du Calcul et des Données at Sorbonne Université. As the names of these labs already suggest, each covers topics in an interdisciplinary field of research. At CERMICS, for example, applied mathematicians work on topics related to the molecular sciences, optimisation or finance.

In line with this, my new research direction stays interdisciplinary as well: using the programming language julia, we would like to provide a highly modular and flexible plane-wave density-functional theory code for modelling solid-state materials. The main aim of the project is to allow novel mathematical approaches in this context to be readily implemented and tested on a scale going beyond simple toy problems. To achieve this, the code needs to stay simple and accessible, yet still perform decently --- a challenge which, in our opinion, can be tackled rather well using julia and its rapidly growing number of modules. As mentioned previously, I like the modern concepts of the julia language very much, so I am excited about this opportunity to use it for building a complete framework for modelling a scientific problem.

Right now I know rather little about modelling solid-state materials, but fortunately I am not alone: I will be working jointly with Antoine Levitt, Eric Cancès, Laura Grigori and Eleonora Luppi from CERMICS, Inria and Sorbonne Université on this project.

by Michael F. Herbst at 08. January 2019 23:00:00

03. January 2019


Crêpiphanie-Fest – Start-of-Year Kickoff Motivation Three-Holy-Waffles Breakfast

Dear RaumZeitLaborers,

another twelve eventful months have passed.

To plan the young year 2019 a little and to soften the post-congress blues, we want to celebrate a Crêpiphanie feast together on Epiphany. From the true gifts of the Three Wise Men (milk, eggs and flour, of course. What else?) we will bake thin pancakes and thick waffles, and can chat on the side about our good hackerspace resolutions for the new year. What was your nicest holiday experience at 35C3, and what motivation did you take home with you? Which activities, projects or parties are you planning to realise at the RZL? Do you have wishes or ideas for excursions? Who is going to the CCCamp, and are there perhaps already ideas for an assembly?

Will you join the pancake baking on Sunday, 6 January 2019, from 11 am? A short RSVP by mail until Saturday noon is appreciated, so that enough sweet and savoury toppings are available. We also ask for a contribution of 5 to 10 euros to the pancake fund.

Blessed greetings from your flederwaffel

by flederrattie at 03. January 2019 00:00:00

28. December 2018

SelfConsistentField.jl: A julia toolbox for coding self-consistent field algorithms

Over the past half a year I have become interested in the julia programming language. It is a rather recent language (version 1.0.0 was only released last August), which promises a modern approach to scientific computing. From my experience so far, I like julia a lot. All aspects of the code, be it parallelisation, vectorisation, macros, generic code and so on, are consistently designed and fit together very well. In contrast to e.g. python, it is thus not required to employ a specialised external package, with its own associated way of doing the computation in parallel, to tackle a large-scale problem. In julia parallelisation is usually automatic or can be added with little manual effort. Whilst julia code might not beat C++ code with respect to performance in all cases, it is certainly written much faster.

To test julia in a larger setting than just a toy project, I decided to work towards a julia toolbox for coding and experimenting with self-consistent field algorithms. Over the summer Tobias Wackenhut participated in the development during his internship with us in Heidelberg. After about two solid months of coding, mostly from his end, a first version with support for solving Hartree-Fock problems has now been implemented.

For further details, the project code and some examples, see the SelfConsistentField.jl project page on github.

by Michael F. Herbst at 28. December 2018 23:00:00

23. December 2018

Design of the molsturm quantum-chemistry framework

A few weeks ago, on 23rd November, I was invited to speak at the mathematics and chemistry seminar (GdT mathématiques et chimie) at Sorbonne Université in Paris. This seminar is a joint meeting of theoretical chemists and mathematicians from the LJLL (Laboratoire Jacques-Louis Lions) and the LCT (Laboratoire de Chimie Théorique) of Sorbonne Université and the CERMICS laboratory of Inria Paris and ENPC (Ecole des Ponts Paris Tech). Such a mixed audience provided a good opportunity to present our recent work on the molsturm quantum chemistry framework and our recent Coulomb-Sturmian convergence results. Both the feedback and the discussion accompanying the talk were fruitful and some ideas how to provide a more mathematical backbone for our empirical observations emerged.

My travels to Paris were supported by the DAAD, which I gratefully acknowledge. The slides of my talk are attached below:

Link Licence
Slides Design of the molsturm quantum-chemistry framework Creative Commons License

by Michael F. Herbst at 23. December 2018 17:00:00

12. December 2018

sECuREs Webseite

Docker and IPv6

My use-case is seemingly very simple: I want to run a webserver in a Docker container, and it should be reachable via IPv4 and IPv6. The webserver has multiple virtual hosts, some of which just serve static files, while others proxy to, say, a Grafana instance, which is also running in a Docker container.

This article walks through the required steps, which are a bit cumbersome to puzzle together from Docker’s official documentation.

I’m using documentation-only IPs (RFC3849 and RFC5737) throughout the article. Let’s say that my provider gives me a routed IPv6 subnet 2001:db8:13b:330::/64 and the IPv4 address

Enabling IPv6 in Docker

The Docker daemon defaults to IPv4-only. To enable IPv6, create the configuration file /etc/docker/daemon.json with the following content:

{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:13b:330:ffff::/80"
}

After restarting the Docker daemon, containers will now get IPv6 addresses based on their MAC address, which is picked sequentially from the range 02:42:ac:11:00:00 to 02:42:ac:11:ff:ff. That is, the first container you start will use the IPv6 address 2001:db8:13b:330:ffff:0242:ac11:0002.
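To make the MAC-to-address mapping concrete, here is a small sketch (my own illustration, not a Docker API) of how the 48 bits of the MAC address end up as the host part of the /80 prefix:

```python
import ipaddress

def container_v6(prefix: str, mac: str) -> str:
    # Interpret the container's 48-bit MAC address as the host part
    # of the /80 prefix configured in fixed-cidr-v6.
    net = ipaddress.IPv6Network(prefix)
    host = int(mac.replace(":", ""), 16)
    return str(net[host])

print(container_v6("2001:db8:13b:330:ffff::/80", "02:42:ac:11:00:02"))
# → 2001:db8:13b:330:ffff:242:ac11:2  (i.e. …ffff:0242:ac11:0002 uncompressed)
```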

Publishing ports and remote addresses

When publishing port 80 of a webserver, notice the remote address when accessing the port via IPv4 and IPv6:

% docker run -p 80:80 nginx
- - [12/Dec/2018:07:38:19 +0000] "GET / HTTP/1.1" 200 612
- - [12/Dec/2018:07:38:40 +0000] "GET / HTTP/1.1" 200 612

The first request (IPv4) has the correct remote address, but not the second one (IPv6). This is because Docker publishes ports with NAT for IPv4, and a TCP proxy for IPv6.

Of course, not having access to the actual remote address breaks rate limiting, abuse blocking, address-based geo location, etc.

Some people resort to using Docker’s host network option, but that’s not a good solution: your container will not be able to talk to other containers by name anymore, so you will need lots of static, host-specific configuration.

A better solution is to only publish the port via IPv4 and connect to the container’s IPv6 address directly:

% docker run --publish --name nginx nginx

You can obtain the container’s IPv6 address using:

% docker inspect -f '{{.NetworkSettings.GlobalIPv6Address}}' nginx

Static IPv6 addresses

Above, I explained that we need to use the container’s IPv6 address directly, and that the address is derived from the MAC address, which is chosen sequentially at container start time.

Having addresses depend on the order in which containers come up isn’t a robust solution for my simple setup, where I want to statically configure a DNS record.

Docker allows specifying an IPv6 address, but only when you’re using a user-defined bridge network with an IPv6 subnet carved out for the network, like so:

% docker network create --subnet 2001:db8:13b:330:dd::/80 --ipv6 nginx

% docker run \
  --network nginx \
  --ip6 2001:db8:13b:330:dd:ff::1 \
  --publish \

Note that I’m using an IPv6 address from the far end of the address space (ff::1), so as to not conflict with the addresses that Docker sequentially allocates from the network we created.

Now, create a DNS record with the container's addresses (an A record for IPv4 and an AAAA record pointing to 2001:db8:13b:330:dd:ff::1) and you'll be able to access it via IPv4 and IPv6 with correct remote addresses, while still being able to reach other containers.

Note that all other Docker containers that you want to reach from the nginx container must also use the nginx network. This is the recommended solution over the old --link flag anyway.

One disadvantage of this solution is that you cannot offer services from multiple Docker containers on the same IPv6 address (e.g. www and git).

12. December 2018 08:54:00

15. November 2018

Quantum chemistry with Coulomb Sturmians: Construction and convergence of Coulomb Sturmian basis sets at Hartree-Fock level

This week we submitted a paper related to our ongoing work regarding the use of Coulomb Sturmian basis functions in electronic structure theory. Given the recent progress, which has been made with regards to the evaluation of molecular integrals based on these exponential-type functions, molecular calculations based on Coulomb Sturmians are within reach. This is a very promising prospect due to the abilities of the Coulomb Sturmians to represent the features of the wave function both at the nucleus as well as at large distances.
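For context, a standard way to state the defining property of these functions (general background, not taken from the paper itself): Coulomb Sturmians are the solutions of a Schrödinger-like equation in which the strength of the Coulomb coupling, rather than the energy, plays the role of the eigenvalue,

\left( -\frac{1}{2}\Delta + \frac{k^2}{2} - \frac{n\,k}{r} \right) \varphi_{nlm}(\mathbf{r}) = 0,

so that every function of the basis shares a single exponent k and decays as e^{-kr}. This shared exponential decay is what gives them the correct behaviour both at the nucleus and at large distances mentioned above.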

Our work takes a first look at the convergence of Coulomb Sturmian discretisations in electronic structure theory, expanding on ideas suggested in my PhD thesis. We suggest a simple construction scheme for Coulomb Sturmian basis sets and discuss its properties. A key aspect is to connect the basis set parameters to physical features of the wave function or other chemically intuitive quantities such as the exponents obtained by Clementi and Raimondi, i.e. the effective nuclear charge. The abstract of the paper reads

The first discussion of basis sets consisting of exponentially decaying Coulomb Sturmian functions for modelling electronic structures is presented. The proposed basis set construction selects Coulomb Sturmian functions using separate upper limits to their principal, angular momentum and magnetic quantum numbers. Their common Coulomb Sturmian exponent is taken as a fourth parameter. The convergence properties of such basis sets are investigated for second and third row atoms at the Hartree-Fock level. Thereby important relations between the values of the basis set parameters and the physical properties of the electronic structure are recognised. For example, an unusually large limit for the angular momentum quantum number in unrestricted Hartree-Fock calculations can be linked to the breaking of spherical symmetry in such cases. Furthermore, a connection between the optimal, i.e. minimum-energy, Coulomb Sturmian exponent and the average Slater exponent values obtained by Clementi and Raimondi (E. Clementi and D. L. Raimondi, J. Chem. Phys. 38, 2686 (1963)) is made. These features of Coulomb Sturmian basis sets emphasise their ability to correctly reproduce the physical features of Hartree-Fock wave functions.

by Michael F. Herbst at 15. November 2018 10:00:00

30. October 2018

sECuREs Webseite

Network setup for our retro computing event RGB2Rv18

Our computer association NoName e.V. organizes a retro computing event called RGB2R every year, located in Heidelberg, Germany. This year’s version is called RGB2Rv18.

This article describes the network setup I created for this year’s event. If you haven’t read it, the article about last year’s RGB2Rv17 network is also available.


As a reminder, the venue's DSL connection tops out at a megabit or two, so we used my parents' 400 Mbit/s cable internet line, like last year.

A difference to last year is that we switched from the tp-link CPE510 devices to a pair of Ubiquiti airFiber24. The airFibers are specified to reach 1.4 Gbit/s. In practice, we reached approximately 700 Mbit/s of displayed capacity (at a signal strength of ≈-60 dBm) and 422 Mbit/s of end-to-end download speed, limited by the cable uplink.

Notably, using a single pair of radios removes a bunch of complexity from the network setup as we no longer need to load-balance over two separate uplinks.

Like last year, the edge router for the event venue was a PC Engines apu2c4. For the Local Area Network (LAN) within the venue, we provided a few switches and WiFi using Ubiquiti Networks access points.

WiFi setup

It turns out that the 24 GHz-based airFiber radios are much harder to align than the 5 GHz-based tp-links we used last year. With the tp-link devices, we were able to easily obtain a link, and do maybe 45 minutes of fine tuning to achieve maximum bandwidth.

With the airFiber radios mounted in the same location, we were unable to establish a link even once in about 1.5 hours of trying. We think this was due to trees/branches being in the way, so we decided to scout the property for a better radio location with as much of a direct line of sight as possible.

We eventually found a better location on the other side of the house and managed to establish a link. It still took us an hour or so of fine tuning to move the link from weak (≈-80 dBm) to okay (≈-60 dBm).

After the first night, in which it rained for a while, the radios had lost their link. We think that this might be due to the humidity, and managed to restore the link after another 30 minutes of re-adjustment.

It also rained the second night, but this time, the link stayed up. During rain, signal strength dropped from ≈-60 dBm to ≈-72 dBm, but that still resulted in ≈500 Mbit/s of WiFi capacity, sufficient to max out our uplink.
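For intuition: since decibels are logarithmic, the roughly 12 dB drop during rain corresponds to about a 16-fold reduction in received power, which the link absorbed while still delivering ≈500 Mbit/s. A quick check of that factor:

```python
# dB to linear power ratio: ratio = 10 ** (dB / 10)
drop_db = 72 - 60                 # from ~-60 dBm down to ~-72 dBm
ratio = 10 ** (drop_db / 10)
print(round(ratio, 1))            # → 15.8, i.e. ~16x less received power
```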

For next year, it would be great to use an antenna alignment tool of sorts to cut down on setup time. Alternatively, we could switch to more forgiving radios which also handle 500 Mbps+. Let me know if you have any suggestions!


In May this year, I wrote router7, a pure-Go small home internet router. Mostly out of curiosity, we gave it a shot, and I’m happy to announce that router7 ran the event without any trouble.

In preparation, I implemented TCP MSS clamping and included the WireGuard kernel module.

I largely followed the router7 installation instructions. To be specific, here is the Makefile I used for creating the router7 image:

# without dhcp6,
# as the cable uplink does not provide IPv6:
PKGS := \
	…

image:
ifndef DIR
	@echo variable DIR unset
	@false
endif
	GOARCH=amd64 gokr-packer \
		-overwrite_boot=${DIR}/boot.img \
		-overwrite_root=${DIR}/root.img \
		-overwrite_mbr=${DIR}/mbr.img \
		-serial_console=ttyS0,115200n8 \
		-hostname=rgb2router \
		${PKGS}
After preparing an interfaces.json configuration file and a breakglass SSH hostkey, I used rtr7-recover to net-install the image onto the apu2c4. For subsequent updates, I used rtr7-safe-update.

The Go packages under are wrappers which run software I installed to the permanent partition mounted at /perm. See gokrazy: Prototyping for more details.

Tunnel setup

Last year, we used a Foo-over-UDP tunnel after noticing that we didn’t get enough bandwidth with OpenVPN. This year, after hearing much about it, we successfully used WireGuard.

I found WireGuard to be more performant than OpenVPN, and easier to set up than either OpenVPN or Foo-over-UDP.

The one wrinkle is that its wire protocol is not yet frozen, and its kernel module is not yet included in Linux.

Traffic shaping

With asymmetric internet connections, such as the 400/20 cable connection we’re using, it’s necessary to shape traffic such that the upstream is never entirely saturated, otherwise the TCP ACK packets won’t reach their destination in time to saturate the downstream.

While the FritzBox might already provide traffic shaping, we wanted to voluntarily restrict our upstream usage to leave some headroom for my parents.

rgb2router# tc qdisc replace dev uplink0 root tbf \
  rate 16mbit \
  latency 50ms \
  burst 4000

The specified latency value is a best guess, and the burst value is derived from the kernel internal timer frequency (CONFIG_HZ) (!), packet size and rate.
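The burst sizing can be sanity-checked with a quick computation. Assuming the usual rule of thumb that the token bucket must hold at least one timer tick's worth of data (i.e. burst ≥ rate in bytes divided by CONFIG_HZ):

```python
# Rule of thumb (assumption): tbf needs burst >= rate_in_bytes / CONFIG_HZ,
# i.e. at least one kernel timer tick's worth of traffic.
RATE_BITS = 16_000_000              # the 16mbit rate passed to tc above
rate_bytes = RATE_BITS // 8         # tbf accounts in bytes per second

for hz in (100, 250, 1000):         # common CONFIG_HZ settings
    print(hz, rate_bytes // hz)     # minimum burst in bytes per tick
```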

Tip: keep in mind to disable shaping temporarily when you’re doing bandwidth tests ;-).


  • We peaked at 59 active DHCP leases, which is very similar to the “about 60” last year.

  • DNS traffic peaked at about 25 queries/second, while mostly remaining at less than 5 queries/second.

  • We were able to obtain peaks of nearly 200 Mbit/s of download traffic and transferred over 200 GB of data, twice as much as last year.

30. October 2018 22:00:00

18. October 2018

Mero’s Blog

Using roughtime as a "cryptographic notary"

tl;dr: Roughtime can be (ab)used for Trusted Timestamping. I wrote a simple tool as a PoC

Recently, Cloudflare announced that they are now running a roughtime server. Roughtime is a cryptographically secured time-synchronization protocol - think NTP with signatures. For an actual description of how it works, I recommend reading the Cloudflare blog post. But at its very core (oversimplification ahead), the user chooses an arbitrary (usually randomly generated) nonce and the server signs it, plus the current time.

One thing roughtime adds on top of this is the ability to build a chain of requests. This is achieved by taking a hash of a response, combining it with a randomly generated "blind", and using the combined hash as the nonce of the next request. The intended use-case of this is that if a server provides the wrong time or otherwise misbehaves, you can obtain cryptographic proof of that fact by getting a timestamped signature of its response from a different server. By storing the initial nonce, the generated blinds and the responses, the entire chain can be validated later.
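A minimal sketch of one chain step (illustrative only: the exact byte layout and concatenation order are fixed by the roughtime spec, and the helper name here is my own):

```python
import hashlib
import os

def next_nonce(prev_response: bytes, blind: bytes) -> bytes:
    # Chain step: hash the previous response, mix in a random "blind",
    # and use the combined hash as the nonce of the next request.
    return hashlib.sha512(hashlib.sha512(prev_response).digest() + blind).digest()

blind = os.urandom(64)                        # stored to validate the chain later
nonce = next_nonce(b"<previous response>", blind)
assert len(nonce) == 64                       # 64-byte nonces, like SHA512 digests
```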

When I saw Cloudflare's announcement, my first thought was that it should be possible to use a roughtime server as a Time Stamping Authority. The goal is to obtain cryptographic proof that you possessed a particular document at a certain point in time - for example so you can prove original authorship without publishing the document itself.

The simplest way to achieve this using roughtime is to use the SHA512 hash of the file as an initial nonce. That way, the roughtime server signs that hash together with the current time with their private key. By using the roughtime chain protocol, you can get that proof corroborated by multiple servers.

You can also extend this to get stronger properties. Using the hash of the file as a nonce only proves that the file existed before that specific point in time. It also doesn't actually prove that you had the file, but only the hash. This can be remedied, though. If we run a regular roughtime request, the resulting response is unpredictable (to us) and signs the current time. Thus, if we use a hash of that response as a prefix "salt" of the file itself, the resulting hash proves that we knew the file after that chain ran. We can then use that hash as a nonce for another roughtime chain and get a proof that we had the file at a specific point (or rather, a small interval) in time. Furthermore, we can opt to use the file hash not as the nonce itself, but as a blind. The advantage is that the blind is never transmitted over the network, so the actual proof is only available to us (if we used it as a nonce, an eavesdropper could intercept the proof). I illustrated these options in a recent talk I gave on the subject.
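The two nonce constructions described above can be sketched as follows (hypothetical helper names; the actual tool's format differs):

```python
import hashlib

def existence_nonce(file_data: bytes) -> bytes:
    # Variant 1: the file hash as the initial nonce. The signed response
    # then proves the file existed *before* the returned timestamp.
    return hashlib.sha512(file_data).digest()

def knowledge_nonce(file_data: bytes, earlier_response: bytes) -> bytes:
    # Variant 2: salt the file with the hash of an earlier, unpredictable
    # response. The result proves we knew the file *after* that response
    # was signed, and can serve as the nonce (or blind) of a second chain.
    salt = hashlib.sha512(earlier_response).digest()
    return hashlib.sha512(salt + file_data).digest()
```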

These ideas are mostly academic and I'm not sure how useful these properties are in practice. Nevertheless, the idea intrigued me enough to implement it in a simple tool. It's in a pretty rough, proof-of-concept shape and I don't know if I will ever take it further. It also comes with a client implementation of the roughtime protocol in Go - initially I was not aware that there already was a Go implementation, and that one also is not go-gettable. Either way, it was fun to implement it myself :)

18. October 2018 23:22:40

02. October 2018


Ontbyte – A Dutch late breakfast at the RZL

[Image: Ontbyte artwork]

Vla'ntastic news for all late-breakfasters and frikandel fans!

On Sunday, 14 October 2018, it's Specu'loos day! From 11:30 am we will meet for a joint Ontbyte at the RaumZeitLabor.

This time, under the motto "Programmeren en Degusteren", we want to pay homage to our Dutch neighbours with all kinds of culinary specialities.

Whether you want to start your day with hagelslag or have fun with pindakaas on your morning to-do list: there will be something for everyone!

You want to join in? No problem! So that we know how many gehaktballen we have to roll, we would appreciate an RSVP by mail by 7 October 2018. We ask for a contribution of 10 euros per person to the kaas-kassa.

Eet smakelijk! Your Frau Antje

by flederrattie at 02. October 2018 00:00:00

30. September 2018

Simulating chemistry: Enabling novel approaches for modelling the electronic structure of molecules

Last Wednesday I was invited to present my research to a group of young researchers from the Heidelberg Laureate Forum who were visiting our institute. Whilst I would have loved to attend the full week of the forum, I am happy that in this way I was able to contribute a little to the meeting and take part in discussions with the participants.

Instead of going into the details of my own work, I decided to give a general overview of modern quantum-chemical approaches to modelling chemical properties and reactions. Only at the very end of the talk did I briefly discuss the work behind our recent molsturm paper. As usual my slides are attached below:

Link Licence
Slides Simulating chemistry Creative Commons License

by Michael F. Herbst at 30. September 2018 16:00:00

18. September 2018

Franco-German Workshop 2018: Open problems related to the algebraic-diagrammatic construction scheme

In some of my recent projects, e.g. the PE-ADC paper, I got a little more into method development surrounding the algebraic-diagrammatic construction scheme (ADC). Even though quite a few people from the Dreuw group in Heidelberg are involved with ADC development and we have regular talks on ADC, I did not yet have the chance to really take a deeper look at this method myself.

Last week I was invited to the Franco-German Workshop on mathematical aspects in computational chemistry 2018, where a selection of chemists and mathematicians from France, Germany and a few other places got together to present their recent advances, but most importantly to discuss. From the ADC side I had collected a few potential angles for improvement in recent months, and this occasion thus seemed ideal both to brush up on my knowledge of ADC and to present on it.

In the spirit of such a hands-on workshop, I only quickly glanced over the current status of ADC as an excited-states method. Afterwards I introduced common numerical procedures and pointed at currently open problems and challenges. From the discussions directly after the lecture, as well as during the remaining days of the workshop, a few interesting ideas emerged and I am already looking forward to trying them out. As per usual the slides of my talk are attached below.

Link Licence
Slides Challenges and open problems related to the algebraic diagrammatic construction scheme Creative Commons License

by Michael F. Herbst at 18. September 2018 16:00:00


Agenda-Aktion 2018 – Soldering an electronic toy

In cooperation with the city of Mannheim, we again offered soldering workshops for children this year. On five dates, twelve children aged 8 to 12 each got to wield the soldering iron.

[Image: Busy children soldering]

This time the kit was an electronic die. This year the technology consultancy INNOQ sponsored the kits for us. With only a few components, the die was very easy to build. The circuit is based on an ATtiny85, displays nicely uniformly distributed numbers, and thanks to its deep-sleep mode it does not even need an on/off switch. Board layout, source code and 3D models are, as always, available on our GitHub: the die on GitHub.

[Image: The finished die kit]
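Purely as an illustration of how uniformly distributed rolls can be obtained from a byte-wide random source (I have not checked the actual ATtiny85 firmware, so this is an assumption about the technique, not a description of it), rejection sampling avoids the modulo bias of a naive `byte % 6`:

```python
import random

def fair_roll(rand_byte) -> int:
    # Rejection sampling: accept only bytes below 252 (= 42 * 6), the
    # largest multiple of 6 that fits in a byte, to avoid modulo bias.
    while True:
        b = rand_byte()
        if b < 252:
            return b % 6 + 1

rng = random.Random(0)
rolls = [fair_roll(lambda: rng.getrandbits(8)) for _ in range(1000)]
assert set(rolls) <= {1, 2, 3, 4, 5, 6}
```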

All the children got to try out their self-built die right away in a round of Mensch-ärgere-Dich-nicht (Ludo).

[Image: Board game with electronic die]

by Pfeyffer at 18. September 2018 00:00:00

09. September 2018

MRMCD18: Pitfalls for performance: Latencies to keep in mind

This weekend I attended the MRMCD18 in Darmstadt, which is one of the many chaos events where a few hundred people interested in computer science, hacking, making, politics and society gather for a weekend of talks, coding and discussion. Having participated in this event regularly in the past few years, I was very happy that this year I was given the chance to present a lecture.

I chose to talk about lazy matrices and contraction-based algorithms. Even though this topic is discussed briefly in my PhD thesis, I usually keep it rather short in my talks, and similarly the molsturm design paper only touched on the subject briefly. In this talk I therefore took the chance to expand on contraction-based methods a bit further, approaching the matter via latency numbers in modern hardware and from there drawing conclusions for numerical linear algebra.

Thanks to the great C3VOC team the talk was recorded live and can be watched online or downloaded below, along with my slides.

Note: Unfortunately the batteries of the microphone gave up in the middle of my talk, around 25:20. However, if you seek the video forward to minute 26:17, I switch to a handheld and audio is back.

Link Licence
Recording Pitfalls for performance:
Latencies to keep in mind
Creative Commons License
Slides Pitfalls for performance Creative Commons License

by Michael F. Herbst at 09. September 2018 18:00:00

05. September 2018

Mero’s Blog

Scrapping contracts

tl;dr: I describe a way to simplify the generics design. The ideas are not particularly novel and have been expressed to various degrees by other people as well. I hope to provide a more complete view of the design though.

Recently a Problem Overview and Draft Design for generics in Go have dropped. Since then, predictably, there has been a bunch of chatter on the intertubez about it. This is a summary of my thoughts, so far, on the subject - after a bunch of discussions on Twitter and Reddit.

Note: The design is called "Contracts", but I will refer to it as "the design doc" here. When I say "contracts", I will refer to the specific part of the design to express constraints.

Contracts vs. Interfaces

First, there is a common observation of overlap between generics and interfaces. To untangle that, we can say that when we use "generics", what we mean is constrained parametric polymorphism. Go already allows polymorphism by using interfaces. This design doc adds two things: one, a way to add type-parameters to functions and types; and two, a syntax to constrain those type-parameters, via contracts, to a subset of types that allows specific operations.

The latter is where the overlap lies: interfaces already allow you to constrain arguments to types that allow certain operations. In a way, what contracts add to this is that those operations can not only be method calls, but also builtin operators and functions, and that certain composite types can be allowed or disallowed (though that mainly affects map).

Contracts allow that by the way they are specified: You write a function body (including arguments, whose notational type becomes the type-variable of the contract) containing all the statements/expressions you wish to be able to do. When instantiating a generic type/function with a given set of type-arguments, the compiler will try to substitute the corresponding type-variable in the contract body and allow the instantiation, if that body type-checks.

The cost of contracts

After talking a bit through some examples, I feel that contracts optimize for the wrong thing. The analogy I came up with is vocabulary vs. grammar.

The contracts design is appealing to a good degree, because it uses familiar syntax: You don't have to learn any new syntax or language to express your contract. Just write natural Go code and have that express your constraints for you. I call this the "grammar" of constraints: The structure that you use to input them.

On the other hand, for the user of Go, the relevant question is what constraints are possible to express and how to express them. They might be interested in deduplicating values in their algorithm, which requires equality-operations. Or they might want to do comparisons (e.g. Max), which requires >. I call this the vocabulary: What is the correct way to express the set of constraints that my algorithm needs?

The issue now is that while the grammar of constraints might be obvious, it is not always clear what actual semantic constraints it generates. A simple example is map-keys. The design doc uses the contract

contract comparable (t T) {
   t == t
}

to specify types that are valid map-keys. But to a beginner, it is not immediately obvious what comparisons have to do with maps. An alternative would be

contract mapkey (t T) {
  var _ map[t]bool
}

But which is better? Similarly, these two contracts

contract mult (t T) {
  t = t * t
}

contract add (t T) {
  t = t + t
}

seem very similar, but they are, in theory at least, fundamentally different. Not only because add allows string, while mult doesn't. But also, because technically any type that supports * also supports - and /. And then there's

contract div (t T) {
  t = t % t
}

which creates another completely different set of types and allowed operators.

A third example is

contract stringlike (t T) {
  append([]byte(nil), t...)
}

This allows any type with underlying type string or []byte, but nothing else. And again, technically that would imply allowing index-operations and len. But does the compiler understand that?

Lastly, it's not really clear how len, cap, make or range would work. For example, all these contracts are superficially valid:

contract rangeable (t T) {
  for x := range t {
  }
}

contract lengthed (t T) {
  var _ int = len(t)
}

contract capped (t T) {
  var _ int = cap(t)
}

contract makeable (t T) {
  t = make(T)
}

contract makeable2 (t T) {
  t = make(T, 0)
}

But in all these cases, they allow some subset of channel, map, slice and array types, with vastly different interpretations of these operations depending on the kind of type used - to the degree that code using them would usually be nonsensical. Disallowing them, however, raises questions about the claim of familiar Go syntax, as we now have to decide what sorts of expressions and statements we do or don't allow in a contract.

This is why I say contracts optimize for grammar instead of vocabulary. The programmer is interested in the vocabulary - what does the contract actually mean and which contract should they use? But the vocabulary is obscured by the grammar: because we use Go syntax, understanding a given contract requires knowing a good deal about what the compiler is and is not able to infer from it.

This is why I don't really buy the argument of not wanting to learn a bunch of new syntax or new identifiers for constraints: You still have to learn that vocabulary, but you express it in an obscure and unnatural grammar. I hope to show that we can introduce the power of generics while also using familiar grammar and with minimal addition of vocabulary.

Scrapping contracts

Now, I'm not the first person to suggest this, but I think we should consider scrapping contracts from the design. We can still retain type-parameters and we can still have constraints, but we express them via interfaces instead. I should point out that - for now - I'm intentionally optimizing for simplicity of the design, at the cost of some boilerplate and some loss of power. I will later try to provide some alternatives to compensate for that in part. But there is still likely going to remain a net cost in expressiveness. Personally, I think that tradeoff is worth exploring.

The new design would retain type-parameters and most of their syntax. The difference is that type-parameters are a full argument list. The type of an argument has to be an interface type. It can be elided, in which case it defaults to the type of the following type-parameter. The last type-parameter defaults to interface{}. As a bonus, this allows providing multiple sets of constraints on one declaration:

func Map(type A, B) (s []A, f func(A) B) []B {
  var out []B
  for _, a := range s {
    out = append(out, f(a))
  }
  return out
}

func Stringify(type A fmt.Stringer) (s []A) []string {
  // Because of the signature of fmt.Stringer.String, we can infer all the
  // type-arguments here. Note that A does not *have* to be passed boxed in an
  // interface. A.String is still a valid method-expression for any fmt.Stringer.
  return Map(s, A.String)
}

We still want to be able to express multiple, interdependent parameters, which we can, via parametric interfaces:

type Graph(type Node, Edge) interface {
  Nodes(Edge) []Node
  Edges(Node) []Edge
}

func ShortestPath(type Node, Edge) (g Graph(Node, Edge), from, to Node) []Edge {
  // …
}

// Undirected Graph as an adjacency list. This could be further parameterized,
// to allow for user-defined payloads.
type AdjacencyList [][]int

func (g AdjacencyList) Nodes(edge [2]int) []int {
  return edge[:]
}

func (g AdjacencyList) Edges(node int) [][2]int {
  var out [][2]int
  for _, v := range g[node] {
    out = append(out, [2]int{node, v})
    if v != node {
      out = append(out, [2]int{v, node})
    }
  }
  return out
}

func main() {
  g := AdjacencyList{…}
  // Types could be inferred here, as the names of methods are unique, so we can
  // look at the methods Nodes and Edges of AdjacencyList to infer the
  // type-arguments.
  path := ShortestPath(g, 0, len(g)-1)
}

The last example is relevant to the difference in power between contracts and interfaces: Usage of operators. We can still express the concept, but this is where the increased boilerplate comes in:

func Max(type T)(a, b T, less func(T, T) bool) T {
  if less(a, b) {
    return b
  }
  return a
}

func main() {
  fmt.Println(Max(1, 2, func(a, b int) bool { return a < b }))
}

I will try to show some ways to get rid of that boilerplate later. For now, let's just treat it as a necessary evil of this idea. Though it should be mentioned that while this is more cumbersome, it's still just as typesafe as contracts (as opposed to, say, a reflect-based generic Max).
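For comparison, here is a sketch of this boilerplate variant in the generics syntax Go eventually shipped (square brackets instead of the draft's (type T) lists; a later development, shown purely to illustrate that the pattern compiles and stays typesafe):

```go
package main

import "fmt"

// Max takes an explicit comparison function instead of relying on a
// built-in ordering - the "boilerplate" variant discussed above.
func Max[T any](a, b T, less func(T, T) bool) T {
  if less(a, b) {
    return b
  }
  return a
}

func main() {
  fmt.Println(Max(1, 2, func(a, b int) bool { return a < b }))       // 2
  fmt.Println(Max("a", "b", func(a, b string) bool { return a < b })) // b
}
```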

So, scrapping contracts leaves us with more boilerplate, but just the same set of concepts we can express - though we do have to pass in any builtin operations we want to perform as extra functions (or express them in an interface). In exchange, we get

  • Only one way to specify constraints.
  • A simpler spec (we don't need to add a new concept, contracts, to the language) and a saved (pseudo-)keyword.
  • A simpler compiler: We don't need to add a solver to deduce constraints from a given contract. The constraint-checker already exists.
  • Still a well-known, though less powerful, language to express constraints, with interfaces.
  • Simple syntax (same as normal arglists) for having multiple sets of constraints in one declaration.
  • Trivially good error messages. Types passed in need only be checked for consistency and interface satisfaction - the latter is already implemented, including good error messages.

Getting rid of boilerplate

I see two main ways to get rid of boilerplate: Adding methods to builtin types, or what I call pseudo-interfaces.

Methods on builtin types

An obvious idea is to not use operators in generic code, but instead use method-call syntax. That is, we'd do something akin to

func Max(type T Ordered) (a, b T) T {
  if a.Less(b) {
    return b
  }
  return a
}

To actually reduce the boilerplate, we'd predefine methods for all the operators on the builtin types. That would allow us to call Max with int, for example.

Unfortunately, I can see a bunch of roadblocks to make this work. Methods are not promoted to derived types, so you couldn't use Max with e.g. time.Duration, which has underlying type int64, but is not the same type. We'd probably want those methods to be "special" in that they automatically get promoted to any type whose underlying type is predeclared. That introduces compatibility issues of clashing Method/Field names.

At the end, to express that Less has to take the same argument as the receiver type, Ordered might look something like this:

type Ordered(T) interface {
  Less(T) bool
}

func Max(type T Ordered(T)) (a, b T) T {
  if a.Less(b) {
    return b
  }
  return a
}

// In the universe block:

// Implements Ordered(int).
func (a int) Less(b int) bool {
  return a < b
}

Though it's not clear whether a parameter like T Ordered(T) should be allowed. And this would technically allow implementing Ordered(int) on a custom type. While that probably won't be very useful (the majority of usecases will require T Ordered(T)), it's not excluded.
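As a side note, the generics design Go eventually shipped does permit exactly this kind of self-referential constraint. A sketch, where Celsius is a made-up stand-in for a type like time.Duration:

```go
package main

import "fmt"

// The self-referential constraint T Ordered(T) from above, written in
// the syntax Go later shipped, where it is allowed.
type Ordered[T any] interface {
  Less(T) bool
}

func Max[T Ordered[T]](a, b T) T {
  if a.Less(b) {
    return b
  }
  return a
}

// Celsius is a made-up example type implementing Ordered[Celsius].
type Celsius int

func (a Celsius) Less(b Celsius) bool { return a < b }

func main() {
  fmt.Println(Max(Celsius(3), Celsius(7))) // 7
}
```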

Pseudo-interfaces

Unfortunately I didn't have a lot of time the last couple of days, so I got beat to the punch on this. Matt Sherman described the idea first and called the concept "typeclasses". I will stick with pseudo-interface, because it fits better in the general concept of this description.

The idea is to introduce a set of types into the language that can be used like interfaces (including embedding), but instead of providing methods, provide operators. There is a limited set of base types that need to be provided:

pseudo-interface | Allowed operators
comparable       | ==, !=
ordered          | <, <=, >, >=
boolean          | ||, &&, !
bitwise          | ^, %, &, &^, |, <<, >>
arith            | +, -, *, /
concat           | +
complex          | real(z), imag(z)
nilable          | v == nil

and a set of derived pseudo-interfaces:

pseudo-interface | definition
num              | interface { comparable; ordered; arith }
integral         | interface { num; bitwise }
stringy          | interface { comparable; ordered; concat; len() int }
iface            | interface { comparable; nilable }

The pseudo-interfaces would be declared in the universe block, as predeclared identifiers. This makes them backwards-compatible (as opposed to methods on builtin types), because any existing identifier would just shadow these (akin to how you can have a variable with name string).

Bitshift-operators currently are restricted when used with constants overflowing the width of an integral type. For generic code, this restriction would be lifted (as the size is not statically known) and instead the behavior is equivalent to if the right operand is an uint variable with the given value.

This would allow us to write

func Max(type T ordered) (a, b T) T {
  if a < b {
    return b
  }
  return a
}

Notably, the list of pseudo-interfaces doesn't include anything related to channel-, slice- or map-operations (or other composite types). The idea is to instead use a type literal directly:

func Keys(type K, V) (m map[K]V) []K {
  var out []K
  for k := range m {
    out = append(out, k)
  }
  return out
}

As every type supporting, e.g., map operations needs to have underlying type map[K]V, it is assignable to that type and can be passed to Keys as is. That is, this is completely legal:

func main() {
  type MyMap map[string]int
  var m = MyMap{
    "foo": 23,
    "bar": 42,
  }
  Keys(m) // legal: MyMap is assignable to map[string]int
}

This also solves another problem with contracts: The ambiguity of len, cap and range. As the actual kind of the value is not only known during compilation of the generic function, but even obvious from the code, there is no question about the intended semantics.

Should Go ever grow operator overloading via operator methods, the pseudo-interfaces could be changed into actual interfaces, containing the necessary methods. Of course, that implies that operator overloading would retain the properties of existing operators, e.g. that having == implies having !=, or that having - implies having +. Personally, I consider that a good thing - it limits the abuse of operator overloading for nonsensical operations (say, << for writing to an io.Writer).

I'm not trying to advocate for operator overloading, but think it's worth mentioning that this design leaves the door open to that.

But performance

A possible criticism of either of these approaches is, that operators have better performance than dynamic dispatch to a method. I believe (vigorous handwaving ahead) that this is no different in the existing contracts proposal. If generic code is compiled generically, it still needs to employ some means of dynamic dispatch for operators. If, on the other hand, it's compiled instantiated, then the compiler would also be able to devirtualize the interfaces - and then inline the method definition.

Conclusion

I've previously said that I'm "meh" on the design doc, which is the strongest form of endorsement a generics proposal could ever get from me. After some discussion, I'm more and more convinced that while contracts seem conceptually simple, they create a plethora of implementation and usage questions. I'm not sure the supposed advantage of contracts, a well-known syntax, holds up to scrutiny when it comes to mapping that syntax to the actually derived constraints or to writing contracts. There are also many open questions in regards to contracts, a bunch of them related to the ambiguity of Go expressions. As a result, I'm starting to feel more negative towards them - they look like an elegant idea, but in practice, they have a lot of weird corners.

This design is similar (AIUI) to the type functions proposal, so I assume there are good reasons the Go team does not want this. The difference is mainly the absence of operator methods in favor of pseudo-interfaces or explicit method calls. This design also handwaves a couple of important implementation questions - the justification for that is that these questions (e.g. type inference and code generation) should be able to be taken from the design doc with minimal changes. It's entirely possible that I am overlooking something, though.

05. September 2018 04:00:00

28. August 2018

MES 2018 in Metz: molsturm - Towards quantum-chemical method development for arbitrary basis functions

Currently I am attending the Molecular Electronic Structure 2018 conference in Metz, an electronic-structure theory conference intended for researchers working on novel methodologies, such as explicit correlation, density matrix theory, stochastic approaches or on discretisation techniques going beyond Gaussian-type orbitals.

Especially the latter aspect fits well to the kind of research direction we want to support with our molsturm program package. Thanks to the organisers of the conference, namely Ugo Ancarani and his committee, I was able to present the design and ideas of molsturm in a short contributed talk. The talk essentially became a summary of our molsturm design paper, which just got published.

Unfortunately 15 minutes are not enough to explain everything in detail, so I had to gloss over many aspects in my talk. The focus ended up being mainly our basis-function independent, contraction-based self-consistent field scheme, which sits at molsturm's heart, bringing together integral libraries for various basis function types and Post-HF routines. I furthermore briefly hinted at molsturm's lightweight structure with easy-to-use interfaces that allow combining existing integral and Post-HF codes with molsturm. The driving force here is to avoid implementing all quantum-chemical methods from scratch for each new basis function type and rather use the codes which already exist. As I outlined in my talk, a basis-function independent SCF code is a key step to achieve this.

I am thankful for all the questions and comments I received during the talk and the conference so far. As per usual the slides are attached below.

Link Licence
Slides molsturm: Towards quantum-chemical method development for arbitrary basis functions Creative Commons License

by Michael F. Herbst at 28. August 2018 16:00:00

08. August 2018

Polarizable Embedding Combined with the Algebraic Diagrammatic Construction: Tackling Excited States in Biomolecular Systems

Yesterday the Journal of Chemical Theory and Computation accepted our manuscript titled Polarizable Embedding Combined with the Algebraic Diagrammatic Construction: Tackling Excited States in Biomolecular Systems. In this work we describe the combination of the polarisable embedding (PE) scheme for including environmental effects with the algebraic-diagrammatic construction (ADC) approach.

The main idea of polarisable embedding is to model the effect an environment has on a core system by modelling the classical electrostatic interactions between the two. For this the potential generated by the electron density of the environment is expanded in a multipole series, which is typically truncated at the level of dipoles. Furthermore, the effect of the core on the environment itself is taken into account by allowing for induced dipoles in the environment next to the aforementioned static dipoles of the multipole expansion. Instead of modelling the full details, the environment is thus reduced to a set of static dipole moments and polarisabilities, which are located at predefined sites, typically the positions of the nuclei. During the SCF procedure of the core region the dipoles induced by the core are optimised self-consistently along with the SCF, such that after the SCF procedure a consistent electrostatic potential of the environment is included inside the core density.

The reduction of the environment to a set of dipoles and polarisabilities has the advantage that these parameters can be determined a priori based on a comparatively cheap computational method and a standardised fitting procedure (e.g. LoProp). This implies that one can reduce the size of the system which needs to be modelled by a more expensive method. One example would be the algebraic-diagrammatic construction scheme for the polarisation propagator for computing so-called excited states, which we connect with polarisable embedding in this work. In order to achieve this properly one has to do a little more than self-consistently finding the electrostatic embedding potential. Namely, one has to keep in mind that an electronic excitation changes the electronic structure of the core, which changes the polarisation of the embedding and thus again the electronic structure of the core. Taking this effect fully into account would require another self-consistent iteration back and forth between ADC and the PE scheme. One can avoid this by treating the effect only approximately via linear response as well as perturbative corrections on top of an ADC calculation.

Apart from describing these corrections and their connections to physical quantities, we also hint at another important aspect of the simulation procedure, namely finding sensible arrangements for the environment atoms. Since the environment often consists of a solvent or a protein, it tends to be much more flexible compared to the core system. For this reason one cannot merely pick a single random molecular arrangement for the environment, but one needs to sample and perform the full calculation (that is ADC on the core) multiple times in order to gain some statistical results, which are a meaningful model for the exact physical system.

Last but not least we demonstrate the applicability of PE-ADC using three examples, namely a water cluster environment around para-nitroaniline, lumiflavin in bulk water and lumiflavin inside a dodecin protein. For each of the cases a detailed analysis of the results and comparison with existing methods is presented including statistical analysis of the sampling employing box plots.

The full abstract of the paper reads as follows:

We present a variant of the algebraic diagrammatic construction (ADC) scheme by combining ADC with the polarizable embedding (PE) model. The presented PE-ADC method is implemented through second and third order and is designed with the aim of performing accurate calculations of excited states in large molecular systems. Accuracy and large-scale applicability are demonstrated with three case studies, and we further analyze the importance of both state-specific and linear-response-type corrections to the excitation energies in the presence of the polarizable environment. We demonstrate how our combined method can be readily applied to study photo-induced biochemical processes as we model the charge-transfer (CT) excitation which is key to the photoprotection mechanism in the dodecin protein with PE-ADC(2). Through direct access to state-of-the-art excited state analysis, we find that the polarizable environment plays a decisive role by significantly increasing the CT character of the electronic excitation in dodecin. PE-ADC is thus suited to decipher photoinduced processes in complex, biomolecular systems at high precision and at reasonable computational cost.

by Michael F. Herbst at 08. August 2018 21:00:00

14. July 2018


Spiral Galaxies, Stars, Wonder – A Report on our Astronomy Tour

Last week we made our way to the Königstuhl in Heidelberg. Besides the Haus der Astronomie and the Max Planck Institute for Astronomy, the Landessternwarte Königstuhl up there is part of the largest university institution for astronomical research and teaching in Germany.

The first part of our astro tour took us to a telescope more than a hundred years old, in one of the domes of the Landessternwarte, which opened in 1898. After we had turned the telescope towards the sun by hand, the pyromaniacs among us were allowed to experiment with paper and light focused through lenses.

In the full-dome projection auditorium of the HdA we travelled virtually through the universe with the help of the digital planetarium system, paid Mark Watney a visit on Mars and had all our questions about astronomy and research answered.

The last part of our tour led us to the Max Planck Institute for Astronomy. There we could marvel at the »Sternenraum« ("star room") – a 3D model of a piece of the universe built by two master's students. Represented by more than 100 LEDs, stars and galaxies hang from fine wires in the room. To conclude our tour, we got to admire a more modern 70 cm reflector telescope in the east dome of the MPIA, as a comparison to the telescope of the Landessternwarte.

Our thanks for this exciting afternoon go to Stefan, our tour guide, and to the press and public relations teams of the HdA and the MPIA.

Astrotour impressions

by flederrattie at 14. July 2018 00:00:00

24. June 2018

Coulomb Sturmians in electronic structure theory (Poster at 16th ICQC)

Last week from 18th till 23rd June I attended the 16th International Congress of Quantum Chemistry (ICQC) in Menton, France. Apart from the absolutely stunning scenery at the French Riviera as well as the historic conference building, the conference was a great opportunity to meet up and discuss with people from the community. Often the discussions continued in one of the many bars in the old town or right next to the beach. Even though the program was pretty packed, I fortunately still had the chance to walk around and enjoy the Mediterranean atmosphere in Menton as well as the sun and the sea.

During the conference I presented a poster titled Coulomb Sturmians in electronic structure theory, where I discussed my recent convergence results on Coulomb Sturmians, just recently published in my PhD thesis. The poster pdf is attached below.

Link Licence
Poster Coulomb Sturmians in electronic structure theory Creative Commons License

by Michael F. Herbst at 24. June 2018 22:00:00

16. June 2018


Astronomy Tour on the Königstuhl

Dear RaumZeitLabor members, hobby astronomers and occasional stargazers!

You are interested in cosmology and have always wanted to take a spin in the spiral galaxy M51? Or at least in a building modelled after it?

No problem – on Monday, 9 July you will get the chance! Starting at 4 pm we will devote ourselves entirely to the fascination of astronomy for two hours on the Königstuhl in Heidelberg.

We will visit the spiral-shaped Haus der Astronomie, the Max Planck Institute for Astronomy and possibly also the Landessternwarte. In the full-dome projection auditorium we will travel to the edge of the universe, we will learn about the latest research activities, and we will take a close look at larger telescopes as well as models.

This excursion is free of charge and, of course, open to not-yet members of the RaumZeitLabor. If you want to join, send me an electronic shooting star with your name and the names of your companions by 25 June. Please note: the tour can only take place if at least 12 space-research enthusiasts sign up!

Ex astris, scientia


by flederrattie at 16. June 2018 00:00:00

13. June 2018

Atsutane's Blog

Postfix: Reducing Incoming Spam Mails

The SMTP server Postfix offers very extensive options for reducing the number of spam mails that make it to a user's account. This article deals with the scenario of using a dedicated e-mail address for each service on the internet and, after a leak, having Postfix discard e-mails sent to such an address right away.

With the setting recipient_delimiter = - in /etc/postfix/ it becomes possible to use e-mail addresses following the scheme user-foobar@domain.tld. E-mails sent to such an address end up in the mailbox of user; the -foobar part of the scheme is ignored at this point. The e-mail itself, however, keeps it in its recipient field. With most e-mail clients this makes it quite easy to filter incoming e-mails belonging to the different accounts one has on the net. The e-mails can automatically be moved to separate folders in the mailbox or deleted directly.

Suppose you register with the online shop https://alles-viel-zu-teuer.tld and use the e-mail address user-vielzuteuer@domain.tld. Filter rules in the e-mail client can then move e-mails sent to this address, such as order confirmations, directly into a separate folder. Should this e-mail address one day no longer be known solely to the company behind the online shop, it surely won't take long until the first spam mails arrive.

To counter this problem, you change the account's e-mail address in the online shop, for example to user-vielzuteuernachdemleak@domain.tld. Afterwards, e-mails sent to the original address can be discarded immediately at the server level. To do so, create a text file /etc/postfix/reject_rules and add a rule like this:

/^\s*To:.*user-vielzuteuer@/ DISCARD

DISCARD specifies that incoming e-mails sent to Max Mustermann user-vielzuteuer@domain.tld, to user-vielzuteuer@domain.tld or with that address in a list of recipients are accepted but discarded right away. The sending server "believes" the e-mail was delivered and removes it from its list of e-mails to send. Alternatively you can use the action REJECT instead of DISCARD: the e-mail is rejected and a corresponding message is sent to the e-mail address the spam e-mail (allegedly) came from.

Once the rule is in place, you can check in the shell with postmap -q "To: Foo Bar <user-vielzuteuer@domain.tld>" regexp:/etc/postfix/reject_rules whether the regular expression matches this header field and the rule works.

Next, add this line in /etc/postfix/:

header_checks = regexp:/etc/postfix/reject_rules

Finally, restart the Postfix daemon or reload the configuration directly with the command postfix reload.

by Thorsten Töpper at 13. June 2018 18:42:00

03. June 2018

Mero’s Blog

Why doesn't Go have variance in its type system?

tl;dr: I explain what co-, contra- and invariance are and what the implications for Go's type system would be. In particular, why it's impossible to have variance in slices.

A question that comes up relatively often with Go newcomers is "why can't I pass e.g. an []int to a func([]interface{})"? In this post I want to explore this question and its implications for Go. But the concept of variance (which this is about) is also useful in other languages.

Variance describes what happens to subtype relationships, when they are used in composite types. In this context, "A is a subtype of B" means that a value of type A can always be used, where a value of type B is required. Go doesn't have explicit subtype relationships - the closest it has is assignability which mostly determines whether types can be used interchangeably. Probably the most important case of this is given by interfaces: If a type T (whether it's a concrete type, or itself an interface) implements an interface I, then T can be viewed as a subtype of I. In that sense, *bytes.Buffer is a subtype of io.ReadWriter, which is a subtype of io.Reader. And every type is a subtype of interface{}.

The easiest way to understand what variance means is to look at function types. Let's assume we have a type and a subtype - for example, let's look at *bytes.Buffer as a subtype of io.Reader. Say we have a func() *bytes.Buffer. We could also use this like a func() io.Reader - we just reinterpret the return value as an io.Reader. The reverse is not true: We can't treat a func() io.Reader as a func() *bytes.Buffer, because not every io.Reader is a *bytes.Buffer. So, function return values could preserve the direction of subtyping relationships: If A is a subtype of B, func() A could be a subtype of func() B. This is called covariance.

func F() io.Reader {
    return new(bytes.Buffer)
}

func G() *bytes.Buffer {
    return new(bytes.Buffer)
}

func Use(f func() io.Reader) {
    // …
}

func main() {
    Use(F) // Works

    Use(G) // Doesn't work right now; but *could* be made equivalent to…
    Use(func() io.Reader { return G() })
}

On the other hand, say we have a func(*bytes.Buffer). Now we can't use that as a func(io.Reader): You can't call it with an io.Reader. But we can do the reverse. If we have a *bytes.Buffer, we can call a func(io.Reader) with it. Thus, function arguments reverse the subtype relationship: If A is a subtype of B, then func(B) could be a subtype of func(A). This is called contravariance.

func F(r io.Reader) {
    // …
}

func G(r *bytes.Buffer) {
    // …
}

func Use(f func(*bytes.Buffer)) {
    b := new(bytes.Buffer)
    f(b)
}

func main() {
    Use(F) // Doesn't work right now; but *could* be made equivalent to…
    Use(func(r *bytes.Buffer) { F(r) })

    Use(G) // Works
}

So, func is contravariant for arguments and covariant for return values. Of course, we can combine the two: If A and C are subtypes of B and D respectively, we can make func(B) C a subtype of func(A) D, by converting like this:

// *os.PathError implements error

func F(r io.Reader) *os.PathError {
    // ...
}

func Use(f func(*bytes.Buffer) error) {
    b := new(bytes.Buffer)
    err := f(b)
    // ...
}

func main() {
    Use(F) // Could be made to be equivalent to
    Use(func(r *bytes.Buffer) error { return F(r) })
}

However, func(A) C and func(B) D are incompatible. Neither can be a subtype of the other:

func F(r *bytes.Buffer) *os.PathError {
    // ...
}

func UseF(f func(io.Reader) error) {
    b := strings.NewReader("foobar")
    err := f(b)
    // ...
}

func G(r io.Reader) error {
    // ...
}

func UseG(f func(*bytes.Buffer) *os.PathError) {
    b := new(bytes.Buffer)
    err := f(b)
    // ...
}

func main() {
    UseF(F) // Can't work, because:
    UseF(func(r io.Reader) error {
        return F(r) // type-error: io.Reader is not *bytes.Buffer
    })

    UseG(G) // Can't work, because:
    UseG(func(r *bytes.Buffer) *os.PathError {
        return G(r) // type-error: error is not *os.PathError
    })
}

So in this case, there just is no relationship between the composite types. This is called invariance.

Now, we can get back to our opening question: Why can't you use []int as []interface{}? This really is the question "why are slice-types invariant?". The questioner assumes that because int is a subtype of interface{}, we should also make []int a subtype of []interface{}. However, we can now see a simple problem with that. Slices support (among other things) two fundamental operations that we can roughly translate into function calls:

as := make([]A, 10)
a := as[0]      // func Get(as []A, i int) A
as[1] = a       // func Set(as []A, i int, a A)

This shows a clear problem: The type A appears both as an argument and as a return type. So it appears both covariantly and contravariantly. So while with functions there is a relatively clear-cut answer to how variance might work, it just doesn't make a lot of sense for slices. Reading from it would require covariance but writing to it would require contravariance. In other words: If you'd make []int a subtype of []interface{} you'd need to explain how this code would work:

func G() {
    v := []int{1, 2, 3}
    F(v)
}

func F(v []interface{}) {
    // string is a subtype of interface{}, so this should be valid
    v[0] = "Oops"
}

Channels give another interesting perspective here. The bidirectional channel type has the same issue as slices: Receiving requires covariance, whereas sending requires contravariance. But you can restrict the directionality of a channel and only allow send- or receive-operations respectively. So while chan A and chan B would not be related, we could make <-chan A a subtype of <-chan B. And chan<- B a subtype of chan<- A.
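The direction restriction itself is ordinary Go and can be demonstrated directly; only the variance across element types is hypothetical:

```go
package main

import "fmt"

// Restricting a channel's direction is allowed today. What Go does
// *not* allow is changing the element type, even to an interface the
// element implements - that would be the hypothetical variance.
func main() {
    c := make(chan int, 1)
    var send chan<- int = c // ok: chan int is assignable to chan<- int
    var recv <-chan int = c // ok: and to <-chan int
    send <- 42
    fmt.Println(<-recv) // 42

    // var r2 <-chan interface{} = c // compile error: no variance across element types
}
```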

In that sense, read-only types have the potential to at least theoretically allow variance for slices. While []int still wouldn't be a subtype of []interface{}, we could make ro []int a subtype of ro []interface{} (borrowing the syntax from the proposal).

Lastly, I want to emphasize that all of these are just the theoretical issues with adding variance to Go's type system. I consider them harder, but even if we could solve them we would still run into practical issues. The most pressing of which is that subtypes have different memory representations:

var (
    // super pseudo-code to illustrate
    x *bytes.Buffer // unsafe.Pointer
    y io.ReadWriter // struct{ itable *itab; value unsafe.Pointer }
                    // where itable has two entries
    z io.Reader     // struct{ itable *itab; value unsafe.Pointer }
                    // where itable has one entry
)

So even though you might think that all interfaces have the same memory representation, they actually don't, because the method tables have a different assumed layout. So in code like this

func Do(f func() io.Reader) {
    r := f()
    // …
}

func F() io.Reader {
    return new(bytes.Buffer)
}

func G() io.ReadWriter {
    return new(bytes.Buffer)
}

func H() *bytes.Buffer {
    return new(bytes.Buffer)
}

func main() {
    // All of F, G, H should be subtypes of func() io.Reader
}
there still needs to be a place where the return value of H is wrapped into an io.Reader, and a place where the itable of the return value of G is transformed into the format expected for an io.Reader. This isn't a huge problem for func: the compiler can generate the appropriate wrappers at the call site in main. There is a performance overhead, but only code that actually uses this form of subtyping needs to pay it. It becomes a significant problem for slices, however.

For slices, we must either a) convert the []int into an []interface{} when passing it, meaning an allocation and complete copy. Or b) delay the conversion between int and interface{} until the access, which would mean that every slice access now has to go through an indirect function call - just in case anyone would ever pass us a subtype of what we are expecting. Both options seem prohibitively expensive for Go's goals.

03. June 2018 23:20:00


12. May 2018

Atsutane's Blog

A fresh start with Pelican

Things have been rather quiet here in recent years. That had various reasons, among them that I never really warmed to octopress, which I used before. Ruby is not a language I use often, and since octopress regularly had problems with incompatible gems after updates on my systems, at some point I simply lost interest.

Now for a fresh start, this time with Pelican, which builds on Python, a language I use far more often. If incompatibilities come up here, I will hopefully have to invest much less time. Apart from that, with the default theme it was also quite easy to remove calls to external HTTP servers.

by Thorsten Töpper at 12. May 2018 19:00:00

07. May 2018

sECuREs Webseite

Installing Dell’s Ubuntu image on an XPS 9360

Warning: if you don’t understand one of the steps, don’t blindly follow them, but ask a friend for help instead! Be sure to have a known-working backup before messing with your system.


I recently got a Dell XPS 13 9360 (2017), which of course I would like to use with Linux. I figured I’d give Dell’s Ubuntu version a try, as it is the closest I can get to a supported Linux offering on a modern laptop.

Unfortunately, Dell doesn’t sell the XPS 9360 in its shop anymore (and I don’t like its successor due to the lack of USB A ports), so I had to resort to buying a version that comes with Windows.

You can obtain the recovery image dell-bto-xenial-dino2-mlk-A00-iso-20161021-0.iso from Dell, provided you have the service tag of an XPS 9360 that came with Linux — ask a friend if you accidentally purchased the Windows version. Tip: the service tag is the BIOS serial number, which is included in the output of lshw(1) and similar tools.

Making the image bootable

I don’t understand why this image is not bootable by default. The device it was generated for never had a CD/DVD drive, so what good is using an ISO 9660 image?

Anyway, to make the image bootable, I formatted a USB thumb drive with a FAT file system and installed GRUB in such a way that it will loopback-boot into the ISO image (this is option 2 of the guide I followed):

grub-mkimage -o bootx64.efi -p /efi/boot -O x86_64-efi \
 fat iso9660 part_gpt part_msdos \
 normal boot linux configfile loopback chain \
 efifwsetup efi_gop efi_uga \
 ls search search_label search_fs_uuid search_fs_file \
 gfxterm gfxterm_background gfxterm_menu test all_video loadenv \
 exfat ext2 ntfs btrfs hfsplus udf
cat >grub.cfg <<'EOT'
set timeout=3
set color_highlight=black/light-magenta

menuentry 'Ubuntu ISO' {
        set isofile="/efi/boot/dell-bto-xenial-dino2-mlk-A00-iso-20161021-0.iso"
        loopback loop $isofile
        linux (loop)/casper/vmlinuz.efi boot=casper iso-scan/filename=$isofile noprompt noeject quiet splash persistent --
        initrd (loop)/casper/initrd.lz
}
EOT

sudo mkfs.vfat /dev/sdc
pmount /dev/sdc
mkdir -p /media/sdc/efi/boot
cp bootx64.efi *.iso grub.cfg /media/sdc/efi/boot
pumount sdc

Making the installer work

To get the installer to work, I had to comment out the self.genuine checks in /usr/lib/ubiquity/plugins/dell-{bootstrap,recovery}.py, then start ubiquity from the terminal.

Switch to LUKS full-disk encryption

Unfortunately, the recovery installation only offers you to encrypt your homedir, whereas I would like to encrypt my entire disk. Here are the steps I took to enable LUKS full-disk encryption.

First, I copied the root file system’s data to my backup storage. Pay attention to copying the files as root, otherwise setuid and setgid bits might get lost.

# mount /dev/nvme0n1p3 /mnt/root
# rsync -aXx --relative --numeric-ids /mnt/root/ root@storage:xps/
# umount /mnt/root

Then, I created a new boot partition, encrypted the root partition and re-created the file system:

# cat <<'EOT' | sfdisk /dev/nvme0n1
label: gpt
label-id: C64A87D1-CA61-4BF2-81E6-0216EE6BC4C0
device: /dev/nvme0n1
unit: sectors
first-lba: 34
last-lba: 2000409230

/dev/nvme0n1p1 : start=          34, size=      488248, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, name="EFI System Partition"
/dev/nvme0n1p2 : start=      488282, size=     5380681, type=EBD0A0A2-B9E5-4433-87C0-68B6B72699C7, name="fat32"
/dev/nvme0n1p3 : start=     5869568, size=     2097152, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
/dev/nvme0n1p4 : start=     7966720, size=  1992442511, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
# mkfs.ext2 /dev/nvme0n1p3
# cryptsetup luksFormat /dev/nvme0n1p4
# cryptsetup luksOpen /dev/nvme0n1p4 nvme0n1p4_crypt
# mkfs.ext4 /dev/mapper/nvme0n1p4_crypt

Next, I restored the data:

# mount /dev/mapper/nvme0n1p4_crypt /mnt/root
# mount /dev/nvme0n1p3 /mnt/root/boot
# mkdir /mnt/root/boot/efi
# mount /dev/nvme0n1p1 /mnt/root/boot/efi
# rsync -aXx --numeric-ids storage:xps/ /mnt/root

And finally, I fixed the boot partition:

# mount -o bind /dev /mnt/root/dev
# mount -t sysfs sysfs /mnt/root/sys
# mount -t proc proc /mnt/root/proc
# chroot /mnt/root /bin/bash

# apt install cryptsetup
# echo nvme0n1p4_crypt UUID=$(blkid -o value -s UUID /dev/nvme0n1p4) none luks,discard > /etc/crypttab
# cat >/etc/fstab <<EOT
UUID=$(blkid -o value -s UUID /dev/mapper/nvme0n1p4_crypt) / ext4 errors=remount-ro 0 1
UUID=$(blkid -o value -s UUID /dev/nvme0n1p3) /boot ext2 defaults 0 0
UUID=$(blkid -o value -s UUID /dev/nvme0n1p1) /boot/efi vfat umask=0077 0 1
# update-initramfs -u
# update-grub
# grub-install

# umount /mnt/root/{proc,sys,dev,boot/efi,boot,}

07. May 2018 07:10:00

01. May 2018


Report from the Technoseum Mannheim

We are back from our tour through the Technoseum.

Under the motto "Man and Machine", we took a trip through the history of industrialisation. At various stations we could take a closer look at the transition from hand work to machine work and try out different machine drives and controls ourselves. We were particularly captivated by the spring-pole lathe (Wippendrehbank) – perhaps we have found the first DIY project for our new workshop?

After jointly assembling a toy car with the two-arm robot YuMi, we got to watch the Technoseum mascot Paul give his dance performance.

We spontaneously used the time after the guided tour to pay a visit to the special exhibition "Entscheiden" ("Deciding").

Technoseum impressions

by flederrattie at 01. May 2018 00:00:00


17. April 2018



Blog of Jakob “koebi” Schnell. Everyday things about music, computers, maths and whatever else happens. Bilingual English/German. Possibly even consistent at some point. Built with hugo, using hugo-Xmin

17. April 2018 16:58:07

sECuREs Webseite

kinX: USB Hub

This post is part of a series of posts about the kinX project.


The Kinesis Advantage comes with a built-in 2-port USB hub. That hub uses a proprietary connector to interface with a PS/2 keyboard controller, so it cannot be used with a USB keyboard controller. As the built-in hub is the natural place to connect the Logitech unified receiver dongle, not being able to use the hub is mildly annoying.

The kinX MK66F keyboard controller presently needs two USB cables: one connected to the USBFS port to supply the PCB with power and receive firmware updates (via the Teensy bootloader chip), and one connected to the USBHS port for the actual keyboard device.

Lastly, even if the original built-in USB hub had internal ports (instead of a PS/2 converter), it only supports USB 1.1, nullifying any latency improvements.

Hence, I decided to build a drop-in replacement USB 2.0 hub with 2 external USB ports and 2 internal USB ports, using the same proprietary connector as the original, so that the original keyboard USB cable could be re-used.

Design phase

Unfortunately, I could not find an open hardware USB 2.0 hub design on the internet, so I started researching various USB hub chips. I quickly discarded the idea of using USB 3 due to its much stricter requirements.

In the end, I decided to go with the Cypress HX2VL series because of their superior documentation: I found a detailed data sheet, an evaluation board, the associated schematics, design checklist/guidelines, and even the evaluation board’s bill of materials.

This is what the finished build of my design looks like:


After completing my first build, I tested a few USB devices with my hub. The Logitech unified receiver dongle and the YubiKey worked fine. However, my external hard drive and my USB memory stick did not work. In the syslog, I would see:

kernel: usb 1-14.4.4: rejected 1 configuration due to insufficient available bus power

This is because the USB specification limits bus-powered hubs to 100mA per port. While high power usage does not come as a surprise for the external hard disk, it turns out that even my USB memory stick requires 200mA. This was a surprise, because that stick works on other, commercial bus-powered USB hubs.

A closer look reveals that all 3 commercial USB hubs I have tested claim to be self-powered (i.e. using an external power supply), even though they are not. This way, the kernel’s power limitation is circumvented, and up to 500mA can be used per port. In practice, the host port only supplies 500mA, so the user must be careful not to plug in devices which require more than 500mA in aggregate.

I changed the SELFPWR configuration pin to have my hub claim it was self-powered, too, and that made all USB devices I tested work fine.

EEPROM programming

When debugging the power issue, I originally thought the Maximum Power setting in the hub’s USB device descriptor needed to be raised. This turned out to not be correct: the Maximum Power refers to the power which the hub uses for its own circuitry, not the power it passes through to connected devices.

Nevertheless, it’s a nice touch to modify the device descriptor to put in a custom vendor name, product name and serial number: that way, the device shows up with a recognizable name in your syslog or lsusb(8) output, and udev rules can be used to apply settings based on the serial number.
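As an illustration, such a udev rule might look like the following — the serial string and the chosen setting are made up for the example:

```
# /etc/udev/rules.d/99-kinx-hub.rules
# "kinX-hub-0001" is a placeholder serial number.
SUBSYSTEM=="usb", ATTRS{serial}=="kinX-hub-0001", ATTR{power/control}="on"
```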

To modify the device descriptor, an EEPROM (electrically erasable programmable read-only memory) needs to be added to the design, from which the HX2VL will read configuration.

The HX2VL allows field-programming of the connected EEPROM, i.e. writing to it via the USB hub chip. I found the Windows-only tool hard to set up on a modern Windows installation, so I wondered whether I could build an easier-to-use tool.

Under the covers, the tool merely sends commands with the vendor-specific request code 14 via USB, specifying an index of the two-byte word to read/write. This can be replicated in a few lines of Go:

dev, _ := usb.OpenDeviceWithVIDPID(0x04b4, 0x6570)
eepromRequest := 14
wIndex := 0 // [0, 63] for 128 bytes of EEPROM
// (error handling elided) read one two-byte word from the EEPROM:
dev.Control(usb.ControlIn|usb.ControlVendor,
  uint8(eepromRequest), 0, uint16(wIndex), make([]byte, 2))

The EEPROM contents are well-described in the HX2VL data sheet, so the rest is easy.

See for the tool.

Lessons learnt

  • If possible, design the PCB in such a way that components you think you don’t need (e.g. the EEPROM) can optionally be soldered on. This would have saved me a PCB design/fabrication cycle.

  • Get the evaluation board to figure out the configuration you need (e.g. self-powered vs. bus-powered).

Next up

The last post introduces the processing latency measurement firmware for the FRDM-K66F development board and draws a conclusion.

17. April 2018 15:49:00

kinX: latency measurement

This post is part of a series of posts about the kinX project.

Latency measurement

End-to-end latency consists of 3 parts:

  1. input latency (keyboard)
  2. processing latency (computer)
  3. output latency (monitor)

During the development of the kinX keyboard controller, I realized that measuring processing latency was quite simple with my hardware: I could start a timer when sending a key press HID report to the computer and measure the elapsed time when I would receive a reply from the computer.

The key to send is the Caps Lock key, because unlike other keys it results in a reply: a HID report telling the keyboard to turn the Caps Lock LED on.

Measurement device

To make this measurement technique accessible to as many people as possible, I decided to pull it out of my kinX keyboard controller and instead build it using the FRDM-K66F evaluation board, which uses the same microcontroller.

The FRDM-K66F can be bought for about 60 USD at big electronics shops, e.g. Digi-Key.

Find the firmware at


To determine the lowest processing latency one can possibly get for userspace applications on a Linux system, I wrote a small program which uses Linux’s evdev API to receive key presses and react to a Caps Lock keypress as quickly as it can by turning the Caps Lock LED on.

Find the program at

The following layers are exercised when measuring this program:

  • USB host controller
  • Linux kernel (USB and input subsystems)
  • input event API (evdev)

Notably, graphical interfaces such as X11 or Wayland are excluded.

The measurements can be verified using Wireshark’s usbmon capturing, which provides a view of the USB bus from the computer’s perspective, excluding USB poll latency and USB transaction time.

Using Ubuntu 17.10, I measured a processing latency of 152 μs on average.


Now let’s see whether my current editor of choice adds significant latency.

Using a few lines of Emacs Lisp, I instructed Emacs to turn on the Caps Lock LED whenever a key is inserted into the current buffer. In combination with remapping the Caps Lock key to any other key (e.g. “a”), this allows us to measure Emacs’s processing latency.

On the same Ubuntu 17.10 installation used above, Emacs 25.2.2 adds on average 278 μs to the baseline processing latency.

Find the code at

End-to-end latency

With the kinX keyboard controller, we can achieve the following end-to-end latency:

contributor latency
Matrix scan ≈ 100 μs
USB poll ≈ 125 μs
Linux ≈ 152 μs
Emacs ≈ 278 μs

This sums up to ≈ 655 μs on average. On top of that, we have output latency within [0, 16ms] due to the 60 Hz refresh rate of our monitors.

Note that using a compositor adds one frame of output latency.

Input latency perception

A natural question to ask is how well humans can perceive input latency. After all, keyboards have been manufactured for many years, and if input latency was really that important, surely manufacturers would have picked up on this fact by now?

I ran a little unscientific experiment in the hope to further my understanding of this question at the most recent Chaos Communication Congress in Leipzig.

In the experiment, I let 17 people play a game on a specially prepared keyboard. In each round, the game reconfigures the keyboard to either have additional input latency or not, decided at random. The player can then type a few keys and make a decision. If the player can correctly indicate whether additional input latency was present in more than 50% of the cases, the player is said to be able to distinguish latency at that level. On each level, the game decreases the additional input latency: it starts with 100ms, then 75ms, then 50ms, etc.

The most sensitive player reliably recognized an additional 15ms of input latency.

Some players could not distinguish 75ms of additional input latency.

Every player could distinguish 100ms of additional input latency.

My take-away is that many people cannot perceive slight differences in input latency at all, explaining why keyboard manufacturers don’t optimize for low latency.

Reducing input latency still seems worthwhile to me: even if the reduction happens under the threshold at which you can perceive differences in input latency, it buys you more leeway in the entire stack. In other words, you might now be able to turn on an expensive editor feature which previously slowed down typing too much.


When I started looking into input latency, my keyboard had dozens of milliseconds of latency. I found an easy win in the old firmware, then hit a wall, started the kinX project and eventually ended up with a keyboard with just 0.225ms input latency.

Even if I had not reduced the input latency of my keyboard at all, I feel that this project was a valuable learning experience: I now know a lot more about PCB design, ARM microcontrollers, USB, HID, etc.

Typing on a self-built keyboard feels good: be it because of the warm fuzzy feeling of enjoying the fruits of your labor, or whether the input latency indeed is lower, I’m happy with the result either way.

Lastly, I can now decidedly claim that the processing latency of modern computers is perfectly fine (remember our 152 μs + 278 μs measurement for Linux + Emacs), and as long as you pick decent peripherals, your end-to-end latency will be fine, too.

What’s next?

By far the biggest factor in the end-to-end latency is the monitor’s refresh rate, so getting a monitor with a high refresh rate and no additional processing latency would be the next step in reducing the end-to-end latency.

As far as the keyboard goes, the matrix scan could be eliminated by wiring up each individual key to a microcontroller with enough GPIO pins. The USB poll delay could be eliminated by switching to USB 3, but I don’t know of any microcontrollers which have USB 3 built-in yet. Both of these improvements are likely not worth the effort.

17. April 2018 15:49:00

kinX: keyboard controller with <0.225ms input latency

This post is part of a series of posts about the kinX project.


10 years ago I got a Kinesis Advantage keyboard. I wrote about the experience of learning to touch-type using the ergonomic NEO layout in my (German) post “Neo-Layout auf einer Kinesis-Tastatur”.

The Kinesis Advantage is still the best keyboard I’ve ever used, and I use one every day, both at the office and at home.

I had two reasons to start modifying the keyboard:

  1. I prefer Cherry MX blue key switches over the Cherry MX brown key switches the Kinesis comes with. Nowadays, you can get a Kinesis with Cherry MX red key switches, which felt okay in a quick test.

  2. The original keyboard controller has (had?) a bug where modifier keys such as Shift would get stuck at least once a week: you would press Shift, press A, release A, release Shift, press A and see AA instead of Aa.

I solved issue ① with the help of the excellent Kinesis technical support, who sold me unpopulated PCBs so that I could solder on my own key switches.

Issue ② was what led to my first own keyboard controller build, which I documented in “Hacking your own Kinesis keyboard controller” (2013).

Then, the topic of input latency popped into my filter bubble, with excellent posts such as Pavel Fatin’s “Typing with pleasure”. I started wondering what input latency I was facing, and whether/how I could reduce it.

Given that I was using a custom keyboard controller, it was up to me to answer that question. After trying to understand and modify the firmware I had been using for the last 4 years, I realized that I first needed to learn much more about how keyboards work.

I firmly believe that creating a good environment is vital for development, especially for intrinsically-motivated side projects like this one. Hence, I set the project aside until a colleague gifted me his old Kinesis which had intermittent issues. I removed the electronics and started using that keyboard as my development keyboard.

Sources of input latency

A keyboard controller has 3 major tasks:

  • matrix scan: to avoid physically connecting every single key switch directly to a microcontroller (requiring a large number of GPIO pins), most keyboards use a matrix. See “How to make a keyboard — the matrix” for a good explanation.

  • debouncing: when pressing a key switch, it doesn’t cleanly change from a low voltage level to a high voltage level (or vice-versa). Instead, it bounces: the voltage level rapidly oscillates until it eventually reaches a stable steady state. Because one key press shouldn’t result in a whole bunch of characters, keyboard controllers need to debounce the key press.

  • USB: nowadays, keyboards use USB (for example to be compatible with laptops, which generally don’t have PS/2 ports), so the keyboard’s state needs to be communicated to the computer via USB.

Here’s an illustration of the timing of a key press being handled by a naive keyboard controller implementation:

In the worst case, a key press happens just after a keyboard matrix scan. The first source of latency is the time it takes until the next keyboard matrix scan happens.

Depending on the implementation, the key press now sits in a data structure, waiting for the debounce time to pass.

Finally, once the key press was successfully debounced, the device must wait until the USB host polls it before it can send the HID report.

Unless the matrix scan interval is coupled to the USB poll interval, the delays are additive, and the debounce time is usually constant: in the best case, a key press happens just before a matrix scan (0ms) and gets debounced (say, 5ms) just before a USB poll (0ms).

Teensy 3.6 controller (for learning)

My old keyboard controller used the Teensy++, which is fairly dated at this point. I decided a good start of the project would be to upgrade to the current Teensy 3.6, cleaning up the schematics on the way.

To ensure I understand all involved parts, I implemented a bare-metal firmware almost from scratch: I cobbled together the required startup code, USB stack and, most importantly, key matrix scanning code.

In my firmware, the Teensy 3.6 runs at 180 MHz (compared to the Teensy++’s 16 MHz) and scans the keyboard matrix in a busy loop (as opposed to on USB poll). Measurements confirmed a matrix scan time of only 100μs (0.1ms).

I implemented debouncing the way it is described in Yin Zhong’s “Keyboard Matrix Scanning and Debouncing”: by registering a key press/release on the rising/falling edge and applying the debounce time afterwards, effectively eliminating debounce latency.

Note that while the Cherry MX datasheet specifies a debounce time of 5ms, I found it necessary to increase the time to 10ms to prevent bouncing in some of my key switches, which are already a few years old.

I set the USB device descriptor’s poll interval to 1, meaning poll every frame, which is 1ms long with USB 1.x (Full Speed).

This leaves us at an input latency within [0ms, 1.1ms]:

  • ≤ 0.1ms scan latency
  • 0ms debounce latency
  • ≤ 1ms USB poll latency

Can we reduce the input latency even further? The biggest factor is the USB poll interval.

USB High Speed

With USB 2.0 High Speed, polling can happen every microframe, which is 125μs (0.125ms) long. The NXP MK66F microcontroller in the Teensy 3.6 has two USB ports:

  1. the Full Speed-only USBFS port, which is used by the Teensy 3.6
  2. the High Speed-capable USBHS port, which the Teensy optionally uses for host mode, with experimental software support (at the time of writing)

While the software support was a road block which could conceivably be solved, I also faced a mechanical problem: the available space in the Kinesis keyboard and the position of the USB High Speed port pins on the Teensy 3.6 unfortunately prevented installing any sort of breakout board to actually use the port.

I decided to move from the Teensy 3.6 to my own design with the same microcontroller.

MK66F keyboard controller

To make development pleasant, I connected a USB-to-serial adapter (to UART0) and a “rebootor” (to PROGHEAD): another Teensy with a special firmware to trigger programming mode. This way, I could set my editor’s compile-command to make && teensy_loader_cli -r …, compiling the code, uploading and booting into the resulting firmware with a single keyboard shortcut.

I based the firmware for this controller on NXP’s SDK examples, to ensure I get a well-tested and maintained USB stack for the USBHS port. I did some measurements to confirm the stack does not add measurable extra latency, so I did not see any value in me maintaining a custom USB stack.

The firmware can be found at

The hardware can be found at

Using USB 2.0 High Speed leaves us at an input latency within [0ms, 0.225ms]:

  • ≤ 0.1ms scan latency
  • 0ms debounce latency
  • ≤ 0.125ms USB poll latency

Lessons learnt

  • In the future, I will base custom designs on the vendor’s development board (instead of on the Teensy). This way, the vendor-provided code could be used without any modifications.

  • While the Teensy bootloader means getting started with the microcontroller just requires a USB port, using a JTAG connector for development would be more powerful: not only does it replace the combination of Teensy bootloader, serial and rebootor, but it also supports debugging with gdb.

Next up

The second post motivates and describes building a drop-in replacement USB hub for the Kinesis Advantage keyboard.

17. April 2018 15:49:00