Planet NoName e.V.

2019-10-23

michael-herbst.com

adcc: A versatile toolkit for rapid development of algebraic-diagrammatic construction methods

Over the past year or so, one of the side projects I have been working on, off and on, has been the adcc code. This python module allows one to perform excited-states calculations for simulating molecular spectra based on algebraic-diagrammatic construction (ADC) methods. The speciality of the adcc code is that (a) multiple host programs are supported for supplying a Hartree-Fock reference, right now pyscf, Psi4, molsturm and veloxchem, and (b) its hybrid C++/python design allows a large part of the ADC procedure, including the iterative solver schemes, to be implemented in python while still achieving the speed required for larger ADC calculations. The code has been developed in collaboration with several members of the group where I did my PhD, the Dreuw group in Heidelberg. Finally, today, we are publicly releasing the code, following the submission of our paper on adcc last week (download link).

I have already talked about adcc and ADC methods before (see also my research interests), so I'll keep my summary of them brief. The ADC scheme builds upon the fact that the poles of the polarisation propagator, i.e. the two-particle Green's function, contain the physical information for describing excited electronic states and their properties. Using a Møller-Plesset partitioning of the Schrödinger Hamiltonian, one may construct a perturbative expansion for said polarisation propagator. In the ADC approach this is done using a particular representation basis for the many-body states of the Schrödinger Hamiltonian, the so-called intermediate state basis. The outcome is a perturbative series of matrices, the so-called ADC matrices, corresponding to the ADC(n) series of methods. The ADC matrices contain exactly the spectral information of the polarisation propagator consistent to order n of a Møller-Plesset expansion of the correlated ground state. Diagonalising such a matrix thus yields excitation energies and excited-state properties. For more details, see for example the theory section in the adcc documentation.

As we discuss in our paper, in adcc the basic procedure of building and diagonalising the ADC matrix is implemented in a hybrid C++/python code. The aim was to keep everything, including intermediate results, accessible from the python layer, while implementing the computationally intensive parts in C++. As a result, the code is comparable in speed to C++-only implementations, such that it can be used to treat e.g. biomolecular systems. The iterative diagonalisation procedure and other aspects of the numerics are implemented completely in python, which will facilitate experimenting on the numerical level in the future. Multiple variants of ADC are available up to and including ADC(3), i.e. a treatment of the polarisation propagator consistent to third order.

Originally adcc grew out of my desire to use another ADC code developed in the Dreuw group, namely adcman, interactively from python. In its current state, however, adcc has grown well beyond being a plain wrapper around adcman. The internal code structure and workflows of the two packages have very little in common, although the general methodology both use to solve the ADC equations is certainly related. The design of adcc aims to simplify working with ADC methods as a developer, but also to facilitate the use and extension of existing workflows by everyday users. For more details and a couple of examples, see the paper, or the Performing calculations with adcc section of the adcc documentation. Installing adcc is simple. Given that a few requirements (such as openblas on Linux) are available, it boils down to using pip:

pip install pybind11
pip install adcc
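
Once installed, a complete calculation can be driven from a few lines of python. The following is a minimal sketch along the lines of the examples in the adcc documentation; the water geometry, basis set and number of states are arbitrary placeholders:

from pyscf import gto, scf
import adcc

# Hartree-Fock reference from pyscf (any of the supported host programs works)
mol = gto.M(atom="O 0 0 0; H 0 0 0.96; H 0.93 0 -0.24", basis="sto-3g")
scfres = scf.RHF(mol)
scfres.kernel()

# ADC(2) treatment of a handful of singlet excited states on top of the reference
state = adcc.adc2(scfres, n_singlets=5)
print(state.describe())  # summary table of the computed excitation energies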

The full abstract reads

ADC-connect (adcc) is a hybrid python/C++ module for performing excited state calculations based on the algebraic-diagrammatic construction scheme for the polarisation propagator (ADC). A key design goal is to restrict adcc to this single purpose and facilitate connection to external packages, e.g., for obtaining the Hartree-Fock references, plotting spectra, or modelling solvents. Interfaces to four self-consistent field codes have already been implemented, namely pyscf, psi4, molsturm, and veloxchem. The computational workflow, including the numerical solvers, is implemented in python, whereas the working equations and other expensive expressions are done in C++. This equips adcc with adequate speed, making it a flexible toolkit for both rapid development of ADC-based computational spectroscopy methods as well as unusual computational workflows. This is demonstrated by three examples. Presently, ADC methods up to third order in perturbation theory are available in adcc, including the respective core-valence separation and spin-flip variants. Both restricted and unrestricted Hartree-Fock references can be employed.

by Michael F. Herbst at 2019-10-23 21:00 under electronic structure theory, theoretical chemistry, adcc, algebraic-diagrammatic construction

2019-10-23

sECuREs website

Network Storage PC Hardware (2019)

One of my two NAS builds recently died, so I bought a new one until I find some time to debug the old one. Since a couple of people have been asking me what I would recommend nowadays based on my November 2016 article “Gigabit NAS (running CoreOS)”, I figured I would share the new hardware listing:

Price Type Article
54.00 CHF Case Silverstone SST-SG05BB-Lite (cube)
60.40 CHF Mainboard AsRock AB350 Gaming-ITX/ac (AM4, AMD B350, Mini ITX); be sure to update the UEFI to the latest version (6.00)!
62.30 CHF CPU AMD A6-9500E (2, AM4, 3GHz)
20.10 CHF Cooler Arctic Alpine AM4 Passive
42.80 CHF RAM Kingston ValueRAM (1x, 8GB, DDR4-2400, DIMM 288)
29.00 CHF Fan Noctua Nf-s12a ULN (120mm, 1x)
55.00 CHF PSU Silverstone ST30SF 300W SFX (300W)
27.50 CHF System disk Intenso High Performance (120GB, 2.5") SATA
351.10 CHF total sum

In November 2016 I paid only 225 CHF, i.e. 126 CHF less.

Why is this build so much more expensive? There are two major reasons:

The AM4 platform

The AM4 platform replaced the AM1 APU series as the cheapest broadly available AMD platform.

As you might have gathered from the links in the hardware listing above, I define “broadly available” as available at digitec, a large electronics shop in Zürich.

They offer same-day orders for pick-up at their Zürich location on weekdays and Saturdays, so it is kind of like being on a hardware support plan :-)

Unfortunately, the cheapest AM4 CPU is a lot more expensive (+ 23.31 CHF).

Also, there are (currently?) no AM4 mainboards with DC barrel power plugs, meaning more expensive ATX power supplies (+ 26.30 CHF) become necessary.

Additional components: fan and system disk

Definitely invest in the Noctua 120mm ULN (Ultra Low Noise) fan (+ 29.00 CHF). The fan that comes in the Silverstone case is pretty noisy, and that might be bothersome if you don’t have the luxury of stashing your NAS away in the basement.

In my last build, I had an SSD lying around that I used as system disk, this time I had to buy one (+ 27.50 CHF).

Note that I intentionally picked a SATA SSD over an M.2 SSD: the M.2 slot of the AB350 is on the back of the mainboard, so an M.2 SSD is harder to reach. The performance disadvantage of a SATA SSD compared to an M.2 SSD might be measurable, but irrelevant for my day-to-day usage. Quickly accessing the physical hardware is more important.

at 2019-10-23 00:00

2019-10-18

michael-herbst.com

JuliaParis: Electronic structure simulations using Julia

Yesterday we had the first instance of the Julia Paris Meetup. As the name suggests, a couple of us Julia enthusiasts in the French capital got together for an evening of talks and chatting about everyone's favourite programming language. It was a great occasion to finally meet some people in person and, after the talks, to gather for some technical chit-chat until the late hours at one of the bars of the 11th arrondissement.

I myself was pretty excited about the evening and not quite sure what kind of people to expect. Unsurprisingly, most people who turned up had an academic background with a focus on number crunching of some sort, mostly high-performance linear algebra or optimisation problems. To my big surprise, the number of people working in an industrial context was a lot larger than I expected, even though most did not use the language to a large extent in their everyday work. Still, after the evening I am yet again convinced that Julia is spreading beyond dusty desks in universities. Certainly a great part in this is played by the dedicated community of people willing to invest a Sunday evening here and there to work on just the very thing required to advance the language and the Julia ecosystem step by step. Even though this had already been clear to me to some extent before Thursday, actually meeting the community and listening to stories and experiences from the Julia trenches made the whole impression more vivid. I am already looking forward to the next meeting (details as always on https://julia-users-paris.github.io).

On that note, I was very happy to have the chance to play an active part in the evening as one of the first to present some stories of my own. Naturally I talked about my endeavours with Julia while developing the DFTK code. Since I was not too sure about the audience, I tried to get across what makes Julia an ideal language for our efforts, what we are currently working on, what our focus will be next, and how we expect Julia to help. I'm not going to repeat the details about DFTK here, because these can be found in a previous article. See also the second half of this article. The slides of my talk can be downloaded here as usual:

Link Licence
Electronic-structure simulations using Julia (Slides 1st Julia Paris Meetup) Creative Commons License

by Michael F. Herbst at 2019-10-18 22:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry

2019-10-09

Insanity Industries

Hijacking the desktop directory with /tmp

Just sharing a quick convenience idea: merging Desktop and /tmp to use it as a quick drop directory. Sounds stupid? Bear with me.

XDG user directories

XDG user directories are a freedesktop.org de-facto standard for specifying typical directories within a user’s home, such as Documents, Downloads, Pictures, Music etc. File managers typically mark these xdg-user-dirs with special folder icons, so even if you haven’t explicitly bothered with them yet, you might have noticed.

Setting one of the XDG user directories allows your system (typically impersonated by your desktop environment) to ensure these folders always exist, allows file managers to specifically mark them, and allows applications to save things there (for example, applications can look up a user-specified download folder and drop downloads there, instead of having to be reconfigured separately one by one). Also, many file managers and dialogs show you those xdg-user-dirs as quicklinks:

save dialog for webresource

You can see that Desktop as well as the other xdg-dirs I use are accessible as quicklinks on the left.

Hijacking the desktop dir

I personally do not keep files on my desktop, as most of the time I do not see my desktop anyway, so I have little use for the Desktop directory. However, as many file dialogs offer quicklinks to those directories, we can use the otherwise useless desktop dir for another purpose: putting a quick-drop directory there. In my case this is /tmp, for two reasons:

  • easily accessible from the commandline to pull/push files from/to there, just two characters and a tab to type
  • already set up as a tmpfs, so it lives in my computer’s memory and will be cleared and empty again upon next reboot

By specifying /tmp as the Desktop directory we now have it accessible as a quicklink in close to every file dialog: we can quickly drop things there from any terminal location and then access them with two clicks from every filesystem dialog (and the other way round).

Setting xdg-user-dirs

To do so we now need to set that directory. This is done via the file ~/.config/user-dirs.dirs, which for our purpose should contain the line:

XDG_DESKTOP_DIR="/tmp"

You can set additional directories; the scheme is always the same, XDG_<dirname>_DIR. You can query directories from the command line via xdg-user-dir <dirname> to use them in your own scripts, as sketched below. <dirname> can be almost anything (although most software out there of course doesn’t consider arbitrary values).
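
For instance, a script could pick up the configured directory roughly like this (a minimal Python sketch; the fallback to the home directory is my own choice for illustration, not something xdg-user-dir mandates):

import os
import subprocess

def xdg_user_dir(name="DESKTOP"):
    """Query an XDG user directory, falling back to $HOME if unavailable."""
    try:
        result = subprocess.run(["xdg-user-dir", name],
                                capture_output=True, text=True, check=True)
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return os.path.expanduser("~")

print(xdg_user_dir("DESKTOP"))  # prints /tmp with the configuration above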

drop-directory for display on your desktop

If you use a Desktop Environment that actually displays icons on your desktop, using /tmp is most likely unsatisfactory, as many applications use it for files and directories, so your desktop will be cluttered with those files.

Instead, we can create a directory specifically for that purpose, for example a folder /drop in your filesystem root. This already retains the first property mentioned above, being accessible with few keystrokes (3 + tab: /dr<tab>, as /d<tab> collides with /dev). The second property we want is being in-memory, so that it is cleared regularly upon reboot and we don’t collect clutter over time. We can achieve that by placing the line

tmpfs /drop tmpfs rw,nodev,noexec,nosuid,mode=1777 0 0

in /etc/fstab. This mounts a tmpfs to /drop in read-write mode.

nodev,noexec,nosuid are optional, but tightening permissions a little more rarely hurts. mode=1777 makes the directory readwriteeverythingable for every user on the system, but sets the sticky bit, so only the user placing things can delete them. Certainly not the most locked-down mode you could set, so potentially you want to tighten the screws here a little more.

Adapt the entry in ~/.config/user-dirs.dirs accordingly and you’re ready to go, now having an actual drop directory where you can simply drag and drop things onto your desktop.


by Jonas Große Sundrup at 2019-10-09 14:22

2019-10-05

RaumZeitLabor

BrexEat Party, the British Brunch

“I Want to Break Fast” - Queen

Dear fellow RaumZeitLaboroyals!

On the occasion of the United Kingdom's possibly (or then again perhaps not) imminent exit from the EU, we want to meet for a joint BrexEat at the RZL on Sunday, 27 October.

Pin your finest Tudor-rose fried-egg brooch to your jacket and saddle the corgis, because from 11:30 there will be more than just a toast to our British neighbours.

Sausages, hash browns, porridge, bacon: there will be a Full English Breakfast for veggie eaters and Beefeaters alike.

If you want to come to the Union Snack, please register by e-mail by 23 October. So that it can remain “Keep calm and culinary on” in the future and we can keep holding fun themed feasts, we ask for a contribution towards costs of at least 10 euros per person.

God save our baked bean!
Your Earls of Sandwich

SirCorgi

by flederrattie at 2019-10-05 00:00

2019-09-30

michael-herbst.com

Teasers for some upcoming Julia activities

A quick teaser for some Julia-related activities around Paris that I am involved with over the next months.

  • 17 Oct 2019, 19:00, QuantStack (27 Rue du Chemin Vert 75011, Paris): The first Julia meet-up in Paris. We want to gather the people in and around Paris who are interested in or working with Julia and stimulate an exchange of experiences and ideas. I'll give a short talk about DFTK.jl and François Févotte will talk about AccurateArithmetic.jl. More details on https://julia-users-paris.github.io. Feel free to drop by if you are around!

  • 13 Dec 2019, 09:00, Sorbonne Université, LJLL and LCT labs of Jussieu Campus. Julia Day (Journée Julia): I'll do an introductory session on Julia in the morning and talk a little bit about useful and helpful Julia packages (linear algebra, non-linear solvers, etc.) in the afternoon, followed by an introduction to DFTK.jl. For more details on the day see the website.

by Michael F. Herbst at 2019-09-30 22:00 under Julia, DFTK, workshop, programming and scripting

2019-09-29

sECuREs website

Debian Code Search: positional index, TurboPFor-compressed

See the Conclusion for a summary if you’re impatient :-)

Motivation

Over the last few months, I have been developing a new index format for Debian Code Search. This required a lot of careful refactoring, re-implementation, debug tool creation and debugging.

Multiple factors motivated my work on a new index format:

  1. The existing index format has a 2G size limit, into which we have bumped a few times, requiring manual intervention to keep the system running.

  2. Debugging the existing system required creating ad-hoc debugging tools, which made debugging sessions unnecessarily lengthy and painful.

  3. I wanted to check whether switching to a different integer compression format would improve performance (it does not).

  4. I wanted to check whether storing positions with the posting lists would improve performance of identifier queries (= queries which are not using any regular expression features), which make up 78.2% of all Debian Code Search queries (it does).

I figured building a new index from scratch was the easiest approach, compared to refactoring the existing index to increase the size limit (point ①).

I also figured it would be a good idea to develop the debugging tool in lock step with the index format so that I can be sure the tool works and is useful (point ②).

Integer compression: TurboPFor

As a quick refresher, search engines typically store document IDs (representing source code files, in our case) in an ordered list (“posting list”). It usually makes sense to apply at least a rudimentary level of compression: our existing system used variable integer encoding.
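
To illustrate the principle, a variable integer (varint) encoding stores small numbers, such as deltas between adjacent document IDs, in a single byte and only spends more bytes on larger values. Here is a minimal Python sketch of the general idea (the actual on-disk layout used by codesearch differs in its details):

def encode_varint(n):
    """Encode a non-negative integer with 7 payload bits per byte;
    the high bit of each byte signals that another byte follows."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

assert encode_varint(1) == b"\x01"        # small deltas stay at one byte
assert encode_varint(300) == b"\xac\x02"  # larger values grow as needed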

TurboPFor, the self-proclaimed “Fastest Integer Compression” library, combines an advanced on-disk format with a carefully tuned SIMD implementation to reach better speeds (in micro benchmarks) at less disk usage than Russ Cox’s varint implementation in github.com/google/codesearch.

If you are curious about its inner workings, check out my “TurboPFor: an analysis”.

Applied on the Debian Code Search index, TurboPFor indeed compresses integers better:

Disk space

  8.9G codesearch varint index
  5.5G TurboPFor index

Switching to TurboPFor (via cgo) for storing and reading the index results in a slight speed-up of a dcs replay benchmark, which is more pronounced the more i/o is required.

Query speed (regexp, cold page cache)

  18s codesearch varint index
  14s TurboPFor index (cgo)

Query speed (regexp, warm page cache)

  15s codesearch varint index
  14s TurboPFor index (cgo)

Overall, TurboPFor is an all-around improvement in efficiency, albeit with a high cost in implementation complexity.

Positional index: trade more disk for faster queries

This section builds on the previous section: all figures come from the TurboPFor index, which can optionally support positions.

Conceptually, we’re going from:

type docid uint32
type index map[trigram][]docid

…to:

type occurrence struct {
    doc docid
    pos uint32 // byte offset in doc
}
type index map[trigram][]occurrence

The resulting index consumes more disk space, but can be queried faster:

  1. We can do fewer queries: instead of reading all the posting lists for all the trigrams, we can read the posting lists for the query’s first and last trigram only.
    This is one of the tricks described in the paper “AS-Index: A Structure For String Search Using n-grams and Algebraic Signatures” (PDF), and goes a long way without incurring the complexity, computational cost and additional disk usage of calculating algebraic signatures.

  2. Verifying that the delta between the last and the first position matches the length of the query term significantly reduces the number of files to read (lower false positive rate); see the sketch after this list.

  3. The matching phase is quicker: instead of locating the query term in the file, we only need to compare a few bytes at a known offset for equality.

  4. More data is read sequentially (from the index), which is faster.
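
To make the first two points more concrete, here is an illustrative Python sketch of the candidate filtering step. It is not the actual implementation (Debian Code Search is written in Go), and the in-memory index layout is simplified for illustration:

def candidate_matches(index, query):
    """Yield (doc, offset) candidates using only the first and last trigram.

    index maps a trigram to a list of (doc, byte_offset) occurrences,
    mirroring the map[trigram][]occurrence structure above."""
    first, last = query[:3], query[-3:]
    shift = len(query) - 3  # expected distance between the two trigrams
    last_offsets = {}
    for doc, pos in index[last]:
        last_offsets.setdefault(doc, set()).add(pos)
    for doc, pos in index[first]:
        # Only a candidate if the last trigram starts exactly len(query) - 3
        # bytes after the first one (the delta check from point 2).
        if pos + shift in last_offsets.get(doc, ()):
            yield doc, pos
    # What remains is comparing len(query) bytes at the known offset of each
    # candidate file for equality (point 3).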

Disk space

A positional index consumes significantly more disk space, but not so much as to pose a challenge: a Hetzner EX61-NVME dedicated server (≈ 64 €/month) provides 1 TB worth of fast NVMe flash storage.

  6.5G non-positional
  123G positional
  93G positional (posrel)

The idea behind the positional index (posrel) is not to store (doc,pos) tuples on disk, but to store positions accompanied by a stream of doc/pos relationship bits: 1 means the position belongs to the next document, 0 means it belongs to the current document.

This is an easy way of saving some space without modifying the TurboPFor on-disk format: the posrel technique reduces the index size to about ¾.
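
To convey the idea, here is a minimal Python sketch of turning a position stream plus posrel bits back into (doc, pos) tuples. The convention for the first entry and the actual bit packing are details of the real on-disk format and are simplified here:

def decode_posrel(docids, positions, posrel_bits):
    """Reattach positions to document IDs via the doc/pos relationship bits."""
    result = []
    doc_idx = 0  # assume the stream starts within docids[0]
    for pos, bit in zip(positions, posrel_bits):
        if bit == 1:
            doc_idx += 1  # this position belongs to the next document
        result.append((docids[doc_idx], pos))
    return result

# Two positions in doc 7, then one position each in docs 9 and 12:
assert decode_posrel([7, 9, 12], [4, 90, 13, 40], [0, 0, 1, 1]) == \
    [(7, 4), (7, 90), (9, 13), (12, 40)]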

With the increase in size, the Linux page cache hit ratio will be lower for the positional index, i.e. more data will need to be fetched from disk for querying the index.

As long as the disk can deliver data as fast as you can decompress posting lists, this only translates into one disk seek’s worth of additional latency. This is the case with modern NVMe disks that deliver thousands of MB/s, e.g. the Samsung 960 Pro (used in Hetzner’s aforementioned EX61-NVME server).

The values were measured by running dcs du -h /srv/dcs/shard*/full without and with the -pos argument.

Bytes read

A positional index requires fewer queries: reading only the first and last trigram’s posting lists and positions is sufficient to achieve a lower (!) false positive rate than evaluating all trigrams’ posting lists in a non-positional index.

As a consequence, fewer files need to be read, resulting in fewer bytes required to read from disk overall.

As an additional bonus, in a positional index, more data is read sequentially (index), which is faster than random i/o, regardless of the underlying disk.

  regexp queries:      1.2G (index) + 19.8G (files) = 21.0G
  identifier queries:  4.2G (index) + 10.8G (files) = 15.0G

The values were measured by running iostat -d 25 just before running bench.zsh on an otherwise idle system.

Query speed

Even though the positional index is larger and requires more data to be read at query time (see above), thanks to the C TurboPFor library, the 2 queries on a positional index are roughly as fast as the n queries on a non-positional index (≈4s instead of ≈3s).

This is more than made up for by the combined i/o and matching stage, which shrinks from ≈18.5s (7.1s i/o + 11.4s matching) to ≈1.3s.

  regexp queries:      3.3s (index) + 7.1s (i/o) + 11.4s (matching) = 21.8s
  identifier queries:  3.92s (index) + ≈1.3s (i/o + matching) = 5.22s

Note that identifier query i/o was sped up not just by needing to read fewer bytes, but also by only having to verify bytes at a known offset instead of needing to locate the identifier within the file.

Conclusion

The new index format is overall slightly more efficient. This disk space efficiency allows us to introduce a positional index section for the first time.

Most Debian Code Search queries are positional queries (78.2%) and will be answered much quicker by leveraging the positions.

Bottom line: it is beneficial to use a positional index on disk over a non-positional index in RAM.

at 2019-09-29 00:00

2019-09-16

michael-herbst.com

Interdisciplinary software for electronic structure theory: adcc and DFTK

Last week, from Wednesday to Friday, I paid the Technical University of Munich (TUM) a short visit for two great opportunities to present my work: Once in the theoretical chemistry group of Professor Wolfgang Domcke and once at the annual meeting of the modeling, analysis and simulation of molecular systems (MoAnSi) interest group, a crowd consisting mostly of mathematicians and fellow modelling scientists.

Talk at Domcke group

The first presentation I gave last Wednesday in the Domcke group, where I talked about challenges and approaches in interdisciplinary software projects. I also presented two projects of my own, namely DFTK and adcc. I have already given a few details about these codes in previous articles and will thus be rather brief about them here.

In the part of the talk on the plane-wave density-functional theory code DFTK I mainly focused on the question why we decided to use Julia as a programming language. For this I summarised Julia's capabilities and performance and gave a short teaser of Julia code in a Jupyter notebook. This notebook shows a Julia implementation of a Jacobi-Davidson iterative eigensolver and discusses concrete instantiations of the code in various situations. This includes a mixed-precision algorithm, which transfers intermediate results obtained in a lower precision (e.g. single precision) to higher (i.e. double) precision and then continues the iterations at this level. Other examples were diagonalisations using sparse matrices or GPU computations. Most notably, for all these cases the actual Jacobi-Davidson code was only written once and not explicitly adapted to fit the respective data structures or computational backends. Last but not least, the notebook teased a small automatic differentiation example as well.

The main focus of the talk, however, was the adcc code, which implements the algebraic-diagrammatic construction scheme for the polarisation propagator (ADC) and allows this computation to be driven from any SCF code using a python-based interface. In the presentation I sketched content from our upcoming software paper on adcc and presented three examples of calculations which we are able to do with only a few lines of python code. These include an analysis of the core-valence separation (CVS) error for water, a sophisticated combination of freezing extra core and valence orbitals alongside the CVS, and an ADC(2) modelling of nile red (426 basis functions) in an explicit solvent model (polarisable embedding). The code for the second example, namely the combination of CVS with freezing other orbitals, was also shown in an example notebook. This part of the talk in particular was well received, providing food for many ADC-related and adcc-related discussions afterwards.

MoAnSi workshop

Directly after the day at the Domcke group I attended the MoAnSi meeting, which took place last Thursday and Friday, bringing together a bunch of mathematicians interested in computational methods for molecular science. Similar to other interdisciplinary meetings overlapping chemistry and maths, like the one in Oberwolfach, the atmosphere was friendly and open. Plenty of discussions stimulated thought about fundamental aspects of quantum chemistry that one usually does not think about as a chemist.

For this reason my second talk was tailored a bit more to the mathematician crowd, which meant that I focused more on DFTK and how it could be used in the context of mathematically-driven research on density-functional theory. Apart from the slides and the Julia examples I also presented some actual DFTK code, emphasising how Julia leads to extremely concise and readable programs. During the Q&A, I mentioned a blog article about a Julia case study using the language for a petascale computation. The respective project is the Celeste code and a summary of the hurdles and difficulties can be found in a post by Keno Fischer on the Julia Computing Blog.

Slides and notebooks for both talks are linked in the main text above or can be downloaded here:

Link Licence
Interdisciplinary software for electronic structure theory: adcc and DFTK (Slides seminar Domcke group) Creative Commons License
Two Julia examples (Notebook) GNU GPL v3
adcc Demo: Combining CVS and frozen orbitals (Notebook) GNU GPL v3
DFTK: The density-functional toolkit (Slides MoAnSi talk) Creative Commons License

by Michael F. Herbst at 2019-09-16 17:00 under talk, electronic structure theory, Julia, HPC, molsturm, DFTK, theoretical chemistry, adcc, algebraic-diagrammatic construction

2019-08-17

sECuREs website

distri: a Linux distribution to research fast package management

Over the last year or so I have worked on a research linux distribution in my spare time. It’s not a distribution for researchers (like Scientific Linux), but my personal playground project to research linux distribution development, i.e. try out fresh ideas.

This article focuses on the package format and its advantages, but there is more to distri, which I will cover in upcoming blog posts.

Motivation

I was a Debian Developer for the 7 years from 2012 to 2019, but using the distribution often left me frustrated, ultimately resulting in me winding down my Debian work.

Frequently, I was noticing a large gap between the actual speed of an operation (e.g. doing an update) and the possible speed based on back of the envelope calculations. I wrote more about this in my blog post “Package managers are slow”.

To me, this observation means that either there is potential to optimize the package manager itself (e.g. apt), or what the system does is just too complex. While I remember seeing some low-hanging fruit¹, I wanted to explore through my work on distri whether all the complexity we currently have in Linux distributions such as Debian or Fedora is inherent to the problem space.

I have completed enough of the experiment to conclude that the complexity is not inherent: I can build a Linux distribution for general-enough purposes which is much less complex than existing ones.

① Those were low-hanging fruit from a user perspective. I’m not saying that fixing them is easy in the technical sense; I know too little about apt’s code base to make such a statement.

Key idea: packages are images, not archives

One key idea is to switch from using archives to using images for package contents. Common package managers such as dpkg(1) use tar(1) archives with various compression algorithms.

distri uses SquashFS images, a comparatively simple file system image format that I happen to be familiar with from my work on the gokrazy Raspberry Pi 3 Go platform.

This idea is not novel: AppImage and snappy also use images, but only for individual, self-contained applications. distri however uses images for distribution packages with dependencies. In particular, there is no duplication of shared libraries in distri.

A nice side effect of using read-only image files is that applications are immutable and can hence not be broken by accidental (or malicious!) modification.

Key idea: separate hierarchies

Package contents are made available under a fully-qualified path. E.g., all files provided by package zsh-amd64-5.6.2-3 are available under /ro/zsh-amd64-5.6.2-3. The mountpoint /ro stands for read-only, which is short yet descriptive.

Perhaps surprisingly, building software with custom prefix values of e.g. /ro/zsh-amd64-5.6.2-3 is widely supported, thanks to:

  1. Linux distributions, which build software with prefix set to /usr, whereas FreeBSD (and the autotools default) builds with prefix set to /usr/local.

  2. Enthusiast users in corporate or research environments, who install software into their home directories.

Because using a custom prefix is a common scenario, upstream awareness for prefix-correctness is generally high, and the rarely required patch will be quickly accepted.

Key idea: exchange directories

Software packages often exchange data by placing or locating files in well-known directories. Here are just a few examples:

  • gcc(1) locates the libusb(3) headers via /usr/include
  • man(1) locates the nginx(1) manpage via /usr/share/man.
  • zsh(1) locates executable programs via PATH components such as /bin

In distri, these locations are called exchange directories and are provided via FUSE in /ro.

Exchange directories come in two different flavors:

  1. global. The exchange directory, e.g. /ro/share, provides the union of the share subdirectories of all packages in the package store (see the sketch after this list).
    Global exchange directories are largely used for compatibility, see below.

  2. per-package. Useful for tight coupling: e.g. irssi(1) does not provide any ABI guarantees, so plugins such as irssi-robustirc can declare that they want e.g. /ro/irssi-amd64-1.1.1-1/out/lib/irssi/modules to be a per-package exchange directory and contain files from their lib/irssi/modules.
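
Conceptually, the contents of a global exchange directory are just the union of the corresponding directory of every package in the store. Here is a rough Python sketch of that idea (illustrative only: distri implements this as a FUSE file system, and conflict resolution by distri revision number is reduced to plain sort order here):

import os

def global_exchange_listing(package_store, subdir="share"):
    """Union of <package>/out/<subdir> over all packages in the store."""
    union = {}
    for pkg in sorted(os.listdir(package_store)):
        candidate = os.path.join(package_store, pkg, "out", subdir)
        if not os.path.isdir(candidate):
            continue
        for name in os.listdir(candidate):
            # On conflicts, the later entry wins (a stand-in for picking the
            # highest distri revision number).
            union[name] = os.path.join(candidate, name)
    return union

# e.g. the union over packages visible under /ro would map "man" to a path
# like "/ro/zsh-amd64-5.6.2-3/out/share/man"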

Search paths sometimes need to be fixed

Programs which use exchange directories sometimes use search paths to access multiple exchange directories. In fact, the examples above were taken from gcc(1)’s INCLUDEPATH, man(1)’s MANPATH and zsh(1)’s PATH. These are prominent ones, but more examples are easy to find: zsh(1) loads completion functions from its FPATH.

Some search path values are derived from --datadir=/ro/share and require no further attention, but others might derive from e.g. --prefix=/ro/zsh-amd64-5.6.2-3/out and need to be pointed to an exchange directory via a specific command line flag.

FHS compatibility

Global exchange directories are used to make distri provide enough of the Filesystem Hierarchy Standard (FHS) that third-party software largely just works. This includes a C development environment.

I successfully ran a few programs from their binary packages such as Google Chrome, Spotify, or Microsoft’s Visual Studio Code.

Fast package manager

I previously wrote about how Linux distribution package managers are too slow.

distri’s package manager is extremely fast. Its main bottleneck is typically the network link, even on high-speed links (I tested with a 100 Gbps link).

Its speed comes largely from an architecture which allows the package manager to do less work. Specifically:

  1. Package images can be added atomically to the package store, so we can safely skip fsync(2). Corruption will be cleaned up automatically, and durability is not important: if an interactive installation is interrupted, the user can just repeat it, as it will be fresh in their mind.

  2. Because all packages are co-installable thanks to separate hierarchies, there are no conflicts at the package store level, and no dependency resolution (an optimization problem requiring SAT solving) is required at all.
    In exchange directories, we resolve conflicts by selecting the package with the highest monotonically increasing distri revision number.

  3. distri proves that we can build a useful Linux distribution entirely without hooks and triggers. Not having to serialize hook execution allows us to download packages into the package store with maximum concurrency.

  4. Because we are using images instead of archives, we do not need to unpack anything. This means installing a package is really just writing its package image and metadata to the package store. Sequential writes are typically the fastest kind of storage usage pattern.

Fast installation also makes other use cases more bearable, such as creating disk images, be it for testing them in qemu(1), booting them on real hardware from a USB drive, or for cloud providers such as Google Cloud.

Fast package builder

Contrary to how distribution package builders are usually implemented, the distri package builder does not actually install any packages into the build environment.

Instead, distri makes available a filtered view of the package store (only declared dependencies are available) at /ro in the build environment.

This means that even for large dependency trees, setting up a build environment happens in a fraction of a second! Such a low latency really makes a difference in how comfortable it is to iterate on distribution packages.

Package stores

In distri, package images are installed from a remote package store into the local system package store /roimg, which backs the /ro mount.

A package store is implemented as a directory of package images and their associated metadata files.

You can easily make available a package store by using distri export.

To provide a mirror for your local network, you can periodically distri update from the package store you want to mirror, and then distri export your local copy. Special tooling (e.g. debmirror in Debian) is not required because distri install is atomic (and update uses install).

Producing derivatives is easy: just add your own packages to a copy of the package store.

The package store is intentionally kept simple to manage and distribute. Its files could be exchanged via peer-to-peer file systems, or synchronized from an offline medium.

distri’s first release

distri works well enough to demonstrate the ideas explained above. I have branched this state into branch jackherer, distri’s first release code name. This way, I can keep experimenting in the distri repository without breaking your installation.

From the branch contents, our autobuilder creates:

  1. disk images, which…

  2. a package repository. Installations can pick up new packages with distri update.

  3. documentation for the release.

The project website can be found at https://distr1.org. The website is just the README for now, but we can improve that later.

The repository can be found at https://github.com/distr1/distri

Project outlook

Right now, distri is mainly a vehicle for my spare-time Linux distribution research. I don’t recommend anyone use distri for anything but research, and there are no medium-term plans of that changing. At the very least, please contact me before basing anything serious on distri so that we can talk about limitations and expectations.

I expect the distri project to live for as long as I have blog posts to publish, and we’ll see what happens afterwards. Note that this is a hobby for me: I will continue to explore, at my own pace, parts that I find interesting.

My hope is that established distributions might get a useful idea or two from distri.

There’s more to come: subscribe to the distri feed

I don’t want to make this post too long, but there is much more!

Please subscribe to the following URL in your feed reader to get all posts about distri:

https://michael.stapelberg.ch/posts/tags/distri/feed.xml

Next in my queue are articles about hermetic packages and good package maintainer experience (including declarative packaging).

Feedback or questions?

I’d love to discuss these ideas in case you’re interested!

Please send feedback to the distri mailing list so that everyone can participate!

at 2019-08-17 16:36

2019-08-17

sECuREs website

Linux package managers are slow

I measured how long the most popular Linux distributions’ package managers take to install small and large packages (the ack(1p) source code search Perl script and qemu, respectively).

Where required, my measurements include metadata updates such as transferring an up-to-date package list. For me, requiring a metadata update is the more common case, particularly on live systems or within Docker containers.

All measurements were taken on an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz running Docker 1.13.1 on Linux 4.19, backed by a Samsung 970 Pro NVMe drive boasting many hundreds of MB/s write performance.

See Appendix B for details on the measurement method and command outputs.

Measurements

Keep in mind that these are one-time measurements. They should be indicative of actual performance, but your experience may vary.

ack (small Perl program)

distribution package manager data wall-clock time rate
Fedora dnf 107 MB 29s 3.7 MB/s
NixOS Nix 15 MB 14s 1.1 MB/s
Debian apt 15 MB 4s 3.7 MB/s
Arch Linux pacman 6.5 MB 3s 2.1 MB/s
Alpine apk 10 MB 1s 10.0 MB/s

qemu (large C program)

distribution package manager data wall-clock time rate
Fedora dnf 266 MB 1m8s 3.9 MB/s
Arch Linux pacman 124 MB 1m2s 2.0 MB/s
Debian apt 159 MB 51s 3.1 MB/s
NixOS Nix 262 MB 38s 6.8 MB/s
Alpine apk 26 MB 2.4s 10.8 MB/s


The difference between the slowest and fastest package managers is 30x!

How can Alpine’s apk and Arch Linux’s pacman be an order of magnitude faster than the rest? They are doing a lot less than the others, and more efficiently, too.

Pain point: too much metadata

For example, Fedora transfers a lot more data than others because its main package list is 60 MB (compressed!) alone. Compare that with Alpine’s 734 KB APKINDEX.tar.gz.

Of course the extra metadata which Fedora provides helps some use cases, otherwise they hopefully would have removed it altogether. The amount of metadata seems excessive for the use case of installing a single package, which I consider the main use case of an interactive package manager.

I expect any modern Linux distribution to only transfer absolutely required data to complete my task.

Pain point: no concurrency

Because they need to sequence executing arbitrary package maintainer-provided code (hooks and triggers), all tested package managers need to install packages sequentially (one after the other) instead of concurrently (all at the same time).

In my blog post “Can we do without hooks and triggers?”, I outline that hooks and triggers are not strictly necessary to build a working Linux distribution.

Thought experiment: further speed-ups

Strictly speaking, the only required feature of a package manager is to make available the package contents so that the package can be used: a program can be started, a kernel module can be loaded, etc.

By only implementing what’s needed for this feature, and nothing more, a package manager could likely beat apk’s performance. It could, for example:

  • skip archive extraction by mounting file system images (like AppImage or snappy)
  • use compression which is light on CPU, as networks are fast (like apk)
  • skip fsync when it is safe to do so, i.e.:
    • package installations don’t modify system state
    • atomic package installation (e.g. an append-only package store)
    • automatically clean up the package store after crashes

Current landscape

Here’s a table outlining how the various package managers listed on Wikipedia’s list of software package management systems fare:

name scope package file format hooks/triggers
AppImage apps image: ISO9660, SquashFS no
snappy apps image: SquashFS yes: hooks
FlatPak apps archive: OSTree no
0install apps archive: tar.bz2 no
nix, guix distro archive: nar.{bz2,xz} activation script
dpkg distro archive: tar.{gz,xz,bz2} in ar(1) yes
rpm distro archive: cpio.{bz2,lz,xz} scriptlets
pacman distro archive: tar.xz install
slackware distro archive: tar.{gz,xz} yes: doinst.sh
apk distro archive: tar.gz yes: .post-install
Entropy distro archive: tar.bz2 yes
ipkg, opkg distro archive: tar{,.gz} yes

Conclusion

As per the current landscape, there is no distribution-scoped package manager which uses images and leaves out hooks and triggers, not even in smaller Linux distributions.

I think that space is really interesting, as it uses a minimal design to achieve significant real-world speed-ups.

I have explored this idea in much more detail, and am happy to talk more about it in my post “Introducing the distri research linux distribution”.

There are a couple of recent developments going in the same direction.

Appendix B: measurement details

ack


Fedora’s dnf takes almost 30 seconds to fetch and unpack 107 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y ack
Fedora Modular 30 - x86_64            4.4 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  3.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           17 MB/s |  19 MB     00:01
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  44 Packages

Total download size: 13 M
Installed size: 42 M
[…]
real	0m29.498s
user	0m22.954s
sys	0m1.085s

NixOS’s Nix takes 14s to fetch and unpack 15 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i perl5.28.2-ack-2.28'
unpacking channels...
created 2 symlinks in user environment
installing 'perl5.28.2-ack-2.28'
these paths will be fetched (14.91 MiB download, 80.83 MiB unpacked):
  /nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2
  /nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48
  /nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man
  /nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27
  /nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31
  /nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53
  /nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16
  /nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28
copying path '/nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man' from 'https://cache.nixos.org'...
copying path '/nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27' from 'https://cache.nixos.org'...
copying path '/nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16' from 'https://cache.nixos.org'...
copying path '/nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48' from 'https://cache.nixos.org'...
copying path '/nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53' from 'https://cache.nixos.org'...
copying path '/nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31' from 'https://cache.nixos.org'...
copying path '/nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2' from 'https://cache.nixos.org'...
copying path '/nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28' from 'https://cache.nixos.org'...
building '/nix/store/q3243sjg91x1m8ipl0sj5gjzpnbgxrqw-user-environment.drv'...
created 56 symlinks in user environment
real	0m 14.02s
user	0m 8.83s
sys	0m 2.69s

Debian’s apt takes almost 10 seconds to fetch and unpack 16 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y ack-grep)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [233 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8270 kB]
Fetched 8502 kB in 2s (4764 kB/s)
[…]
The following NEW packages will be installed:
  ack ack-grep libfile-next-perl libgdbm-compat4 libgdbm5 libperl5.26 netbase perl perl-modules-5.26
The following packages will be upgraded:
  perl-base
1 upgraded, 9 newly installed, 0 to remove and 60 not upgraded.
Need to get 8238 kB of archives.
After this operation, 42.3 MB of additional disk space will be used.
[…]
real	0m9.096s
user	0m2.616s
sys	0m0.441s

Arch Linux’s pacman takes a little over 3s to fetch and unpack 6.5 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm ack)
:: Synchronizing package databases...
 core            132.2 KiB  1033K/s 00:00
 extra          1629.6 KiB  2.95M/s 00:01
 community         4.9 MiB  5.75M/s 00:01
[…]
Total Download Size:   0.07 MiB
Total Installed Size:  0.19 MiB
[…]
real	0m3.354s
user	0m0.224s
sys	0m0.049s

Alpine’s apk takes only about 1 second to fetch and unpack 10 MB.

% docker run -t -i alpine
/ # time apk add ack
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
(1/4) Installing perl-file-next (1.16-r0)
(2/4) Installing libbz2 (1.0.6-r7)
(3/4) Installing perl (5.28.2-r1)
(4/4) Installing ack (3.0.0-r0)
Executing busybox-1.30.1-r2.trigger
OK: 44 MiB in 18 packages
real	0m 0.96s
user	0m 0.25s
sys	0m 0.07s

qemu


Fedora’s dnf takes over a minute to fetch and unpack 266 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y qemu
Fedora Modular 30 - x86_64            3.1 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  2.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           20 MB/s |  19 MB     00:00
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  262 Packages
Upgrade    4 Packages

Total download size: 172 M
[…]
real	1m7.877s
user	0m44.237s
sys	0m3.258s

NixOS’s Nix takes 38s to fetch and unpack 262 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i qemu-4.0.0'
unpacking channels...
created 2 symlinks in user environment
installing 'qemu-4.0.0'
these paths will be fetched (262.18 MiB download, 1364.54 MiB unpacked):
[…]
real	0m 38.49s
user	0m 26.52s
sys	0m 4.43s

Debian’s apt takes 51 seconds to fetch and unpack 159 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y qemu-system-x86)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8426 kB]
Fetched 8574 kB in 1s (6716 kB/s)
[…]
Fetched 151 MB in 2s (64.6 MB/s)
[…]
real	0m51.583s
user	0m15.671s
sys	0m3.732s

Arch Linux’s pacman takes 1m2s to fetch and unpack 124 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm qemu)
:: Synchronizing package databases...
 core       132.2 KiB   751K/s 00:00
 extra     1629.6 KiB  3.04M/s 00:01
 community    4.9 MiB  6.16M/s 00:01
[…]
Total Download Size:   123.20 MiB
Total Installed Size:  587.84 MiB
[…]
real	1m2.475s
user	0m9.272s
sys	0m2.458s

Alpine’s apk takes only about 2.4 seconds to fetch and unpack 26 MB.

% docker run -t -i alpine
/ # time apk add qemu-system-x86_64
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
[…]
OK: 78 MiB in 95 packages
real	0m 2.43s
user	0m 0.46s
sys	0m 0.09s

at 2019-08-17 16:27

2019-07-30

michael-herbst.com

Interdisciplinary software development in electronic structure theory (MQM Poster)

At the beginning of the month, from 30th June till 5th July, I attended the 9th Molecular Quantum Mechanics Conference in my old home Heidelberg. This occasion was a great opportunity to catch up with friends and colleagues, both in science and from my old circle in Heidelberg.

At the conference I presented a poster entitled Modern interdisciplinary software development in electronic-structure theory, which was closely related to my earlier talk in Lille a couple of weeks ago. In the poster I wanted to motivate the use of modern languages and software development techniques and demonstrate the opportunities this offers for enabling different communities (like mathematicians and chemists) to work jointly on the same codes. As examples I discussed three projects of my own.

Out of these, I have already described the molsturm code for basis-function-independent method development in detail in our paper last year, and I gave some words about our mathematically-driven approach to developing the density-functional toolkit (DFTK) in my previous article. In this article I will thus focus on the third code presented on the poster, namely adcc, short for ADC connect.

adcc is a project I started a few years ago during my time in the Dreuw group in Heidelberg. I continue to work on it as a side project during my time at the CERMICS, jointly with some of my old group members. Our main aim is to provide an independent building block for excited-states computations using the algebraic-diagrammatic construction (ADC). This means that adcc does not implement any self-consistent field algorithm itself; rather, it runs on top of any Hartree-Fock code. In practice, four SCF codes have been connected to adcc via our Python interface so far, including pyscf, Psi4 and molsturm. But Python is not only used as a glue language between SCF and ADC; rather, it allows the full computational procedure to be orchestrated. See the attached poster or the introductory chapter of the adcc documentation for some examples.

A third aspect where Python plays a crucial role in adcc is the iterative solver algorithms. For example, the Davidson procedure we use for solving the ADC equations is implemented purely in high-level Python code. This allows new numerical schemes and approaches to be investigated rapidly in the context of ADC. We are, however, not losing too much performance by this choice, because the time-consuming tensor contractions we require are still done in a C++ core library and only called from Python. Inside our core library we in turn use the libtensor code for tensor computations. Overall adcc therefore shows performance comparable to the adcman code, a pure C++ implementation of ADC also developed in Heidelberg on top of libtensor.
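
To give an impression of what such high-level Python looks like, here is a schematic NumPy sketch of a plain Davidson iteration for the lowest eigenpairs of a symmetric matrix. It is only meant to convey the structure of such a solver; adcc's actual implementation works with ADC matrix-vector products via libtensor and differs in many details (preconditioning, subspace handling, convergence control):

import numpy as np

def davidson(A, n_eigs=3, tol=1e-8, max_iter=50):
    """Plain Davidson iteration for the lowest eigenpairs of a symmetric A."""
    n = A.shape[0]
    V = np.eye(n, n_eigs)      # orthonormal guess vectors
    diag = np.diag(A)          # diagonal, used as a simple preconditioner
    for _ in range(max_iter):
        AV = A @ V             # the expensive step (tensor contractions in adcc)
        vals, vecs = np.linalg.eigh(V.T @ AV)
        vals, vecs = vals[:n_eigs], vecs[:, :n_eigs]
        X = V @ vecs                  # Ritz vectors
        R = AV @ vecs - X * vals      # residuals, one column per root
        if np.linalg.norm(R) < tol:
            return vals, X
        # Precondition the residuals and enlarge the subspace
        denom = vals[np.newaxis, :] - diag[:, np.newaxis]
        denom[np.abs(denom) < 1e-8] = 1e-8
        V, _ = np.linalg.qr(np.hstack([V, R / denom]))
    return vals, X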

Currently adcc is not yet publicly available, but we are in the process of finalising adcc for a first public release within the Gator framework. A first standalone release of the code is planned within the upcoming months as well. The adcc documentation, however, is already available and gives an idea of adcc in practice.

As usual I attach my poster as a pdf below.

Link Licence
Poster Modern interdisciplinary software development in electronic structure theory Creative Commons License

by Michael F. Herbst at 2019-07-30 11:00 under poster, electronic structure theory, Julia, HPC, molsturm, DFTK, theoretical chemistry, MQM, adcc, algebraic-diagrammatic construction

2019-07-20

sECuREs website

Linux distributions: Can we do without hooks and triggers?

Hooks are an extension feature provided by all package managers that are used in larger Linux distributions. For example, Debian uses apt, which has various maintainer scripts. Fedora uses rpm, which has scriptlets. Different package managers use different names for the concept, but all of them offer package maintainers the ability to run arbitrary code during package installation and upgrades. Example hook use cases include adding daemon user accounts to your system (e.g. postgres), or generating/updating cache files.

Triggers are a kind of hook which run when other packages are installed. For example, on Debian, the man(1) package comes with a trigger which regenerates the search database index whenever any package installs a manpage. When, for example, the nginx(8) package is installed, a trigger provided by the man(1) package runs.

Over the past few decades, Open Source software has become more and more uniform: instead of each piece of software defining its own rules, a small number of build systems are now widely adopted.

Hence, I think it makes sense to revisit whether offering extension via hooks and triggers is a net win or net loss.

Hooks preclude concurrent package installation

Package managers can commonly make very few assumptions about what hooks do, what preconditions they require, and which conflicts might be caused by running multiple packages’ hooks concurrently.

Hence, package managers cannot concurrently install packages. At least the hook/trigger part of the installation needs to happen in sequence.

While it seems technically feasible to retrofit package manager hooks with concurrency primitives such as locks for mutual exclusion between different hook processes, the required overhaul of all hooks¹ seems like such a daunting task that it might be better to just get rid of the hooks instead. Only deleting code frees you from the burden of maintenance, automated testing and debugging.

① In Debian, there are 8620 non-generated maintainer scripts, as reported by find shard*/src/*/debian -regex ".*\(pre\|post\)\(inst\|rm\)$" on a Debian Code Search instance.

Triggers slow down installing/updating other packages

Personally, I never use the apropos(1) command, so I don’t appreciate the man(1) package’s trigger which updates the database used by apropos(1). The process takes a long time and, because hooks and triggers must be executed serially (see previous section), blocks my installation or update.

When I tell people this, they are often surprised to learn about the existence of the apropos(1) command. I suggest adopting an opt-in model.

Unnecessary work if programs are not used between updates

Hooks run when packages are installed. If a package’s contents are not used between two updates, running the hook in the first update could have been skipped. Running the hook lazily when the package contents are used reduces unnecessary work.

As a welcome side-effect, lazy hook evaluation automatically makes the hook work in operating system images, such as live USB thumb drives or SD card images for the Raspberry Pi. Such images must not ship the same crypto keys (e.g. OpenSSH host keys) to all machines, but instead generate a different key on each machine.

Why do users keep packages installed they don’t use? It’s extra work to remember and clean up those packages after use. Plus, users might not realize or value that having fewer packages installed has benefits such as faster updates.

I can also imagine that there are people for whom the cost of re-installing packages incentivizes them to just keep packages installed—you never know when you might need the program again…

Implemented in an interpreted language

While working on hermetic packages (more on that in another blog post), where the contained programs are started with modified environment variables (e.g. PATH) via a wrapper bash script, I noticed that the overhead of those wrapper bash scripts quickly becomes significant. For example, when using the excellent magit interface for Git in Emacs, I encountered second-long delays² when using hermetic packages compared to standard packages. Re-implementing wrappers in a compiled language provided a significant speed-up.

Similarly, getting rid of an extension point which mandates using shell scripts allows us to build an efficient and fast implementation of a predefined set of primitives, where you can reason about their effects and interactions.

② magit needs to run git a few times for displaying the full status, so small overhead quickly adds up.

Incentivizing more upstream standardization

Hooks are an escape hatch for distribution maintainers to express anything which their packaging system cannot express.

Distributions should only rely on well-established interfaces such as autoconf’s classic ./configure && make && make install (including commonly used flags) to build a distribution package. Integrating upstream software into a distribution should not require custom hooks. For example, instead of requiring a hook which updates a cache of schema files, the library used to interact with those files should transparently (re-)generate the cache or fall back to a slower code path.

Distribution maintainers are hard to come by, so we should value their time. In particular, there is a 1:n relationship of packages to distribution package maintainers (software is typically available in multiple Linux distributions), so it makes sense to spend the work in the 1 and have the n benefit.

Can we do without them?

If we want to get rid of hooks, we need another mechanism to achieve what we currently achieve with hooks.

If the hook is not specific to the package, it can be moved to the package manager. The desired system state should either be derived from the package contents (e.g. required system users can be discovered from systemd service files) or declaratively specified in the package build instructions—more on that in another blog post. This turns hooks (arbitrary code) into configuration, which allows the package manager to collapse and sequence the required state changes. E.g., when 5 packages are installed which each need a new system user, the package manager could update /etc/passwd just once.
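
systemd’s sysusers.d mechanism is one existing example of such declarative configuration: instead of an adduser call in a hook, the package ships a small description file and a single tool applies all of them in one pass (a sketch; the user name is made up):

# /usr/lib/sysusers.d/exampled.conf (hypothetical package "exampled")
# Type  Name      ID  GECOS                  Home
u       exampled  -   "Example daemon user"  -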

If the hook is specific to the package, it should be moved into the package contents. This typically means moving the functionality into the program start (or the systemd service file if we are talking about a daemon). If (while?) upstream is not convinced, you can either wrap the program or patch it. Note that this case is relatively rare: I have worked with hundreds of packages and the only package-specific functionality I came across was automatically generating host keys before starting OpenSSH’s sshd(8)³.
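
For the sshd(8) example, the package-specific hook can be replaced by generating missing keys right before the daemon starts, e.g. with a systemd drop-in (a sketch, not the actual packaging of any distribution):

# /etc/systemd/system/ssh.service.d/generate-host-keys.conf (hypothetical drop-in)
[Service]
# ssh-keygen -A creates any host keys that are still missing, so the first boot
# of an OS image generates fresh, per-machine keys
ExecStartPre=/usr/bin/ssh-keygen -A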

There is one exception where moving the hook doesn’t work: packages which modify state outside of the system, such as bootloaders or kernel images.

③ Even that can be moved out of a package-specific hook, as Fedora demonstrates.

Conclusion

Global state modifications performed as part of package installation today use hooks, an overly expressive extension mechanism.

Instead, all modifications should be driven by configuration. This is feasible because there are only a few different kinds of desired state modifications. This makes it possible for package managers to optimize package installation.

at 2019-07-20 00:00

2019-06-20

Insanity Industries

Safer shellscripts by default

Often enough shellscripts are the quickest solution to a given problem. However, writing shellscripts is just as hard as using any other programming language, as for example Valve (or rather Steam users) found out the hard way.

However, there are options you can set to make your shellscripts safer, which others have already elaborated on very nicely, so we will not look at them in detail here but instead recommend taking a look there. In brief, these options boil down to set -eEuo pipefail at the beginning of your shellscript, which makes the script terminate on the first command exiting non-zero (including non-zero-exiting commands within a pipe chain) and whenever you try to use an unset variable. The aforementioned error in the Steam uninstall script would have been prevented by this.
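
To see the effect, consider a minimal (made-up) script with these options set:

#!/bin/bash
set -eEuo pipefail

# grep fails here; with pipefail the whole pipe reports that failure,
# and set -e aborts the script immediately
grep pattern /file/that/does/not/exist | sort

# never reached; without pipefail the pipe's exit code would be sort's (0)
# and the script would happily continue
echo "done"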

However, you will have to set these options in every single script. You could put them into your .bashrc, for example, but not all scripts out there and on your system might be written with these options in mind. As adding these options to every single one of your scripts might sound tedious, there is a better solution: writing a shell that is safe by default!

Implementing a safe bash

To do so, we recognize that most scripts nowadays start with a shebang line: #!/path/to/interpreter. In the case of bash this is typically #!/bin/bash.

To be able to simply invoke a safebash instead of bash at this point, create a file /usr/local/bin/safebash containing

#!/bin/bash

# exit immediately when a command fails
set -e

# abort on attempting to use an undefined variable (hello Valve! :))
set -u

# set exitcode of a pipe chain to the exit code of the rightmost
# command not exiting with zero, if there are any
# in combination with -e this aborts on failure within a pipe chain
set -o pipefail

# make functions inherit ERR-traps, so that they fire when set -e takes effect there
set -E

# let subshells inherit these options
export SHELLOPTS

bash "$@"

and make it executable with chmod +x /usr/local/bin/safebash

You can now start your shellscript-file with #!/usr/local/bin/safebash instead of #!/bin/bash and all the aforementioned options will be active without specifying anything explicitly.
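
A (made-up) script using it could then look like this:

#!/usr/local/bin/safebash

# aborts right here due to set -u, because $TARGET_DIR was never defined
rm -rf "$TARGET_DIR/cache"
echo "cleaned up"

Thanks to the exported SHELLOPTS, the child bash started by safebash inherits set -u and refuses to expand the undefined variable, instead of deleting the wrong directory.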

by Jonas Große Sundrup at 2019-06-20 17:31

2019-06-12

Mero’s Blog

A bird's eye view of Go

tl;dr: I provide a very high-level overview of what Go-the-language means vs. Go-the-ecosystem vs. Go-an-implementation. I also try to provide specific references to what documentation is most useful for what purpose. See the bottom-most section for that.

When we talk about "Go", depending on context, we can mean very different things. This is my attempt at providing a very high-level overview of the language and ecosystem and to link to the relevant documentation about how each part fits together (it's also a bit of a hodgepodge though, addressing individual random questions I encountered recently). So, let's dive in:

The Go programming language

The bottom turtle is Go, the programming language. It defines the format and meaning of source code and the authoritative source is the Go language specification. If something doesn't conform to the spec, it's not "Go". And conversely, if something isn't mentioned in the spec it's not part of the language. The language spec is maintained by the Go team and versioned, with a new release roughly every six months. At the time I wrote this post, the newest release was version 1.12.

The domain of the spec covers

  • The grammar of the language
  • Types, values and their semantics
  • What identifiers are predeclared and what their meaning is
  • How Go programs get executed
  • The special package unsafe (though not all of its semantics)

The spec alone should enable you to write a compiler for Go. And indeed, there are many different compilers.

A Go compiler and runtime

The language spec is a text document, which is not useful in and of itself. For that you need software that actually implements these semantics. This is done by a compiler (which analyzes and checks the source code and transforms it into an executable format) and a runtime (which provides the necessary environment to actually run the code). There are many such combinations and they all differ a bit more or a bit less. Examples are

  • gc, a compiler and runtime written in pure Go (with some assembly) by the Go team themselves and versioned and released together with the language. Unlike other such tools, gc doesn't strictly separate the compiler, assembler and linker - they end up sharing a lot of code and some of the classical responsibilities move or are shared between them. As such, it's in general not possible to e.g. link packages compiled by different versions of gc.
  • gccgo and libgo, a frontend for gcc and a runtime. It's written in C and maintained by the Go team. It lives in the gcc organization though and is released according to the gcc release schedule and thus often lags a bit behind the "latest" version of the Go spec.
  • llgo, a frontend for LLVM. I don't know much else about it.
  • gopherjs, compiling Go code into javascript and using a javascript VM plus some custom code as a runtime. Long-term, it'll probably be made obsolete by gc gaining native support for WebAssembly.
  • tinygo, an incomplete implementation targeting small code size. Runs on either bare-metal micro-controllers or WebAssembly VMs, with a custom runtime. Due to its limitations it doesn't technically implement Go - notably, it doesn't include a garbage collector, concurrency or reflection.

There are more, but this gives you an overview over the variety of implementations. Each of these made potentially different choices for how to implement the language and have their own idiosyncrasies. Examples (some of them a bit exotic, to illustrate) where they might differ are:

  • Size of int/uint - the language allows them to be either 32 or 64 bit wide.
  • How fundamental functionalities of the runtime, like allocation, garbage collection or concurrency are implemented.
  • The order of ranging over a map isn't defined in the language - gc famously explicitly randomizes it, gopherjs uses (last time I checked) whatever the javascript implementation you are running on uses.
  • How much extra space append allocates if it needs to - not, however, when it allocates extra space.
  • How conversions between unsafe.Pointer and uintptr happen. gc, in particular, comes with its own set of rules regarding when these conversions are valid and when they aren't. In general, the unsafe package is virtual and implemented in the compiler.

In general, if you rely on details not mentioned in the spec (in particular the ones mentioned here), your program will still compile with different compilers, but it may not work as expected. So you should avoid it if possible.

If you install Go via a "normal" way (by downloading it from the website, or installing it via a package manager), you'll get gc and the official runtime by the Go team. And if the context doesn't imply otherwise, when we talk about how "Go does things", we usually refer to gc. It's the main implementation.

The standard library

The standard library is a set of packages that come with Go and can be relied upon to immediately build useful applications with. It too is maintained by the Go team and versioned and released together with the language and compiler. In general the standard library of one implementation will only work with the compiler it comes with. The reason is that most (but not all) of the runtime is part of the standard library (mainly in the packages runtime, reflect, syscall). As the compiler needs to generate code compatible with the used runtime, both need to come from the same version. The API of the standard library is stable and won't change in incompatible ways, so a Go program written against a given version of the standard library will continue to work as expected with future versions of the compiler.

Some implementations use their own version of some or all of the standard library - in particular, the runtime, reflect, unsafe and syscall packages are completely implementation-defined. As an example, I believe that AppEngine Standard used to re-define parts of the standard library for security and safety. In general, implementations try to make that transparent to the user.

There is also a separate set of repositories, colloquially referred to as x or "the subrepositories". They contain packages which are developed and maintained by the Go team with all the same processes, but are not on the same release schedule as the language and have less strict compatibility guarantees (and commitment to maintainership) than Go itself. The packages in there are either experimental (for potential future inclusion in the standard library), not widely useful enough to be included in the standard library or, in rare cases, a way for people on the Go team to work on code using the same review processes they are used to.

Again, when referring to "the standard library" devoid of extra context, we mean the officially maintained and distributed one, hosted on golang.org.

The build tool

To make the language user-friendly, you need a build tool. The primary role of this tool is to find the package you want to compile, find all of its dependencies, and execute the compiler and linker with the arguments necessary to build them. Go (the language) has support for packages, which combine multiple source files into one unit of compilation. And it defines how to import and use other packages. But importantly, it doesn't define how import paths map to source files or how they are laid out on disk. As such, each build tool comes with its own ideas for this. It's possible to use a generic build tool (like Make) for this purpose, but there are a bunch of Go-specific ones:

  • The go tool¹ is the build tool officially maintained by the Go team. It is versioned and released with the language (and gc and the standard library). It expects a directory called GOROOT (from an environment variable, with a compiled default) to contain the compiler, the standard library and various other tools. And it expects all source code in a single directory called GOPATH (from an environment variable, defaulting to $HOME/go or equivalent). Specifically, package a/b is expected to have its source at $GOPATH/src/a/b/c.go etc. And $GOPATH/src/a/b is expected to only contain source files of one package. It also has a mechanism to download a package and its dependencies recursively from an arbitrary server, in a fully decentralized scheme, though it does not support versioning or verification of downloads. The go tool also contains extra tooling for testing Go code, reading documentation (golang.org is served by the Go tool), filing bugs, running various tools…
  • gopherjs comes with its own build tool, that largely mimics the Go tool.
  • gomobile is a build tool specifically to build Go code for mobile operating systems.
  • dep, gb, glide,… are community-developed build-tools and dependency managers, each with their own approach to file layout (some are compatible with the go tool, some aren't) and dependency declarations.
  • bazel is the open source version of Google's own build system. While it's not actually Go-specific, I'm mentioning it explicitly due to common claims that idiosyncrasies of the go tool are intended to serve Google's own use cases, in conflict with the needs of the community. However, the go tool (and many public tools) can't be used at Google, because bazel uses an incompatible file layout.

The build tool is what most users directly interface with and as such, it's what largely determines aspects of the Go ecosystem and how packages can be combined and thus how different Go programmers interact. As above, the go tool is what's implicitly referred to (unless other context is specified) and thus its design decisions significantly influence public opinion about "Go". While there are alternative tools and they have wide adoption for use cases like company-internal code, the open source community in general expects code to conform to the expectations of the go tool, which (among other things) means:

  • Be available as source code. The go tool has little support for binary distribution of packages, and what little it has is going to be removed soon.
  • Be documented according to the godoc format.
  • Have tests that can be run via go test.
  • Be fully compilable by a plain go build (together with the next one, this is usually called being "go-gettable"). In particular, use go generate if generating source code or metaprogramming is required, and commit the generated artifacts.
  • Namespace import paths with a domain-name as the first component and have that domain-name either be a well-known code hoster or have a webserver running on it, so that go get works and can find the source code of dependencies.
  • Have one package per directory and use build constraints for conditional compilation.

The documentation of the go tool is very comprehensive and probably a good starting point to learn how Go implements various ecosystem aspects.
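
As a rough illustration, the conventional (pre-modules) workflow expected by the go tool looks something like this (example.com/alice/hello is a placeholder import path):

# fetch the package and its dependencies into $GOPATH/src (unversioned)
go get example.com/alice/hello

cd "$(go env GOPATH)/src/example.com/alice/hello"

go build ./...      # compile all packages below this directory
go test ./...       # run their tests
go generate ./...   # re-run code generators; the generated files are committed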

Tools

Go's standard library includes several packages to interact with Go source code and the x/tools subrepo contains even more. As a result (and due to a strong desire to keep the canonical Go distribution lean), Go has developed a strong culture of developing third-party tools. In general, these tools need to know where to find source code, and might need access to type information. The go/build package implements the conventions used by the Go tool, and can thus also serve as documentation for parts of its build process. The downside is that tools built on top of it sometimes don't work with code relying on other build tools. That's why there is a new package in development which integrates nicely with other build tools.

By its nature the list of Go tools is long and everybody has their own preferences. But broadly, they contain:

In Summary

I wanted to end this with a short list of references for beginners who feel lost. So this is where you should go, if you:

There are many more useful supplementary documents, but this should serve as a good start. Please let me know on Twitter if you are a beginner and there's an area of Go you are missing from this overview (I might follow this up with more specific topics), or a specific reference you found helpful. You can also drop me a note if you're a more experienced Gopher and think I missed something important (but keep in mind that I intentionally left out most references, so as to keep the ways forward crisp and clear :) ).


[1] Note: The Go team is currently rolling out support for modules, which are a unit of code distribution above packages, including support for versioning and more infrastructure to solve some issues with the "traditional" go tool. With that, basically everything in that paragraph becomes obsolete. However, for now the module support exists but is opt-in. And as the point of this article is to provide an overview of the separation of concerns, which doesn't actually change, I felt it was better to stay within ye olden days - for now.

at 2019-06-12 00:00

2019-06-08

michael-herbst.com

Modern software-development techniques in electronic structure theory

A few weeks ago, on 23rd May, I received an invitation to visit the Laboratoire de Physique des Lasers, Atomes et Molécules at the Université de Lille and present my recent work. When it came to selecting a topic, my host Andre Gomes and I quickly agreed to focus on discussing more modern approaches to scientific software development and why these can be useful for electronic-structure codes. In retrospect I am very happy about this opportunity to summarise a few ideas and lessons learned from my previous and ongoing software projects. I also took the liberty of sketching possible challenges when it comes to future directions of scientific software from my personal point of view, as well as proposing some ideas for how to tackle them.

From this angle I ended up introducing the talk with a review of the fundamental difficulties of electronic structure theory itself, namely the inherent complexity of the problem (large dimensionality and non-linearity of the respective partial-differential equations). Added to that, in recent years the available high-performance computing (HPC) environments have become more heterogeneous and more complex as well. Mixed general-purpose GPU / CPU cluster setups are now standard and making use of both CPU and GPU for computations has become a strict requirement for applying for access to the top-ranked clusters of the world. Projecting into the future, where further "accelerators" such as field-programmable gate arrays (FPGAs), or even more long term quantum computers, might come up, this heterogeneity is only going to increase.

For quickly testing the usefulness of such technologies in the context of quantum chemistry, a key aspect is being able to reuse the code which already exists. This means that code needs to be sufficiently high-level in order to adapt to changing hardware architectures. Let me be clear on this: I am not talking about achieving 100% peak performance on every hardware setup, but rather about having a good compromise between the effort required to get the code to work on new hardware and the achieved performance. The point is to quickly get an idea of what is theoretically possible and whether further investigation even makes sense.

As I sketch in the talk, changing the hardware does have an impact on questions such as the optimal basis function, numerical algorithm or even the best-suited physical model. Here "suitable" and "optimal" refer to a simulation procedure which agrees with the available computational architecture and at the same time is physically meaningful. A consequence of this observation is that experimentation on all levels (basis functions, algorithms, numerics, linear algebra backend) is required, which often calls for interdisciplinary expertise. In an ideal world experts from different fields can thus work on the same code base, approaching the problem from various different angles.

Without a doubt this is a huge task, and chances are that this goal will in fact never be reached. Still, I think there are some opportunities to get a little closer than we are at present. For this purpose, key aspects are high-level and dynamic programming languages like Julia and python and a clear, modular code design, where individual aspects of the simulation procedure can be easily modified or swapped. In practice, such an approach helps to investigate different basis functions or swap computational backends with only negligible changes to the code base, as I discuss with reference to the molsturm and DFTK codes (see below). The ability to approach the physical problem on a high level allows mathematicians and physicists to interact with the code abstractly, while computer scientists still have the freedom to tweak performance at a lower level.

I have already discussed the basis-function independent quantum chemistry package molsturm and related work a few times, so I'll skip that in this article. Instead I want to detail some aspects of a more recent project, DFTK.jl, the density-functional toolkit. The idea of this code is to be simple and minimalistic, using existing libraries and codes as much as possible. This makes the code a lot more accessible, which facilitates constructing reduced models that can be treated in mathematical proofs. The hope is that the lessons learned can then be scaled, in the same code base, to larger and more realistic problems and an HPC environment. The choice of programming language for this project is Julia, since it is a high-level and dynamic, but strongly typed language with impressive performance and out-of-the-box compatibility with existing libraries written in C++, Fortran, python or R. Using features such as multiple dispatch and JIT (just-in-time) compilation of code, Julia seems to be a big step towards code that is written once and can then be translated many times for specific computational backends or hardware. I already presented about Julia in a previous talk a few weeks ago.

All in all I am very thankful to Andre for giving me the opportunity to gather some thoughts about this matter and eventually present them in this talk. The audience in Lille was surprisingly open to the topic, and to my big surprise many interesting discussions arose. Throughout the afternoon I spent time with PhD students and staff researchers discussing my ideas in the context of their applications, leading to loads of interesting feedback, helpful ideas, questions and comments. The time in Lille truly flew by very quickly, with many aspects still undiscussed after the day. Luckily we all agreed to stay in touch, so that the discussions will likely continue during other meetups in the future.

The slides and the Jupyter notebooks I used during the talk are attached below.

Link Licence
Modern software-development techniques in electronic structure theory (Slides) Creative Commons License
A 5-minute introduction to Julia (Jupyter notebook) GNU GPL v3

by Michael F. Herbst at 2019-06-08 22:00 under talk, electronic structure theory, Julia, HPC, molsturm, DFTK, theoretical chemistry

2019-05-25

Insanity Industries

Random Mac Addresses for Privacy

When connecting to networks, every network device has a defined address by which it can be found in the local network. This is the so-called MAC address or hardware address. It can be found via ip link, yielding something like

2: wlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 […]
    link/ether ma:c_:_a:dd:re:ss brd ff:ff:ff:ff:ff:ff

where you find the ma:c_:_a:dd:re:ss for each interface. This address is typically static and unique, which means that for each network you connect to, the operator of the network can recognize you. It also means that your online activities over that network can be connected to your name, if the operator has access to your name somehow. An example of this could be the Wifi on trains, where you typically have a named ticket. The company would then only have to correlate hardware addresses with the names on the tickets to assign your name to your device. Some supermarkets also experiment with recognizing customers, even if you do not connect to their wifi.

To tackle this problem, one can randomize one’s hardware address regularly. This blogpost shows an exemplary implementation of this with a focus on wifi networks (as that is where you roam through different networks the most).

The setup assumed in use for example configurations throughout this blogpost is iwd and systemd-networkd, but information about other setups might be linked where applicable.

Step 1: Scanning for networks

The first point at which your hardware address gets exposed is through scanning for wifi. The solution for this is simple: just use connection software that randomizes your hardware address for network scans or, even better, uses passive scans, so that no scan is sent from your hardware at all; instead your network chip only listens for which access points are in range. One such program is iwd, which by default only uses passive scans; if an active scan is necessary (for example because the hardware doesn’t support passive scanning), the hardware address is randomized and the SSID for the scan is set to the wildcard instead of leaking your saved networks.

For smartphones this is also simple: iPhones introduced hardware address randomization for scans in iOS 8, and Android can do this since Android 8 (and it is measurably enabled for stock Android on a Moto Z2 Force as well as on LineageOS).

NetworkManager also seems to be capable of hardware randomization, but it must explicitly be enabled.

Step 2: Connecting

Until iwd implements hardware address randomization also upon connect, there are several things that can be done, at least for traditional Linuxes.

Randomizing address at every boot

Using systemd-networkd, we can set up a file /etc/systemd/network/00-wifi.link containing

# /etc/systemd/network/00-wifi.link
[Match]
MACAddress=ma:c_:_a:dd:re:ss

[Link]
Name=wifi
MACAddressPolicy=random

with ma:c_:_a:dd:re:ss being the original hardware address of the interface. This does two things: it gives the interface a nicer name (optional) and it assigns a fully randomized address to the interface every time the device appears (typically every boot).

Randomizing on every rfkill

We can escalate this further by not only randomizing upon every appearance of the interface, but also every time the interface gets switched off via rfkill (either in hardware via a switch or in software via rfkill or a software switch on the computer). If the device is switched off (only then can we change the hardware address), we can happily change its hardware address as we like, for example using macchanger.

To automate that, we can implement a udev-rule by creating a file /etc/udev/rules.d/macchanger.rules containing

# /etc/udev/rules.d/macchanger.rules
SUBSYSTEM=="rfkill", ATTR{name}=="phy0", ACTION=="change", ATTR{soft}=="1", RUN+="/usr/bin/macchanger -ab wifi"

This rule watches for changes in the rfkill subsystem, and if the hardware device for your wifi (typically phy0) is shut down/blocked (ATTR{soft}=="1"), macchanger is invoked on that interface to scramble the hardware address. When the interface comes back up again, you should see that the hardware address has changed.

If you use NetworkManager, there is, as mentioned, also an option to enable network randomization as well.

Remember that this udev rule does not trigger upon the first time the interface comes up. It only triggers each time the interface goes down.

Step 3: Close identifying side channels

Avoid leaking hostname and other details via DHCP

If you want to use hardware address randomization to improve your privacy when roaming different networks, this only works as long as you do not leak other bits of information that can identify your computer. One of these bits of information that is most likely pretty unique to your machine is its hostname, which is typically sent when acquiring an IP address from a DHCP server. While SendHostname=false in systemd-networkd’s [Network] section would merely suppress leaking your hostname to the network operator, you can go further by adding Anonymize=true (docs here) to the [DHCP] section of your network configuration to reduce the amount of identifying information sent to a DHCP server. This will cause systemd-networkd to implement a number of anonymization options for DHCP as outlined in this IETF document, including, but not limited to, not leaking your hostname (and disregarding any setting of SendHostname in the process).
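
In a systemd-networkd setup this amounts to something like the following addition to your .network file (a sketch; the file name is just an example):

# /etc/systemd/network/wifi.network (excerpt)
[DHCP]
# implement the DHCP anonymization options from the IETF document mentioned above;
# this also stops the hostname from being sent, regardless of SendHostname=
Anonymize=true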

With this setup, your interface’s hardware address should change regularly enough to significantly improve your privacy, especially if you visit a network more often than once.

Avoid leaking hardware address upon boot

In case you consider implementing only parts of the proposed setup, remember that without the 00-wifi.link file you will likely leak your hardware address upon boot if you don’t take any other countermeasures (the easiest being said link file), as the udev rule will only trigger the first time the interface goes down.

Final remarks

Setting specific hardware addresses

If you ever happen to be on a network requiring a certain hardware address to permit access, you can set any address you like via macchanger once your interface is rfkilled. As the udev-rule above triggers when the interface goes down, the address you set manually will not be overwritten once you bring the interface back up.
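
For example (a sketch; replace the address and the interface name wifi with your own):

rfkill block wifi                          # take the radio down; the udev rule fires and randomizes
macchanger --mac=02:12:34:56:78:9a wifi    # now set the specific address the network requires
rfkill unblock wifi                        # bring the radio back up; the manual address is kept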

iwd >= 0.18

If you use iwd version 0.18 or higher, you currently have to disable iwd’s new interface handling by adding

[General]
use_default_interface=true

to /etc/iwd/main.conf to be able to manage your interface via systemd-networkd and macchanger until iwd implements hardware address randomization also for the connect-stage.

Android and iOS

With a possible arrival of iwd on Android devices (in the far future, and stressing the "possible"), the option of randomizing hardware addresses for connections as well might come onto the table, but until that happens I’m not aware of any easily accessible option to do so at the time of writing.

Revision history

8th July 2019:

  • Recommend Anonymize=true for DHCP-networks instead of plain SendHostname=false

by Jonas Große Sundrup at 2019-05-25 10:55

2019-05-25

RaumZeitLabor

RaumZeitLa La Lab

Ha ha hello, dear RaumZeitLa La Lab members,

You think maths is great? You enjoy listening to music? Or is your AlgoRhythm grooving and you even like both? Perfect! Then you should definitely not miss the next outing.

On Friday, 7 June 2019, starting at 4 pm, we will visit the Mathematik-Informatik-Station (MAINS for short) in Heidelberg and take a look at the interactive exhibition La La Lab. There are animations and laser installations all about the fascinating relationship between maths and music, you can compose and make music together with various AIs, and at the end all exhibits can even be taken home digitally.

You want to join the guided tour? Then please send me an email by 3 June. As always, not-yet-members are welcome to come along too.

Best regards
Flederra ra rattie

by flederrattie at 2019-05-25 00:00

2019-05-23

sECuREs website

Optional dependencies don’t work

In the i3 projects, we have always tried hard to avoid optional dependencies. There are a number of reasons behind it, and as I have recently encountered some of the downsides of optional dependencies firsthand, I summarized my thoughts in this article.

What is a (compile-time) optional dependency?

When building software from source, most programming languages and build systems support conditional compilation: different parts of the source code are compiled based on certain conditions.

An optional dependency is conditional compilation hooked up directly to a knob (e.g. command line flag, configuration file, …), with the effect that the software can now be built without an otherwise required dependency.
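
With an autoconf-style build system this typically looks something like the following (made-up package and flag names):

# both commands build "the same" program, but with different feature sets:
./configure --enable-printing    # requires libprint at build time, compiles the printing code path
./configure --disable-printing   # builds without libprint; the printing code path is omitted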

Let’s walk through a few issues with optional dependencies.

Inconsistent experience in different environments

Software is usually not built by end users, but by packagers, at least when we are talking about Open Source.

Hence, end users don’t see the knob for the optional dependency, they are just presented with the fait accompli: their version of the software behaves differently than other versions of the same software.

Depending on the kind of software, this situation can be made obvious to the user: for example, if the optional dependency is needed to print documents, the program can produce an appropriate error message when the user tries to print a document.

Sometimes, this isn’t possible: when i3 introduced an optional dependency on cairo and pangocairo, the behavior itself (rendering window titles) worked in all configurations, but non-ASCII characters might break depending on whether i3 was compiled with cairo.

For users, it is frustrating to only discover in conversation that a program has a feature that the user is interested in, but it’s not available on their computer. For support, this situation can be hard to detect, and even harder to resolve to the user’s satisfaction.

Packaging is more complicated

Unfortunately, many build systems don’t stop the build when optional dependencies are not present. Instead, you sometimes end up with a broken build, or, even worse: with a successful build that does not work correctly at runtime.

This means that packagers need to closely examine the build output to know which dependencies to make available. In the best case, there is a summary of available and enabled options, clearly outlining what this build will contain. In the worst case, you need to infer the features from the checks that are done, or work your way through the --help output.

The better alternative is to configure your build system such that it stops when any dependency was not found, and thereby have packagers acknowledge each optional dependency by explicitly disabling the option.

Untested code paths bit rot

Code paths which are not used will inevitably bit rot. If you have optional dependencies, you need to test both the code path without the dependency and the code path with the dependency. It doesn’t matter whether the tests are automated or manual, the test matrix must cover both paths.

Interestingly enough, this principle seems to apply to all kinds of software projects (but it slows down as change slows down): one might think that important Open Source building blocks should have enough users to cover all sorts of configurations.

However, consider this example: building cairo without libxrender results in all GTK application windows, menus, etc. being displayed as empty grey surfaces. Cairo does not fail to build without libxrender, but the code path clearly is broken without libxrender.

Can we do without them?

I’m not saying optional dependencies should never be used. In fact, for bootstrapping, disabling dependencies can save a lot of work and can sometimes allow breaking circular dependencies. For example, in an early bootstrapping stage, binutils can be compiled with --disable-nls to disable internationalization.

However, optional dependencies are broken so often that I conclude they are overused. Read on and see for yourself whether you would rather commit to best practices or not introduce an optional dependency.

Best practices

If you do decide to make dependencies optional, please:

  1. Set up automated testing for all code path combinations.
  2. Fail the build until packagers explicitly pass a --disable flag.
  3. Tell users their version is missing a dependency at runtime, e.g. in --version.

at 2019-05-23 00:00

2019-05-22

michael-herbst.com

Coulomb Sturmian basis functions in electronic structure theory

Earlier this month, on 3rd of May, I was invited for a day to the Laboratoire de Physique et Chimie Théoriques of the Université de Lorraine in Metz. Unlike my earlier stay in Metz during the MES conference last August, I did not manage to enjoy this beautiful city much. Instead I spent my day with science at the lab and presented at their seminar in a nicely informal setting: around a communal table, eating brioche and drinking coffee.

I have known my host, Ugo Ancarani, since the MES conference. His research mostly revolves around the modelling of collisions, more precisely the interaction of light and matter leading to the ejection of one or more electrons. Similar to my research in molecular problems, he employs a Sturmian-based approach to solve the problem. It was quite interesting to discuss similarities and differences between the physics of bound states and of collision problems, a theoretical setting I have not yet spent a lot of time thinking about.

In general I was quite surprised by the large number of research directions, which are dealt with in this lab. The range covers for example quantum dynamics, collision theory, biochemistry, solvation and mixed phase problems. I did not manage to chat with everyone, unfortunately, but I think I managed to catch at least some insight into the research in Metz.

Originating from the relation to Ugo's work, I ended up presenting our recent Coulomb-Sturmian convergence results at HF level with some outlook into future directions. Overall I really enjoyed the atmosphere at the seminar and the lab and I already look forward to another time when I might return to Metz. As usual my slides are attached below.

Link Licence
Coulomb Sturmian basis functions in electronic structure theory (Slides seminar talk) Creative Commons License

by Michael F. Herbst at 2019-05-22 17:00 under talk, electronic structure theory, Coulomb Sturmians, convergence, theoretical chemistry

2019-04-28

RaumZeitLabor

The Bembel of Death - What Hessians eat... and drink

Bembelblume

Ei Gude wie (Hessian for "hello, how are you?"), dear RaumZeitLab members?

Do you already have plans for 11 May 2019? Not yet? Perfect! Then rinse your finest Geripptes (the traditional ribbed cider glass), shake out your Batschkapp (flat cap) and come to Boveristraße at 6 pm for the Hessian evening!

Alongside culinary delicacies ("Delikathessen") such as Rippche mit Kraut (cured pork ribs with sauerkraut) and Grie Soß (green sauce) with potatoes, we will also serve up cinematic highlights from our neighbouring federal state. True to the motto "An Äppler a day keeps the doctor away", a good Stöffche (apple wine) will of course not be missing either.

Don't be a Hannebambel and sign up by email by 8 May 2019, letting us know whether you will bring along an Äbbelboi or a Prinzhessin. Please put at least 10 euros of Schoppe money per head into the till at the end of the evening.

See you there! Your Bembel-Bobbelsche

by flederrattie at 2019-04-28 00:00

2019-04-17

michael-herbst.com

Julia lightning talk

Yesterday I gave a short lightning talk at the Inria Devmeetup about Julia. My intention was to show a few of the design aspects of this rather recent programming language and, motivated by my experiences from my current PostDoc project in the MATHERIALS team, hint at interesting ongoing developments. For this I chose to show some (hopefully) representative code examples in a Jupyter notebook, interactively running the Julia code using an IJulia kernel.

To demonstrate Julia's multiple dispatch paradigm, I hinted at how functionality (like the mysquare function in my example) can easily be implemented in a way such that the code can be combined with multiple different computational backends. Such backends include distributed array storage for large chunks of data, or arrays of static size, where the size information may be used at compile time to speed up the byte code for small problems. GPU backends are possible as well, but I did not go into this in my presentation. With respect to speed I showed some timings from a heat equation example (courtesy Antoine Levitt) comparing a Julia, a python and a C implementation. Last but not least I hinted at the interoperability with python packages and Fortran code and showed a plotting example, where I used Zygote to automatically compute and show the gradient of a function using adjoint-mode automatic differentiation.

The notebook I used for the presentation is both attached below and can be found on github as well. For a rendered version of the notebook you can open it in nbviewer.

Link Licence
Julia: A numerical programming language (Notebook lightning talk) GNU GPL v3
View notebook in nbviewer GNU GPL v3

by Michael F. Herbst at 2019-04-17 09:00 under talk, Julia, programming and scripting, Inria Devmeetup

2019-04-12

michael-herbst.com

Coulomb Sturmian basis functions in electronic structure theory

After my first Coulomb Sturmian-related talk at the LCT last November, I was approached by Julien Toulouse and asked whether I could give another talk providing a little more insight into Coulomb Sturmians and their potential.

A few weeks ago, on 29th March, it was thus about time for a second Coulomb Sturmian seminar at the LCT. In my recent talk I focused more on stressing the differences between Coulomb Sturmians and Gaussian basis sets and, motivated by our convergence results at HF level, provided some hints about what could be expected from Coulomb Sturmians in the future. Naturally this last part turned out to be rather far-fetched and speculative, and little if any bullet-proof evidence for the mentioned aspects can be given at the moment.

I really enjoyed giving this talk in the rather informal setting that arose, where many stimulating questions came up in the discussion, with plenty of opportunities for further research. As usual the slides are attached below.

Link Licence
Coulomb Sturmian basis functions in electronic structure theory (Slides seminar talk) Creative Commons License

by Michael F. Herbst at 2019-04-12 09:00 under talk, electronic structure theory, Coulomb Sturmians, convergence, theoretical chemistry

2019-03-21

Insanity Industries

simple wifi setup with iwd and networkd

The iNet Wireless Daemon (or iwd for short) is the new superior tool for managing wireless devices on Linux. The reasons why, and other details of iwd, shall be left to others.

This post shall give a brief introduction to a simple, functional iwd-based setup. Make sure you have disabled all alternative services that could interfere with this setup.

Network setup

Requirements

  • Linux >= 4.20 (at least for things like eduroam or other EAP-wifis)
  • iwd
  • systemd-networkd
  • NetworkManager/Connman/other GUI interface (optional, potentially as an alternative to systemd-networkd)

iwd

Start iwd. On systemd-based systems this is most likely just a simple systemctl enable iwd && systemctl start iwd. One can verify that iwd runs by issuing iwctl. If it can connect to iwd, you’ll get an iwctl shell.

Furthermore, starting with iwd 0.18, you have to create a file /etc/iwd/main.conf containing

[General]
use_default_interface=true

due to iwd’s interface lifecycle handling; otherwise your default interface will be removed by iwd and networkd won’t be able to handle it correctly, at least not when you use the interface’s Name for matching. If you use a [Match] section compatible with iwd’s new interface lifecycle handling, this is of course not necessary.

systemd-networkd

Create a file /etc/systemd/network/wifi.network containing

[Match]
Name=< name of your wifi interface or * for every interface >

[Network]
DHCP=yes
IPv6PrivacyExtensions=true

and enable systemd-networkd. Now you have iwd bringing your wifi up and systemd-networkd getting you an IP via DHCP on that interface quickly afterwards.

At some point iwd intends to implement DHCP as well, but as of writing this, this is not yet the case and needs to be done by e.g. systemd-networkd.

Remarks to GUIs

If you use NetworkManager, you have to enable its iwd backend. In addition to that, double-check that you have the right NetworkManager version for your iwd version. As iwd has not yet reached version 1.0 as of the time of writing, the API can still be subject to change if it turns out that things need to be changed to prevent headaches in the future.

Working with it

You can now connect to simple PSK-wifi-networks in the iwctl-shell:

[iwctl] station <devicename> get-networks
…
[iwctl] station <devicename> connect network-name

iwd will ask you for the password, memorize it for later connections and autoconnect the next time the network appears.

file-based config

If you have more complex wifi setups, you can place a configuration file in /var/lib/iwd. The files must be named networkname.protocol. You can find the protocol listed in the output of get-networks in the iwctl shell. To fill the file, take a look at the network configuration settings in the iwd documentation.

Example: eduroam

To get eduroam running (at least for institutions using TTLS as the EAP method and MSCHAPv2 for the inner authentication), create the file /var/lib/iwd/eduroam.8021x containing, for example,

[Security]
EAP-Method=TTLS
EAP-TTLS-Phase2-Method=MSCHAPV2
EAP-TTLS-CACert=<certificate.pem>
EAP-Identity=<anonymous-identity>
EAP-TTLS-Phase2-Identity=<username>
EAP-TTLS-Phase2-Password=<password>

While MSCHAPv2 has been broken by now, the available alternatives for the inner authentication are based on MD5, send cleartext directly over the wire (PAP), or use similarly bad methods, which unfortunately makes MSCHAPv2 seem to be the best choice if available.

For the University Heidelberg for example, fill the template with

  • <certificate.pem>: /etc/ssl/certs/T-TeleSec_GlobalRoot_Class_2.pem
  • <anonymous-identity>: eduroamHDaoc2019@uni-heidelberg.de
  • <username>: <uni-id>@uni-heidelberg.de
  • <password>: your University account password

Final remarks

You might want to take a look at the previous post to deal with race conditions that might be introduced due to iwd being significantly faster than other wifi-solutions on Linux.

Revision history

13th June 2019:

  • Example above changed from PAP, which is a cleartext protocol, to MSCHAPv2, which is slightly less bad, for inner authentication
  • Updated example configs to reflect changes in eduroam setup for Heidelberg University

20th June 2019:

  • add section about /etc/iwd/main.conf to keep interoperability with the systemd-networkd-config in this post.

12th September 2019:

  • Corrected typo in certificate path

by Jonas Große Sundrup at 2019-03-21 00:00

2019-03-17

michael-herbst.com

ctx: Key-value datastructure for organised hierarchical storage

As an aside from my plane-wave DFT project at CERMICS, I spent some time in the last few weeks polishing up the ctx context library and publishing it via github. The ctx library provides one approach to tackling a common challenge in larger scientific simulation codes, namely how to organise simulation data.

Often each substep of a larger simulation code by itself already requires a large amount of input data in order to produce its results. To increase the flexibility and modularity of the code base, the data and parameters required for one part of the procedure are best hidden from the respective others. Naturally, complete separation cannot be achieved, since the steps do need to exchange some results or input. The question is therefore where and how to find a good balance between accessing data from other simulation steps and shielding it.

Typically, simulation procedures are inherently hierarchical. E.g. the computation of a particular property of an electronic structure might be achieved by solving a linear-response equation, which in turn requires solving a linear system of equations, e.g. by a contraction-based iterative algorithm, which in turn requires the computation of matrix-vector products. In such a hierarchy of algorithms, data only needs to flow up or down, but not sideways. This is to say that computing a second property via another linear-response equation might again use a similar tree of methods, but does not necessarily need access to all individual details of the first tree. This immediately leads to the approach of a tree-like, hierarchical storage for data and parameters, where a simulation substep may only work on its own subtree, but not on any of its parents or siblings.

This is where the ctx library comes in. It offers a C++ implementation of a tree-like string-to-value mapping, called CtxMap, which can store data of arbitrary type. In line with the above approach, a key in a CtxMap is a path-like string such as /this/is/a/path/to/a/value. Functionality such as views into subtrees or C++ iterators over ranges of keys greatly facilitates navigating and accessing the data in the CtxMap from different parts of the code. Since all stored objects are held as std::shared_ptr objects, memory safety as well as good integration with the C++ standard library are assured.

Originally I started working on ctx when I wanted to connect our molsturm package to the adcman code, which is also developed in the Dreuw group in Heidelberg. In the design of ctx I took strong inspiration from both the PamMap and GenMap data structures I have been playing with, as well as the libctx library by E. Epifanovsky et al. The latter code had similar goals of providing hierarchical storage of data and was used for this purpose inside the Q-Chem quantum chemistry code. In this respect I am very happy that ctx has now become the successor of libctx inside Q-Chem, providing ctx with an application in the context of a larger code base.

For further details and the ctx source code, see the ctx project page on github. ctx can be cited using DOI 10.5281/zenodo.2590706.

by Michael F. Herbst at 2019-03-17 23:00 under programming and scripting, C++

2019-03-12

Insanity Industries

Racefree iwd

The iNet Wireless Daemon (or iwd for short) is the new superior tool for managing wireless devices on Linux. The reasons why, and other details of iwd, shall be left to others.

iwd has, at the time of writing, two major issues:

  • requiring a comparatively recent Linux version (>= 4.20) due to upstream bugfixes in its crypto subsystem
  • being so fast that it might bring up the wireless interface before it can be renamed by udev

While the first issue will resolve itself over time, the second one is a little harder to tackle.

The Problem

When a Linux system boots up, all the interfaces get initialized. They are named by Linux in the order they show up, typically eth0, eth1, etc. for wired cards and wlan0, wlan1, etc. for wireless cards. Which card gets which identifier depends solely on the order in which they get noticed by Linux, and it isn’t even guaranteed that wireless cards will be named wlan*; on some systems they also end up with an eth* identifier. Therefore every piece of software doing things with interfaces based on their names is at risk of picking the wrong network card at some point.

To tackle this problem, systemd introduced persistent naming. Now your network cards are no longer named eth0, eth1 and wlan0, but enp2s5 or something like that, generated from properties of the network card. Much less beautiful, but persistent and therefore very desirable, fixing the race condition described above – however, doing so by introducing another race condition.

The reason for this is that the rename can only happen while the network interface is down. iwd, however, being much faster than the alternatives (namely wpa_supplicant and everything that’s built on it), might already bring the wireless interface up before udev has had the chance to rename it; the rename is then no longer possible, leaving the interface with the (race-condition-afflicted) kernel name.

The Solution

To solve this issue, we can delay starting iwd until the rename has happened, removing the second race condition from the mix. To do so we first need to find the name of our wireless interface via ip link, yielding something like

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT ...
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: wlp3s2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN ...
    link/ether 8c:7b:83:7d:04:82 brd ff:ff:ff:ff:ff:ff
3: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode ...
    link/ether e4:2d:9a:bb:e4:f4 brd ff:ff:ff:ff:ff:ff

The wireless device here is named wlp3s2, the wired one enp4s0. In systemd, both cards get a device unit upon appearance; for the wireless device this is sys-subsystem-net-devices-wlp3s2.device. They can be found in the unit list one gets from systemctl list-units.

To ensure iwd is only started after the wireless card has been renamed to wlp3s2, we just need to make the unit starting it wait for said device unit. This is most easily accomplished by issuing systemctl edit iwd.service and entering:

[Unit]
Requires=sys-subsystem-net-devices-wlp3s2.device
After=sys-subsystem-net-devices-wlp3s2.device

systemd will then dynamically add those options to the iwd.service file when loading it, making it wait for the device wlp3s2 and start only after that unit is done setting up, by which time the rename has already been performed.

This, of course, requires manual intervention and only works for a single wireless network card, but it should suffice for most setups and, while not removing the root of the race condition, resolves its symptoms with iwd.

by Jonas Große Sundrup at 2019-03-12 00:00

2019-03-10

sECuREs website

Winding down my Debian involvement

This post is hard to write, both in the emotional sense but also in the “I would have written a shorter letter, but I didn’t have the time” sense. Hence, please assume the best of intentions when reading it—it is not my intention to make anyone feel bad about their contributions, but rather to provide some insight into why my frustration level ultimately exceeded the threshold.

Debian has been in my life for well over 10 years at this point.

A few weeks ago, I visited some old friends at the Zürich Debian meetup after a multi-year period of absence. On my bike ride home, it occurred to me that the topics of our discussions had remarkable overlap with my last visit. We had a discussion about the merits of systemd, which took a detour to respect in open source communities, returned to processes in Debian and eventually culminated in democracies and their theoretical/practical failings. Admittedly, that last one might be a Swiss thing.

I say this not to knock on the Debian meetup, but because it prompted me to reflect on what feelings Debian is invoking lately and whether it’s still a good fit for me.

So I’m finally making a decision that I should have made a long time ago: I am winding down my involvement in Debian to a minimum.

What does this mean?

Over the coming weeks, I will:

  • transition packages to be team-maintained where it makes sense
  • remove myself from the Uploaders field on packages with other maintainers
  • orphan packages where I am the sole maintainer

I will try to keep up best-effort maintenance of the manpages.debian.org service and the codesearch.debian.net service, but any help would be much appreciated.

For all intents and purposes, please treat me as permanently on vacation. I will try to be around for administrative issues (e.g. permission transfers) and questions addressed directly to me, provided they are easy enough to answer.

Why?

When I joined Debian, I was still studying, i.e. I had luxurious amounts of spare time. Now, over 5 years of full-time work later, my day job has taught me a lot, both about what works in large software engineering projects and about how I personally like my computer systems. I am very conscious of how I spend the little spare time that I have these days.

The following sections each deal with what I consider a major pain point, in no particular order. Some of them influence each other—for example, if changes worked better, we could have a chance at transitioning packages to be more easily machine readable.

Change process in Debian

Over the last few years, my team at work has conducted various smaller and larger refactorings across the entire code base (touching thousands of projects), so we have learnt a lot of valuable lessons about how to do such changes effectively. It irks me that Debian works almost the opposite way in every regard. I appreciate that every organization is different, but I think a lot of my points do actually apply to Debian.

In Debian, packages are nudged in the right direction by a document called the Debian Policy, or its programmatic embodiment, lintian.

While it is great to have a lint tool (for quick, local/offline feedback), it is even better to not require a lint tool at all. The team conducting the change (e.g. the C++ team introducing a new hardening flag for all packages) should be able to do their work transparently to me.

Instead, currently, all packages become lint-unclean, all maintainers need to read up on what the new thing is, how it might break, whether/how it affects them, manually run some tests, and finally decide to opt in. This causes a lot of overhead and manually executed mechanical changes across packages.

Notably, the cost of each change is distributed onto the package maintainers in the Debian model. At work, we have found that the opposite works better: if the team behind the change is put in power to do the change for as many users as possible, they can be significantly more efficient at it, which reduces the total cost and time a lot. Of course, exceptions (e.g. a large project abusing a language feature) should still be taken care of by the respective owners, but the important bit is that the default should be the other way around.

Debian is lacking tooling for large changes: it is hard to programmatically deal with packages and repositories (see the section below). The closest to “sending out a change for review” is to open a bug report with an attached patch. I thought the workflow for accepting a change from a bug report was too complicated and started mergebot, but only Guido ever signaled interest in the project.

Culturally, reviews and reactions are slow. There are no deadlines. I literally sometimes get emails notifying me that a patch I sent out a few years ago (!!) is now merged. This turns projects from a small number of weeks into many years, which is a huge demotivator for me.

Interestingly enough, you can see artifacts of the slow online activity manifest themselves in the offline culture as well: I don’t want to be discussing systemd’s merits 10 years after I first heard about it.

Lastly, changes can easily be slowed down significantly by holdouts who refuse to collaborate. My canonical example for this is rsync, whose maintainer refused my patches to make the package use debhelper purely out of personal preference.

Granting so much personal freedom to individual maintainers prevents us as a project from raising the abstraction level for building Debian packages, which in turn makes tooling harder.

How would things look in a better world?

  1. As a project, we should strive towards more unification. Uniformity still does not rule out experimentation, it just changes the trade-off from easier experimentation and harder automation to harder experimentation and easier automation.
  2. Our culture needs to shift from “this package is my domain, how dare you touch it” to a shared sense of ownership, where anyone in the project can easily contribute (reviewed) changes without necessarily even involving individual maintainers.

To learn more about what successful large changes can look like, I recommend my colleague Hyrum Wright’s talk “Large-Scale Changes at Google: Lessons Learned From 5 Yrs of Mass Migrations”.

Fragmented workflow and infrastructure

Debian generally seems to prefer decentralized approaches over centralized ones. For example, individual packages are maintained in separate repositories (as opposed to in one repository), each repository can use any SCM (git and svn are common ones) or no SCM at all, and each repository can be hosted on a different site. Of course, what you do in such a repository also varies subtly from team to team, and even within teams.

In practice, non-standard hosting options are used rarely enough to not justify their cost, but frequently enough to be a huge pain when trying to automate changes to packages. Instead of using GitLab’s API to create a merge request, you have to design an entirely different, more complex system, which deals with intermittently (or permanently!) unreachable repositories and abstracts away differences in patch delivery (bug reports, merge requests, pull requests, email, …).

Wildly diverging workflows are not just a temporary problem either. I participated in long discussions about different git workflows during DebConf 13, and gather that there have been similar discussions in the meantime.

Personally, I cannot keep enough details of the different workflows in my head. Every time I touch a package that works differently than mine, it frustrates me immensely to re-learn aspects of my day-to-day.

After noticing workflow fragmentation in the Go packaging team (which I started), I tried fixing this with the workflow changes proposal, but did not succeed in implementing it. The lack of effective automation and the slow pace of changes in the surrounding tooling, despite my willingness to contribute time and energy, killed any motivation I had.

Old infrastructure: package uploads

When you want to make a package available in Debian, you upload GPG-signed files via anonymous FTP. There are several batch jobs (the queue daemon, unchecked, dinstall, possibly others) which run on fixed schedules (e.g. dinstall runs at 01:52 UTC, 07:52 UTC, 13:52 UTC and 19:52 UTC).

Depending on timing, I estimated that you might wait for over 7 hours (!!) before your package is actually installable.

What’s worse for me is that feedback to your upload is asynchronous. I like to do one thing, be done with it, move to the next thing. The current setup requires a many-minute wait and costly task switch for no good technical reason. You might think a few minutes aren’t a big deal, but when all the time I can spend on Debian per day is measured in minutes, this makes a huge difference in perceived productivity and fun.

The last communication I can find about speeding up this process is ganneff’s post from 2008.

How would things look in a better world?

  1. Anonymous FTP would be replaced by a web service which ingests my package and returns an authoritative accept or reject decision in its response.
  2. For accepted packages, there would be a status page displaying the build status and when the package will be available via the mirror network.
  3. Packages should be available within a few minutes after the build completed.

Old infrastructure: bug tracker

I dread interacting with the Debian bug tracker. debbugs is a piece of software (from 1994) which is only used by Debian and the GNU project these days.

Debbugs processes emails, which is to say it is asynchronous and cumbersome to deal with. Despite running on the fastest machines we have available in Debian (or so I was told when the subject last came up), its web interface loads very slowly.

Notably, the web interface at bugs.debian.org is read-only. Setting up a working email setup for reportbug(1) or manually dealing with attachments is a rather big hurdle.

For reasons I don’t understand, every interaction with debbugs results in many different email threads.

Aside from the technical implementation, I also can never remember the different ways that Debian uses pseudo-packages for bugs and processes. I need them too rarely to establish a mental model of how they are set up, or working memory of how they are used, but frequently enough to be annoyed by this.

How would things look in a better world?

  1. Debian would switch from a custom bug tracker to a (any) well-established one.
  2. Debian would offer automation around processes. It is great to have a paper-trail and artifacts of the process in the form of a bug report, but the primary interface should be more convenient (e.g. a web form).

Old infrastructure: mailing list archives

It baffles me that in 2019, we still don’t have a conveniently browsable threaded archive of mailing list discussions. Email and threading is more widely used in Debian than anywhere else, so this is somewhat ironic. Gmane used to paper over this issue, but Gmane’s availability over the last few years has been spotty, to say the least (it is down as I write this).

I tried to contribute a threaded list archive, but our listmasters didn’t seem to care or want to support the project.

Debian is hard to machine-read

While it is obviously possible to deal with Debian packages programmatically, the experience is far from pleasant. Everything seems slow and cumbersome. I have picked just 3 quick examples to illustrate my point.

debiman needs help from piuparts in analyzing the alternatives mechanism of each package to display the manpages of e.g. psql(1). This is because maintainer scripts modify the alternatives database by calling shell scripts. Without actually installing a package, you cannot know which changes it makes to the alternatives database.

pk4 needs to maintain its own cache to look up package metadata based on the package name. Other tools parse the apt database from scratch on every invocation. A proper database format, or at least a binary interchange format, would go a long way.

Debian Code Search wants to ingest new packages as quickly as possible. There used to be a fedmsg instance for Debian, but it no longer seems to exist. It is unclear where to get notifications from for new packages, and where best to fetch those packages.

Complicated build stack

See my “Debian package build tools” post. It really bugs me that the sprawl of tools is not seen as a problem by others.

Developer experience pretty painful

Most of the points discussed so far deal with the experience in developing Debian, but as I recently described in my post “Debugging experience in Debian”, the experience when developing using Debian leaves a lot to be desired, too.

I have more ideas

At this point, the article is getting pretty long, and hopefully you got a rough idea of my motivation.

While I described a number of specific shortcomings above, the final nail in the coffin is actually the lack of a positive outlook. I have more ideas that seem really compelling to me, but, based on how my previous projects have been going, I don’t think I can make any of these ideas happen within the Debian project.

I intend to publish a few more posts about specific ideas for improving operating systems here. Stay tuned.

Lastly, I hope this post inspires someone, ideally a group of people, to improve the developer experience within Debian.

at 2019-03-10 00:00

2019-02-15

sECuREs website

Debugging experience in Debian

Recently, a user reported that they don’t see window titles in i3 when running i3 on a Raspberry Pi with Debian.

I copied the latest Raspberry Pi Debian image onto an SD card, booted it, and was able to reproduce the issue.

Conceptually, at this point, I should be able to install and start gdb, set a break point and step through the code.

Enabling debug symbols in Debian

Debian, by default, strips debug symbols when building packages to conserve disk space and network bandwidth. The motivation is very reasonable: most users will never need the debug symbols.

Unfortunately, obtaining debug symbols when you do need them is unreasonably hard.

We begin by configuring an additional apt repository which contains automatically generated debug packages:

raspi# cat >>/etc/apt/sources.list.d/debug.list <<'EOT'
deb http://deb.debian.org/debian-debug buster-debug main contrib non-free
EOT
raspi# apt update

Notably, not all Debian packages have debug packages. As the DebugPackage Debian Wiki page explains, debhelper/9.20151219 started generating debug packages (ending in -dbgsym) automatically. Packages which have not been updated might come with their own debug packages (ending in -dbg) or might not preserve debug symbols at all!

Now that we can install debug packages, how do we know which ones we need?

Finding debug symbol packages in Debian

For debugging i3, we obviously need at least the i3-dbgsym package, but i3 uses a number of other libraries through whose code we may need to step.

The debian-goodies package ships a tool called find-dbgsym-packages which prints the required packages to debug an executable, core dump or running process:

raspi# apt install debian-goodies
raspi# apt install $(find-dbgsym-packages $(which i3))

Now we should have symbol names and line number information available in gdb. But for effectively stepping through the program, access to the source code is required.

Obtaining source code in Debian

Naively, one would assume that apt source should be sufficient for obtaining the source code of any Debian package. However, apt source defaults to the package candidate version, not the version you have installed on your system.

I have addressed this issue with the pk4 tool, which defaults to the installed version.

Before we can extract any sources, we need to configure yet another apt repository:

raspi# cat >>/etc/apt/sources.list.d/source.list <<'EOT'
deb-src http://deb.debian.org/debian buster main contrib non-free
EOT
raspi# apt update

Regardless of whether you use apt source or pk4, one remaining problem is the directory mismatch: the debug symbols contain a certain path, and that path is typically not where you extracted your sources to. While debugging, you will need to tell gdb about the location of the sources. This is tricky when you debug a call across different source packages:

(gdb) pwd
Working directory /usr/src/i3.
(gdb) list main
229     * the main loop. */
230     ev_unref(main_loop);
231   }
232 }
233
234 int main(int argc, char *argv[]) {
235  /* Keep a symbol pointing to the I3_VERSION string constant so that
236   * we have it in gdb backtraces. */
237  static const char *_i3_version __attribute__((used)) = I3_VERSION;
238  char *override_configpath = NULL;
(gdb) list xcb_connect
484	../../src/xcb_util.c: No such file or directory.

See Specifying Source Directories in the gdb manual for the dir command which allows you to add multiple directories to the source path. This is pretty tedious, though, and does not work for all programs.

Positive example: Fedora

While Fedora conceptually shares all the same steps, the experience on Fedora is so much better: when you run gdb /usr/bin/i3, it will tell you what the next step is:

# gdb /usr/bin/i3
[…]
Reading symbols from /usr/bin/i3...(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install i3-4.16-1.fc28.x86_64

Watch what happens when we run the suggested command:

# dnf debuginfo-install i3-4.16-1.fc28.x86_64
enabling updates-debuginfo repository
enabling fedora-debuginfo repository
[…]
Installed:
  i3-debuginfo.x86_64 4.16-1.fc28
  i3-debugsource.x86_64 4.16-1.fc28
Complete!

A single command understood our intent, enabled the required repositories and installed the required packages, both for debug symbols and source code (stored in e.g. /usr/src/debug/i3-4.16-1.fc28.x86_64). Unfortunately, gdb doesn’t seem to locate the sources, which seems like a bug to me.

One downside of Fedora’s approach is that gdb will only print all required dependencies once you actually run the program, so you may need to run multiple dnf commands.

In an ideal world

Ideally, none of the manual steps described above would be necessary. It seems absurd to me that so much knowledge is required to efficiently debug programs in Debian. Case in point: I only learnt about find-dbgsym-packages a few days ago when talking to one of its contributors.

Installing gdb should be all that a user needs to do. Debug symbols and sources can be transparently provided through a lazy-loading FUSE file system. If our build/packaging infrastructure assured predictable paths and automated debug symbol extraction, we could have transparent, quick and reliable debugging of all programs within Debian.

NixOS’s dwarffs is an implementation of this idea: https://github.com/edolstra/dwarffs

Conclusion

While I agree with the removal of debug symbols as a general optimization, I think every Linux distribution should strive to provide an entirely transparent debugging experience: you should not even have to know that debug symbols are not present by default. Debian really falls short in this regard.

Getting Debian to a fully transparent debugging experience requires a lot of technical work and a lot of social convincing. In my experience, programmatically working with the Debian archive and packages is tricky, and ensuring that all packages in a Debian release have debug packages (let alone predictable paths) seems entirely unachievable due to the fragmentation of packaging infrastructure and holdouts blocking any progress.

My go-to example is rsync’s debian/rules, which intentionally (!) still has not adopted debhelper. It is not a surprise that there are no debug symbols for rsync in Debian.

at 2019-02-15 00:00

2019-02-12

Insanity Industries

Self-hosting Threema Safe

What is Threema Safe?

Threema Safe is the messenger Threema’s solution to back up your encryption ID and other things so they don’t get lost once either you decide to smash your phone against a nearby anvil or the phone itself decides to die unexpectedly. From this backup you can restore the aforementioned things into a new Threema-installation.

Threema Safe can also be selfhosted.

How does it work?

Threema Safe regularly backs things up into a WebDAV directory of your choice. By default, Threema’s own server is configured for this, but any WebDAV server, from a standard webserver to a Nextcloud, can be configured as a backup target.

Server-side setup

Since a standard webserver can be used as a backup target, we will show the configuration necessary for nginx to act as a valid backup server for Threema Safe.

The main configuration file of the server, /etc/nginx/nginx.conf, reads

user www-data;
worker_processes 4;
pid /run/nginx.pid;

# nginx refuses to start without an events section
events {}

http {

    # standard stuff
    include /etc/nginx/mime.types;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    ssl_dhparam /path/to/dhparams;

    server {

        server_name some.arbitrary.domain;

        listen 443 ssl default_server;      # ipv4
        listen [::]:443 ssl default_server; # ipv6

        location / {
            root    /path/to/webdav/contents;
            client_body_temp_path /tmp;
            dav_methods     PUT DELETE MKCOL COPY MOVE;

            create_full_put_path  on;
            dav_access    user:rw;
            autoindex    on;

            # if desired, a htpasswd-file can be provided to require
            # authentication in front of the nginx
            auth_basic "you shall not pass";
            auth_basic_user_file /path/to/htpasswd;
        }
    }
}

In the root of the WebDAV folder, create a folder (optional) with two things present (not optional):

  • a file config
  • a directory backups that must be writable (and likely readable for restore)

The config-file reads:

{
   "maxBackupBytes": 524288,
   "retentionDays": 180
}

This concludes the server-side setup. If you want to verify that the WebDAV share works correctly, you can open it in your browser and check that it responds as expected.

# tree /path/to/webdav/contents
/path/to/webdav/contents
└── threema_safe
    ├── backups
    │   └── b5bb9d8014a0f9b1d…12f4850b878ae4944c   (only present after the first backup)
    └── config

Client side setup

In Threema, go to the menu and select “My backups”. Activate Threema Safe and tap on “Expert Settings”, where you can enter your custom server. In our example, this would be https://some.arbitrary.domain/threema_safe. If you configured your nginx to require authentication, specify your access credentials. Then tap “Backup now” and you should be good to go!

by Jonas Große Sundrup at 2019-02-12 00:00

2019-02-05

sECuREs website

TurboPFor: an analysis

Motivation

I have recently been looking into speeding up Debian Code Search. As a quick reminder, search engines answer queries by consulting an inverted index: a map from term to documents containing that term (called a “posting list”). See the Debian Code Search Bachelor Thesis (PDF) for a lot more details.

Currently, Debian Code Search does not store positional information in its index, i.e. the index can only reveal that a certain trigram is present in a document, not where or how often.

From analyzing Debian Code Search queries, I knew that identifier queries (70%) massively outnumber regular expression queries (30%). When processing identifier queries, storing positional information in the index enables a significant optimization: instead of identifying the possibly-matching documents and having to read them all, we can determine matches from querying the index alone, no document reads required.

This moves the bottleneck: having to read all possibly-matching documents requires a lot of expensive random I/O, whereas having to decode long posting lists requires a lot of cheap sequential I/O.
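
To make this more concrete, here is a small illustrative Go sketch (not the actual Debian Code Search implementation; all names are invented for this example) of how a positional trigram index can confirm an identifier match without reading any documents: every trigram of the identifier has to occur at the expected offset relative to the identifier's first trigram.

// Illustrative sketch, not Debian Code Search code: a toy positional
// trigram index and an identifier query answered from the index alone.
package main

import "fmt"

// Posting records one occurrence of a trigram: the document it occurs in
// and the byte offset within that document.
type Posting struct {
    DocID uint32
    Pos   uint32
}

// Index is a toy positional inverted index: trigram -> postings.
type Index map[string][]Posting

// has reports whether trigram t occurs in document doc at position pos.
func (ix Index) has(t string, doc, pos uint32) bool {
    for _, p := range ix[t] {
        if p.DocID == doc && p.Pos == pos {
            return true
        }
    }
    return false
}

// matchIdentifier returns the documents containing the identifier, using only
// the index: for every occurrence of the first trigram, all remaining
// trigrams must occur at consecutive offsets. (Duplicates are possible if the
// identifier occurs more than once; a real implementation would deduplicate.)
func matchIdentifier(ix Index, ident string) []uint32 {
    if len(ident) < 3 {
        return nil
    }
    var docs []uint32
candidates:
    for _, p := range ix[ident[:3]] {
        for off := 1; off+3 <= len(ident); off++ {
            if !ix.has(ident[off:off+3], p.DocID, p.Pos+uint32(off)) {
                continue candidates
            }
        }
        docs = append(docs, p.DocID)
    }
    return docs
}

func main() {
    // Document 7 contains "error" starting at offset 10, i.e. the trigrams
    // "err", "rro" and "ror" at positions 10, 11 and 12.
    ix := Index{
        "err": {{DocID: 7, Pos: 10}},
        "rro": {{DocID: 7, Pos: 11}},
        "ror": {{DocID: 7, Pos: 12}},
    }
    fmt.Println(matchIdentifier(ix, "error")) // [7]
}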

Of course, storing positions comes with a downside: the index is larger, and a larger index takes more time to decode when querying.

Hence, I have been looking at various posting list compression/decoding techniques, to figure out whether we could switch to a technique which would retain (or improve upon!) current performance despite much longer posting lists and produce a small enough index to fit on our current hardware.

Literature

I started looking into this space because of Daniel Lemire’s Stream VByte post. As usual, Daniel’s work is well presented, easily digestible and accompanied by not just one, but multiple implementations.

I also looked for scientific papers to learn about the state of the art and classes of different approaches in general. The best I could find is Compression, SIMD, and Postings Lists. If you don’t have access to the paper, I hear that Sci-Hub is helpful.

The paper is from 2014, and doesn’t include all algorithms. If you know of a better paper, please let me know and I’ll include it here.

Eventually, I stumbled upon an algorithm/implementation called TurboPFor, which the rest of the article tries to shine some light on.

TurboPFor

If you’re wondering: PFor stands for Patched Frame Of Reference and describes a family of algorithms. The principle is explained e.g. in SIMD Compression and the Intersection of Sorted Integers (PDF).

The TurboPFor project’s README file claims that TurboPFor256 compresses with a rate of 5.04 bits per integer, and can decode with 9400 MB/s on a single thread of an Intel i7-6700 CPU.

For Debian Code Search, we use unsigned integers of 32 bit (uint32), which TurboPFor will compress into as few bits as required.

Dividing Debian Code Search’s file sizes by the total number of integers, I get similar values, at least for the docid index section:

  • 5.49 bits per integer for the docid index section
  • 11.09 bits per integer for the positions index section

I can confirm the order of magnitude of the decoding speed, too. My benchmark calls TurboPFor from Go via cgo, which introduces some overhead. To exclude disk speed as a factor, data comes from the page cache. The benchmark sequentially decodes all posting lists in the specified index, using as many threads as the machine has cores¹:

  • ≈1400 MB/s on a 1.1 GiB docid index section
  • ≈4126 MB/s on a 15.0 GiB position index section

I think the numbers differ because the position index section contains larger integers (requiring more bits). I repeated both benchmarks, capped to 1 GiB, and decoding speeds still differed, so it is not just the size of the index.

Compared to Streaming VByte, a TurboPFor256 index comes in at just over half the size, while still reaching 83% of Streaming VByte’s decoding speed. This seems like a good trade-off for my use-case, so I decided to have a closer look at how TurboPFor works.

¹ See cmd/gp4-verify/verify.go, run on an Intel i9-9900K.

Methodology

To confirm my understanding of the details of the format, I implemented a pure-Go TurboPFor256 decoder. Note that it is intentionally not optimized as its main goal is to use simple code to teach the TurboPFor256 on-disk format.

If you’re looking to use TurboPFor from Go, I recommend using cgo. cgo’s function call overhead is about 51ns as of Go 1.8, which will easily be offset by TurboPFor’s carefully optimized, vectorized (SSE/AVX) code.

With that caveat out of the way, you can find my teaching implementation at https://github.com/stapelberg/goturbopfor

I verified that it produces the same results as TurboPFor’s p4ndec256v32 function for all posting lists in the Debian Code Search index.

On-disk format

Note that TurboPFor does not fully define an on-disk format on its own. When encoding, it turns a list of integers into a byte stream:

size_t p4nenc256v32(uint32_t *in, size_t n, unsigned char *out);

When decoding, it decodes the byte stream into an array of integers, but needs to know the number of integers in advance:

size_t p4ndec256v32(unsigned char *in, size_t n, uint32_t *out);

Hence, you’ll need to keep track of the number of integers and length of the generated byte streams separately. When I talk about on-disk format, I’m referring to the byte stream which TurboPFor returns.

The TurboPFor256 format uses blocks of 256 integers each, followed by a trailing block — if required — which can contain fewer than 256 integers.

SIMD bitpacking is used for all blocks but the trailing block (which uses regular bitpacking). This is not merely an implementation detail for decoding: the on-disk structure is different for blocks which can be SIMD-decoded.

Each block starts with a 2-bit header specifying the type of the block.

Each block type is explained in more detail in the following sections.

Note that none of the block types store the number of elements: you will always need to know how many integers you need to decode. Also, you need to know in advance how many bytes you need to feed to TurboPFor, so you will need some sort of container format.

Further, TurboPFor automatically chooses the best block type for each block.
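
As an illustration of what such a container format could look like (my own sketch, not something TurboPFor defines), the following Go snippet simply stores the integer count and the byte length in front of each compressed byte stream:

// Minimal container sketch (not part of TurboPFor): each compressed posting
// list is prefixed with the number of integers it decodes to and the length
// of the compressed byte stream.
package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
)

// Entry is one compressed posting list plus the metadata needed for decoding.
type Entry struct {
    NumInts uint32 // how many integers to decode (the n passed to the decoder)
    Data    []byte // the raw byte stream produced by the encoder
}

// Marshal writes count, length and payload one after the other.
func Marshal(e Entry) []byte {
    var buf bytes.Buffer
    binary.Write(&buf, binary.LittleEndian, e.NumInts)
    binary.Write(&buf, binary.LittleEndian, uint32(len(e.Data)))
    buf.Write(e.Data)
    return buf.Bytes()
}

// Unmarshal reads one entry back and returns the remaining bytes.
func Unmarshal(b []byte) (Entry, []byte) {
    numInts := binary.LittleEndian.Uint32(b[0:4])
    length := binary.LittleEndian.Uint32(b[4:8])
    return Entry{NumInts: numInts, Data: b[8 : 8+length]}, b[8+length:]
}

func main() {
    packed := Marshal(Entry{NumInts: 3, Data: []byte{0xAB, 0xCD}}) // dummy payload
    e, rest := Unmarshal(packed)
    fmt.Println(e.NumInts, len(e.Data), len(rest)) // 3 2 0
}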

Constant block

A constant block (all integers of the block have the same value) consists of a single value of a specified bit width ≤ 32. This value will be stored in each output element for the block. E.g., after calling decode(input, 3, output) with input being the constant block depicted below, output is {0xB8912636, 0xB8912636, 0xB8912636}.

The example shows the maximum number of bytes (5). Smaller integers will use fewer bytes: e.g. an integer which can be represented in 3 bits will only use 2 bytes.

Bitpacking block

A bitpacking block specifies a bit width ≤ 32, followed by a stream of bits. Each value starts at the Least Significant Bit (LSB), i.e. the 3-bit values 0 (000b) and 5 (101b) are encoded as 101000b.
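
The following Go snippet is a rough sketch of LSB-first bit unpacking and reproduces the example just given; it illustrates the idea only and has nothing of the speed of TurboPFor’s unrolled decoders:

// Sketch of LSB-first bit unpacking: each value occupies the next width
// bits of the stream, starting at the least significant unconsumed bit.
package main

import "fmt"

// unpack reads n values of the given bit width from in.
func unpack(in []byte, n, width int) []uint32 {
    out := make([]uint32, n)
    bitpos := 0
    for i := 0; i < n; i++ {
        var v uint32
        for b := 0; b < width; b++ {
            byteIdx, bitIdx := bitpos/8, bitpos%8
            if in[byteIdx]&(1<<uint(bitIdx)) != 0 {
                v |= 1 << uint(b)
            }
            bitpos++
        }
        out[i] = v
    }
    return out
}

func main() {
    // 0x28 = 101000b: the 3-bit values 0 and 5 from the example above
    // (first value in bits 0-2, second value in bits 3-5).
    fmt.Println(unpack([]byte{0x28}, 2, 3)) // [0 5]
}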

Bitpacking with exceptions (bitmap) block

The constant and bitpacking block types work well for integers which don’t exceed a certain width, e.g. for a series of integers of width ≤ 5 bits.

For a series of integers where only a few values exceed an otherwise common width (say, two values require 7 bits, the rest requires 5 bits), it makes sense to cut the integers into two parts: value and exception.

In the example below, decoding the third integer out2 (000b) requires combination with exception ex0 (10110b), resulting in 10110000b.

The number of exceptions can be determined by summing the 1 bits in the bitmap using the popcount instruction.
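
The following Go sketch illustrates the patch-up step for the bitmap variant. The actual TurboPFor block layout (header, where the bitmap and the exceptions are stored) is omitted, and a uint64 stands in for the 256-bit bitmap of a full block:

// Sketch of combining base values with exceptions: the k-th set bit in the
// bitmap consumes the k-th exception, whose bits are placed above the base
// bits. Block layout details are omitted; a uint64 bitmap is enough for this
// illustration (a full 256-integer block needs 256 bitmap bits).
package main

import (
    "fmt"
    "math/bits"
)

// applyExceptions patches the base values in place. baseWidth is the common
// bit width of the base values; bitmap has one bit per element of base.
func applyExceptions(base []uint32, baseWidth uint, bitmap uint64, exceptions []uint32) {
    if bits.OnesCount64(bitmap) != len(exceptions) {
        panic("popcount of the bitmap must equal the number of exceptions")
    }
    j := 0 // index of the next unused exception
    for i := range base {
        if bitmap&(1<<uint(i)) != 0 {
            base[i] |= exceptions[j] << baseWidth
            j++
        }
    }
}

func main() {
    // Reproduces the example from the text: the third value (index 2) is
    // 000b and its exception is 10110b, giving 10110000b = 0xb0.
    base := []uint32{0x5, 0x3, 0x0, 0x7}
    applyExceptions(base, 3, 1<<2, []uint32{0x16}) // 0x16 = 10110b
    fmt.Printf("%#x\n", base[2])                   // 0xb0
}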

Bitpacking with exceptions (variable byte)

When the exceptions are not uniform enough, it makes sense to switch from bitpacking to a variable byte encoding.

Decoding: variable byte

The variable byte encoding used by the TurboPFor format is similar to the one used by SQLite, which is described, alongside other common variable byte encodings, at github.com/stoklund/varint.

Instead of using individual bits for dispatching, this format classifies the first byte (b[0]) into ranges:

  • [0—176]: the value is b[0]
  • [177—240]: a 14 bit value is in b[0] (6 high bits) and b[1] (8 low bits)
  • [241—248]: a 19 bit value is in b[0] (3 high bits), b[1] and b[2] (16 low bits)
  • [249—255]: a 32 bit value is in b[1], b[2], b[3] and possibly b[4]

Here is the space usage of different values:

  • [0—176] are stored in 1 byte (as-is)
  • [177—16560] are stored in 2 bytes, with the highest 6 bits added to 177
  • [16561—540848] are stored in 3 bytes, with the highest 3 bits added to 241
  • [540849—16777215] are stored in 4 bytes, with 0 added to 249
  • [16777216—4294967295] are stored in 5 bytes, with 1 added to 249

An overflow marker will be used to signal that encoding the values would be less space-efficient than simply copying them (e.g. if all values require 5 bytes).

This format is very space-efficient: it packs the values 0–176 into a single byte, as opposed to 0–127 for most other variable byte encodings. At the same time, it can be decoded very quickly, as only the first byte needs to be examined to decide how a value is encoded (similar to PrefixVarint).
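
For illustration, here is a Go sketch of a decoder for these ranges. The 1-, 2- and 3-byte cases follow directly from the description above, whereas the ordering of the trailing low bytes in the 3-, 4- and 5-byte cases is an assumption of this sketch rather than something the description pins down:

// Sketch of a decoder for the variable byte ranges described above.
// The byte order of the multi-byte tails is an assumption of this sketch.
package main

import "fmt"

// decodeVarByte decodes one value from b and returns it together with the
// number of bytes consumed.
func decodeVarByte(b []byte) (uint32, int) {
    b0 := uint32(b[0])
    switch {
    case b0 <= 176:
        // [0-176]: the value is b[0] itself.
        return b0, 1
    case b0 <= 240:
        // [177-16560]: 6 high bits in b[0]-177, 8 low bits in b[1],
        // plus the offset 177.
        return ((b0-177)<<8 | uint32(b[1])) + 177, 2
    case b0 <= 248:
        // [16561-540848]: 3 high bits in b[0]-241, 16 low bits in b[1]
        // and b[2], plus the offset 16561.
        return ((b0-241)<<16 | uint32(b[1])<<8 | uint32(b[2])) + 16561, 3
    case b0 == 249:
        // [540849-16777215]: 24-bit value in the next 3 bytes
        // (little-endian assumed here).
        return uint32(b[1]) | uint32(b[2])<<8 | uint32(b[3])<<16, 4
    default:
        // [16777216-4294967295]: 32-bit value in the next 4 bytes
        // (little-endian assumed here).
        return uint32(b[1]) | uint32(b[2])<<8 | uint32(b[3])<<16 | uint32(b[4])<<24, 5
    }
}

func main() {
    fmt.Println(decodeVarByte([]byte{42}))            // 42 1
    fmt.Println(decodeVarByte([]byte{177, 0}))        // 177 2
    fmt.Println(decodeVarByte([]byte{240, 255}))      // 16560 2
    fmt.Println(decodeVarByte([]byte{241, 0, 0}))     // 16561 3
    fmt.Println(decodeVarByte([]byte{248, 255, 255})) // 540848 3
}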

Decoding: bitpacking

Regular bitpacking

In regular (non-SIMD) bitpacking, integers are stored on disk one after the other, padded to a full byte, as a byte is the smallest addressable unit when reading data from disk. For example, if you bitpack only one 3 bit int, you will end up with 5 bits of padding.

SIMD bitpacking (256v32)

SIMD bitpacking works like regular bitpacking, but processes 8 uint32 little-endian values at the same time, leveraging the AVX instruction set. As a consequence, the order in which the packed integers are laid out on disk differs from regular bitpacking, which is why the two cannot share an on-disk structure.

In Practice

For a Debian Code Search index, 85% of posting lists are short enough to only consist of a trailing block, i.e. no SIMD instructions can be used for decoding.

The distribution of block types looks as follows:

  • 72% bitpacking with exceptions (bitmap)
  • 19% bitpacking with exceptions (variable byte)
  • 5% constant
  • 4% bitpacking

Constant blocks are mostly used for posting lists with just one entry.

Conclusion

The TurboPFor on-disk format is very flexible: with its 4 different kinds of blocks, chances are high that a very efficient encoding will be used for most integer series.

Of course, the flip side of covering so many cases is complexity: the format and implementation take quite a bit of time to understand — hopefully this article helps a little! For environments where the C TurboPFor implementation cannot be used, smaller algorithms might be simpler to implement.

That said, if you can use the TurboPFor implementation, you will benefit from a highly optimized SIMD code base, which will most likely be an improvement over what you’re currently using.

at 2019-02-05 08:00

2019-01-26

RaumZeitLabor

All we hear is … Radio Regenbogen

Attention! Attention! This is the Käfertal transmitting station!

Listen up, dear friends of heavenly radio broadcasts!

Devotees of analogue amplitude modulation, pay attention!

Have you, too, been asking yourselves since 1980 whether video really killed the radio star?

Whether you can also listen to broadcast radio on square radios?

Then the next RZL excursion, to Radio Regenbogen, is just the thing for you!

On Thursday, 28 February 2019, starting at 4 pm, we will take a tour of the studio in Mannheim. We will have the opportunity to see the music editorial office, the broadcast studios and the sound engineering, and to ask all the questions that are burning on our minds.

Since this excursion is limited to 15 places, please register by email by 14 February 2019. As always, (not-yet-)members are also welcome to take part.

Greetings from

the Flederradio

P.S.: And please don’t forget to ground the antenna!

by flederrattie at 2019-01-26 00:00