Planet NoName e.V.

2021-03-06

sECuREs website

Debian Code Search: OpenAPI now available

Debian Code Search now offers an OpenAPI-based API!

Various developers have created ad-hoc client libraries based on how the web interface works.

The goal of offering an OpenAPI-based API is to provide developers with automatically generated client libraries for a large number of programming languages that target a stable interface independent of the web interface’s implementation details.

Getting started

  1. Visit https://codesearch.debian.net/apikeys/ to download your personal API key. Log in via Debian’s GitLab instance salsa.debian.org; register there if you don’t have an account yet.

  2. Find the Debian Code Search client library for your programming language. If none exists yet, auto-generate a client library on editor.swagger.io: click “Generate Client”.

  3. Search all code in Debian from your own analysis tool, migration tracking dashboard, etc.

curl example

curl \
  -H "x-dcs-apikey: $(cat dcs-apikey-stapelberg.txt)" \
  -X GET \
  "https://codesearch.debian.net/api/v1/search?query=i3Font&match_mode=regexp" 

Web browser example

You can try out the API in your web browser in the OpenAPI documentation.

Code example (Go)

Here’s an example program that demonstrates how to set up an auto-generated Go client for the Debian Code Search OpenAPI, run a query, and aggregate the results:

func burndown() error {
	cfg := openapiclient.NewConfiguration()
	cfg.AddDefaultHeader("x-dcs-apikey", apiKey)
	client := openapiclient.NewAPIClient(cfg)
	ctx := context.Background()

	// Search through the full Debian Code Search corpus, blocking until all
	// results are available:
	results, _, err := client.SearchApi.Search(ctx, "fmt.Sprint(err)", &openapiclient.SearchApiSearchOpts{
		// Literal searches are faster and do not require escaping special
		// characters, regular expression searches are more powerful.
		MatchMode: optional.NewString("literal"),
	})
	if err != nil {
		return err
	}

	// Print to stdout a CSV file with the path and number of occurrences:
	wr := csv.NewWriter(os.Stdout)
	header := []string{"path", "number of occurrences"}
	if err := wr.Write(header); err != nil {
		return err
	}
	occurrences := make(map[string]int)
	for _, result := range results {
		occurrences[result.Path]++
	}
	for _, result := range results {
		o, ok := occurrences[result.Path]
		if !ok {
			continue
		}
		// Print one CSV record per path:
		delete(occurrences, result.Path)
		record := []string{result.Path, strconv.Itoa(o)}
		if err := wr.Write(record); err != nil {
			return err
		}
	}
	wr.Flush()
	return wr.Error()
}

The full example can be found under burndown.go.

Feedback?

File a GitHub issue on github.com/Debian/dcs please!

Migration status

I’m aware of the following third-party projects using Debian Code Search:

Tool Migration status
Debian Code Search CLI tool Updated to OpenAPI
identify-incomplete-xs-go-import-path Update pending
gnome-codesearch makes no API queries

If you find any others, please point them to this post in case they are not using Debian Code Search’s OpenAPI yet.

at 2021-03-06 10:15

2021-03-04

michael-herbst.com

PostDoc position at Appl. & Comput. Mathematics lab, RWTH Aachen University

This week I started my new position as a postdoctoral researcher at the Applied and Computational Mathematics (ACoM) research lab at RWTH Aachen University. The lab consists of two interdisciplinary research groups, namely the group of Prof. Dr. Manuel Torrilhon, who works on the mathematical modelling and simulation of technical processes (e.g. plasma or gas flow processes), as well as the group of Prof. Dr. Benjamin Stamm, which I am now joining. Ben's research focus is the numerical analysis of PDEs and linear algebra problems, which arise e.g. in electrostatics or quantum chemistry. This includes principal questions related to eigenvalue problems, but also concrete applications such as improving the performance of the polarisable continuum model, a standard solvation model in electronic structure theory. I see a good fit between our respective research backgrounds and I am happy for this opportunity to extend my research horizon and contribute to the research in Ben's group and the ACoM over the next years.

During my time in Aachen Ben and I want to continue to work on the numerical analysis and the development of mathematically-motivated methods for density-functional theory (DFT). One aspect we have in mind, for example, is to port Ben's recent work on constructing good initial guesses for the self-consistent iterations in molecular DFT (and Gaussian basis functions) to plane-wave DFT. As part of this research we will make use of and extend the density-functional toolkit (DFTK), the density-functional theory code I started in Paris. Beyond our work at the ACoM I expect DFTK and its suitability for multidisciplinary research to also be helpful for reaching out to other researchers in the mathematics, computer science and physics departments in Aachen. In particular I see a good fit of DFTK within the JARA-CSD, a joint research initiative between RWTH Aachen and the Jülich research centre.

Having known Aachen already a little from my previous visits to Ben's group, I am very much looking forward to working here. Not only is the city very pretty and welcoming, but the interdisciplinary orientation of RWTH Aachen also resonates well with me. I'm looking forward to the many interesting discussions to come and to helping strengthen the interdisciplinary links in Aachen, while at the same time continuing to work at the boundary of chemistry, physics and mathematics.

by Michael F. Herbst at 2021-03-04 23:00 under electronic structure theory, DFT, solid state

2021-03-02

Insanity Industries

Pareto-optimal compression

Data compression is an incredibly useful tool in many situations, be it for backups, archiving, or even filesystems. But of the many compressors that there are, which of them is the one to be preferred? To properly answer this question, we first have to answer a different, but related one: What exactly makes a compression algorithm good?

Time for a closer investigation to find the best of the best of the best!

Pareto optimality or what makes a compression algorithm good

Pareto optimality is a situation where no […] preference criterion can be better off without making at least one […] preference criterion worse off […].

The concept of Pareto optimality is tremendously useful, far beyond the scope of this blogpost, because it acknowledges competing interests.

In our particular case of optimal compression, one of the obvious criteria is the resulting file size. The smaller the resulting file is, the better the compressor, right?

But what if we are just compressing so we can speed up an upload to a remote location? At this point, it doesn’t matter if the compression is supremely superior to everything else if we have to wait twice as long for it to happen compared to simply uploading the uncompressed file.

So for our algorithms, we have at least two criteria that come into play: the actual achievable compression and the compression cost, which we will measure as “how long does it take to compress our content?”1. These two are the criteria this blogpost focuses on.

A practical example

Practically speaking, let’s assume two compression tools A and B, having both two compression levels 1 and 2, which we use on a sample file. The results might look like the following:

algorithm level file size time
A 1 60% 2s
A 2 50% 4s
B 1 40% 3s
B 2 30% 11s

We find that while both algorithms compress more effectively at their higher level, B.1 is unconditionally better than A.2: it produces a smaller file in less time, so there is no reason to ever use A.2 if B.1 is available. Besides that, A.1, B.1 and B.2 continuously increase in compression effectiveness as well as in the time taken for it. For easier digestion, we can visualize these results:

Compression results for different hypothetical compression algorithms, including the Pareto frontier indicated in blue.

Here we clearly see what could already be taken from the table above, but in a significantly more intuitive way. Remembering our definition of Pareto optimality from before, we see that A.2 is not Pareto optimal, as B.1 is better both in time taken and in compression effect. This shows more intuitively that in this scenario there is no reason to use A.2 if B.1 is available. For A.1, B.1 and B.2 the choice is not so clear-cut, as the resulting file size can only be reduced further by investing more time into the compression. Hence, all three of them are Pareto optimal and constitute the Pareto frontier or the Pareto set.
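
For readers who prefer code over prose, here is a minimal Go sketch (not part of the tooling used for the measurements below; names and values are purely illustrative) that computes the Pareto frontier for the hypothetical measurements from the table above:

package main

import (
	"fmt"
	"math"
	"sort"
)

type measurement struct {
	name string
	size float64 // resulting file size in percent of the original
	time float64 // compression time in seconds
}

// paretoFrontier keeps every measurement that no other measurement beats
// in both compression time and resulting size.
func paretoFrontier(ms []measurement) []measurement {
	sorted := append([]measurement(nil), ms...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].time < sorted[j].time })
	var frontier []measurement
	best := math.Inf(1) // smallest size seen among faster measurements
	for _, m := range sorted {
		if m.size < best {
			frontier = append(frontier, m)
			best = m.size
		}
	}
	return frontier
}

func main() {
	ms := []measurement{
		{"A.1", 60, 2}, {"A.2", 50, 4}, {"B.1", 40, 3}, {"B.2", 30, 11},
	}
	for _, m := range paretoFrontier(ms) {
		fmt.Printf("%s: %.0f%% in %.0fs\n", m.name, m.size, m.time)
	}
	// Prints A.1, B.1 and B.2; A.2 is dominated by B.1 and therefore dropped.
}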

One wants to always strive for choosing a Pareto optimal solution whenever possible, as non-Pareto-optimal solutions are always to some degree wasteful.

With this insight into Pareto optimality, we can now put our knowledge into practice and begin our journey to the Pareto frontiers of compression algorithms.

Setup and test procedure for real-world measurements

Data was gathered on a Linux system for a given sample file, recording the resulting file size compared to the original as well as the time the compression process took. First, the sample file was read from disk entirely, ensuring it would be present in Linux' filesystem cache; it was then compressed with each compression program at each level 15 times for a decent statistic, with the compressed result routed directly to /dev/null. All compression times presented are the median of these 15 runs; the compressed size was only measured once, after verifying that it was deterministic2.

Unless explicitly denoted otherwise, all compressors were run in their default configuration with a single compression thread. Furthermore, all applications on the machine used for benchmarking except the terminal holding the benchmarking process were closed to reduce interference by other processes as much as possible.

The tests for decompression were done analogously (with the obvious exception that the sample file is compressed according to the testcase in play beforehand), single-threaded decompression as well as routing the decompressed result to /dev/null.

All tests were run on several different CPUs: Intel i7-8550U, Intel i5-3320M3 (both mobile CPUs), Intel i7-7700K and AMD Ryzen 7 3800X (both workstation CPUs). While the absolute compression times changed, the Pareto frontiers did not, hence in this blogpost only the plots for the i7-8550U are shown exemplarily.
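
To make the procedure more tangible, here is a simplified Go sketch of how a single data point could be collected (this is an illustration only, not the actual benchmark suite; the compressor name and flags are placeholders, and the imports needed are os, os/exec, sort and time):

// timeCompression runs the given compressor command 15 times on the sample
// file, routes the compressed output to /dev/null and returns the median
// wall-clock duration of those runs.
func timeCompression(sample string) (time.Duration, error) {
	const runs = 15
	durations := make([]time.Duration, 0, runs)
	for i := 0; i < runs; i++ {
		cmd := exec.Command("zstd", "-3", "-c", sample) // placeholder compressor and level
		devnull, err := os.OpenFile(os.DevNull, os.O_WRONLY, 0)
		if err != nil {
			return 0, err
		}
		cmd.Stdout = devnull
		start := time.Now()
		err = cmd.Run()
		devnull.Close()
		if err != nil {
			return 0, err
		}
		durations = append(durations, time.Since(start))
	}
	sort.Slice(durations, func(i, j int) bool { return durations[i] < durations[j] })
	return durations[runs/2], nil
}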

Case study: initramfs-compression

Just recently, the current maintainer of mkinitcpio, Giancarlo Razzolini, announced the transition from gzip to zstd as the default for initramfs-compression. The initramfs is a little mini-system, whose only objective is to prepare the hardware and assemble all disks so that the main system is readily prepared to take over operation. The initramfs needs to be loaded (and hence decompressed) at every boot as well as recreated occasionally, when components contained in it are updated.

mkinitcpio supports a variety of compression algorithms: none, gzip, bzip2, lzma, lzop, lz4 and most recently zstd.

Quantifying these algorithms and plotting the Pareto frontier for compression yields:

Compression results for a standard mkinitcpio-generated initramfs and the corresponding Pareto frontier. The difficult-to-decipher black culmination at about 34% resulting file size are the bzip2 results.

We find that the change of defaults from gzip to zstd was well justified, as gzip can no longer be considered Pareto optimal for this type of file. Choosing lzma as the default would make the initramfs even smaller, but this would come at the cost of noticeably higher resource usage for compression (which has to be invested on every update affecting the initramfs), so from the data zstd is certainly the wiser choice4.

This can also be seen when we take a look at the Pareto optimality of decompression (after all, this decompression needs to happen on every single system boot):

Decompression results for a standard mkinitcpio-generated initramfs and the corresponding Pareto frontier.

We clearly see that zstd is blasting all other algorithms out of the water when it comes to decompression speed. Given these numbers, it is even more of a good choice for an initramfs: not only does it compress fast, it also decompresses impressively fast, six times faster than lzma, which was previously known for its quick decompression speed despite high compression factors.

Given the data for zstd, it cannot be ruled out completely that zstd simply hit a non-CPU bottleneck on decompression, but even if it did, the conclusion for the choice of algorithm does not change.

dracut

If you happen to be on a Linux distribution that uses dracut for initramfs generation, the conclusions that can be drawn for dracut initramfs compression are almost identical: given the compression and decompression data for a dracut initrd, the Pareto frontier remains mostly unchanged, with just some more levels of zstd in it.

Real world usage recommendation

To use zstd in mkinitcpio, simply use version 30 and above. You can modify the compression level (default is -3) by adding a specific level to COMPRESSION_OPTIONS, but given the data, this doesn’t seem to provide much of a benefit.
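
For reference, the relevant lines in /etc/mkinitcpio.conf would then look roughly like this (the explicit level is optional and, as discussed, rarely worth it):

COMPRESSION="zstd"
#COMPRESSION_OPTIONS=(-10)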

For dracut, add compress="zstd" to /etc/dracut.conf to get zstd compression at the default level.

Case study: borgbackup

In the next scenario we will investigate the impact of compression when backing up data with borgbackup. Borg comes with an integrated option to compress the backed-up data, with the available algorithms being lz4, zstd, zlib/gzip and lzma. In addition to this, borg has an automated detection routine to check whether the files being backed up actually compress well enough to be worth spending CPU cycles on (see borg help compression for details).

For this scenario we will not consider just one sample, but three different ones: a simple text file, being a dump of the kernel log via the dmesg command, representing textual data; a binary, in this particular case the dockerd ELF binary, (rather arbitrarily) representing binary data; and finally an mkinitcpio image, intended as somewhat of a mixed-data sample. We do not consider media files, as these are typically already stored in compressed formats, hence unlikely to compress further, and are thus dealt with by borg's compressibility heuristics.

The resulting Pareto frontiers for compression are, starting with least effective compression:

text binary initramfs
lz4.2 lz4.1 lz4.1
zstd.1 zstd.1 zstd.1
zstd.2 zstd.2 zstd.2
zstd.4 zstd.3 zstd.3
zstd.3 zstd.4 zstd.4
zstd.5 zstd.5 zstd.5
zstd.6 zstd.6 zstd.6
zstd.7 zstd.7 zstd.7
zstd.8 zstd.8 zstd.8
zstd.9 zstd.9 zstd.9
zstd.10 lzma.0 lzma.0
zstd.11 lzma.1 lzma.1
zstd.12 lzma.2 lzma.2
lzma.2 lzma.3 lzma.3
lzma.3 lzma.4 lzma.4
lzma.6 lzma.5 lzma.5
zstd.19 lzma.6 lzma.6
. lzma.7 lzma.7
. lzma.8 lzma.8
. lzma.9 lzma.9

We see that effectively, except for a brief occurrence of lz4 at the top, the relevant choices are lzma and zstd. More details can be seen in the plots linked in the column headers. Hence, as backups should be run often (with a tool like borg there is little reason for anything but daily backups), zstd with a level slightly below 10 seems to be a good compromise between speed and resulting data size.

Real world usage recommendation

Adding --compression=auto,zstd,7 to the borg command used to create a backup will use zstd at level 7 if borg's internal heuristics consider the file in question to compress well; otherwise no compression will be used.

This flag can be added on the fly, without affecting existing repositories or borg's deduplication. Already backed-up data is not recompressed, meaning that adding this flag for use with an existing repository does not require a reupload of everything. Consequently, it also means that to recompress the entire repository with zstd one effectively has to start from scratch.
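
As a concrete example (repository path and source directory are placeholders), a backup run could then be invoked as borg create --compression=auto,zstd,7 /path/to/repo::myarchive ~/data.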

Case study: Archiving with tar

Compression can not only be used for backup-tools like borg, it can also be used to archive files with tar. Some compressors have explicit flags in tar, such as gzip (-z), lzma (-J) or bzip2 (-j), but any compression algorithm can be used via tar’s -I flag.

Working with tar poses two challenges with regard to the compressor:

  • streaming: tar concatenates data and streams the result through the compressor. Hence, to extract files at a certain position of a tarball, the entirety of the data before that file needs to be decompressed as well.
  • compress-only: as a further consequence of this, tar lacks a feature for testing whether something compresses well, so incompressible data will also be sent through the compressor

If we want to pick a good compressor without knowing the input data in advance, we should aim for a Pareto optimal compressor that additionally takes two properties into consideration:

  • fast decompression, to be able to easily and quickly extract data from the archive again if need be
  • good performance on non-compressible data (in particular, incompressible data should not increase in size).

Compression capabilities

To investigate the compression and decompression capabilities of certain compressors, we can reuse the dataset used in the borg case study and add some incompressible data in the form of a flac music file to the mix. As tar has a larger variety of usable algorithms, we include lz4, gzip, lzma, zstd, lzop and brotli as well.

As an example, we first look at their effectiveness on the dockerd ELF binary (other datasets can be found below):

Compression results and the corresponding Pareto frontier for the dockerd elf binary. The unreadable black cluster at 25% compressed size is again bzip2, the one at 34% is predominantly lz4.

Decompression results and the corresponding Pareto frontier for the dockerd elf binary. The clusters are lzma for 20% resulting size, zstd at 25%, brotli at 24% and lz4 at 33%.

In summary, the Pareto frontiers for the different types of data overall turn out to be (with c for compression and d for decompression):

text (c) text (d) binary (c) binary (d) initramfs (c) initramfs (d) flac (c) flac (d)
lz4.2 lz4.9 lz4.1 lz4.8 lz4.1 lz4.7 lz4.2 zstd.1
zstd.1 zstd.13 lzop.1 lz4.9 lzop.6 lz4.9 brotli.1 zstd.5
zstd.2 zstd.11 lzop.3 zstd.15 zstd.1 zstd.1 brotli.2 zstd.7
zstd.4 zstd.12 zstd.1 zstd.16 zstd.2 zstd.9 brotli.3 zstd.19
zstd.3 zstd.14 zstd.2 zstd.17 zstd.3 zstd.14 zstd.18 .
zstd.5 zstd.15 zstd.3 zstd.19 zstd.4 zstd.15 zstd.19 .
zstd.6 zstd.19 zstd.4 lzma.8 zstd.5 zstd.16 . .
zstd.7 . zstd.5 lzma.9 zstd.6 zstd.17 . .
zstd.8 . zstd.6 . zstd.7 zstd.18 . .
zstd.9 . zstd.7 . zstd.8 zstd.19 . .
zstd.10 . zstd.8 . zstd.9 lzma.8 . .
brotli.5 . zstd.9 . brotli.5 lzma.9 . .
brotli.6 . brotli.5 . lzma.0 . . .
brotli.7 . lzma.1 . lzma.1 . . .
lzma.3 . lzma.2 . lzma.2 . . .
lzma.6 . lzma.3 . lzma.3 . . .
zstd.19 . lzma.4 . lzma.4 . . .
. . lzma.5 . lzma.5 . . .
. . lzma.6 . lzma.6 . . .
. . lzma.7 . lzma.7 . . .
. . lzma.8 . lzma.8 . . .
. . lzma.9 . lzma.9 . . .

As can be seen in the linked plots, we again find that while lzma still achieves the highest absolute compression, zstd dominates the sweet spot right before the computational cost skyrockets. We also find brotli to be an interesting contender here, making it into the Pareto frontier as well. However, since brotli only sometimes makes it into the Pareto frontier, whereas lzma and zstd robustly defend their inclusion, it seems more advisable to resort to either lzma or zstd, as this test only covers a single sample binary and actual data might vary. Furthermore, when it comes to decompression, brotli is no longer Pareto optimal at all, also indicating lzma and zstd as the better choices.

Impact on incompressible files in detail

We will take another, closer look at the incompressible case, represented by a flac file, and strip away bzip2 and lzma, as we can tell from the linked plots that these two clearly increase the size of the result and are hence not Pareto optimal (they are already beaten by the Pareto optimal case “no compression”).

The results have a clear indication:

Compression results on incompressible data.

Decompression results on incompressible data. The unreadable black cluster at 99.975% size is zstd, the one at 99.988% contains lzop and brotli.

The recommended choice of algorithm for compression is either brotli or zstd, but when it comes to decompression, zstd takes the lead again. This is of course a corner case, and the details might change with the particular choice of incompressible data. However, I do not expect the overall impression to change significantly.

Real world usage recommendation

Concluding this section, the real-world recommendation seems to be to simply use zstd for any tarball compression if it is available. To do so, tar "-Izstd -10 -T0" can be a good choice, with -T0 telling zstd to parallelize the compression onto all available CPU cores, speeding things up even further beyond our measurements. Depending on your particular usecase it might be interesting to use an alias like

alias archive='tar "-Izstd -19 -T0" -cf'

which allows quickly tarring data into a compressed archive via archive myarchive.tar.zst file1 file2 file3 ….

Case study: filesystems

Another usecase for compression is filesystem compression. Conceptually similar to what borg does, files are transparently compressed and decompressed when written to or read from disk.

Among the filesystems capable of such inline compression are ZFS and btrfs. btrfs supports ZLIB (gzip), LZO (lzop) and zstd, whereas ZFS supports lz4, LZJB (which was not included in these benchmarks as no appropriate standalone compressor was found), gzip and ZLE (zero-length encoding, only compressing zeroes, hence also not tested). zstd support for OpenZFS has been merged, but apparently hasn’t made it into any stable version yet at the time of writing, according to the OpenZFS documentation.

This case is situated similarly to the tar case study, and as all compressors available for ZFS and btrfs have already been covered in the section above, there is no reason to reiterate those results here.

It should be noted, however, that at least for btrfs the standard mount option for filesystem compression applies a heuristic similar to borg's, so the case of incompressible data might not be very relevant for a btrfs installation. That being said, the conclusion here is a recommendation of zstd, and as we have seen in the last section, the question of incompressible files doesn’t change the overall recommendation.

Real world usage recommendation

If you want to save diskspace by using compression, the mount option compress in combination with zstd is generally a good choice for btrfs. This also includes the compressibility heuristic (compress-force would be the option that compresses without this heuristic). For ZFS, the general recommendation is consequently also zstd, once it makes its way into a release.
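
As a sketch of what this could look like for btrfs (UUID and mount point are placeholders), the corresponding /etc/fstab entry might read:

UUID=<your-filesystem-uuid>  /data  btrfs  defaults,compress=zstd  0 0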

Conclusion

Concluding the experiments laid out in this blogpost, we can state an almost unconditional and surprisingly clear recommendation to simply use zstd for everything. The exact level might depend on the usecase, but overall zstd has proven to be the most versatile yet effective compressor around when it comes to effectiveness and speed, both for compression and especially decompression. Furthermore, in contrast to most other contenders, it has flags for built-in parallelization, which were not used in this blogpost at all, and yet zstd still stomped almost the entire competition.

Only if resulting filesize should be pushed down as much as possible, without any regard for computational cost, lzma retains an edge for most kinds of data. In practice, however, the conclusion is to simply use zstd.

Thanks to Joru and corvus for proofreading and helpful comments.


  1. Which is an easy, though slightly cheated, way of asking “how much CPU time do I have to burn on this?”. Obviously, there are plenty of other criteria that might be relevant depending on the particular usecase, such as memory consumption. Furthermore, all these criteria apply to decompression as well as compression, as we will investigate later, technically doubling the number of criteria we can take into consideration. ↩︎

  2. For algorithms that allow parallelized compressions, this might no longer necessarily be the case, but all data in this blogpost was gathered with non-parallelized compression for all tested algorithms. ↩︎

  3. Fun fact on the side: the entire benchmarking suite (including some more data that is not included in this blogpost) takes 61 days straight to run on the i5-3320M. Fortunately it’s a bit faster on newer CPUs. :D ↩︎

  4. Furthermore, mkinitcpio runs zstd with -T0 by default, which parallelizes compression to all available cores. This accelerates compression even further, but was not tested in this particular scenario and hence not included in the plot, as most compressors do not support parallelization. But even without parallelization, zstd still makes it to the Pareto frontier. There might be another blogpost upcoming to take a look at parallelization at some point, though… ↩︎

by Jonas Große Sundrup at 2021-03-02 22:39

2021-02-26

michael-herbst.com

Gator: a Python-driven program for spectroscopy simulations

A bit over a year ago we published our adcc code. In this work the aim was to develop a toolkit for computational spectroscopy methods focused on rapid development and interactive hands-on usage (see the blog article for details). Our target back then was to simplify method development involving the algebraic-diagrammatic construction approach (ADC) to compute excited-state energies and properties. ADC has been a research focus both of myself and of the group of Andreas Dreuw, and the ADC family of methods has proven in the past to be well suited for describing photochemistry and spectroscopic results.

Employing mainly thread-based parallelism and (apart from our recent inclusion of libxm) basically no options for swapping stored tensors to disk, adcc is naturally restricted to problems that fit into the main memory of a single cluster node. This is fine for developing and testing new ADC methods, but can be limiting for employing ADC methods in practice: The code can currently only treat small-sized to medium-sized molecules.

In parallel to adcc we therefore started working on the Gator project, in collaboration with the groups of Patrick Norman and Zilvinas Rinkevicius (both KTH Stockholm), which we now release in a first version. Apart from an interface to adcc, Gator features a response library capable of the complex polarisation propagator (CPP) approach for simulating properties such as excited-state polarisabilities, or enabling a direct computation of spectra including broadening. Additionally it contains a newly developed ADC(2) module with MPI-based distributed computing capabilities. For this the integral driver of the Veloxchem code from KTH is used, which allows the ADC(2) computation to be performed in a direct fashion (i.e. without storing the two-electron-integral tensor). This makes ADC(2) simulations in Gator more memory efficient and allows them to be distributed over a few cluster nodes. In this publication we provide an overview of Gator's current capabilities. The full abstract reads

The Gator program has been developed for computational spectroscopy and calculations of molecular properties using real and complex propagators at the correlated level of wave function theory. At present, the focus lies on methods based on the algebraic diagrammatic construction (ADC) scheme up to third-order of perturbation theory. A Fock matrix-driven implementation of the second-order ADC method for excitation energies has been realized with an underlying hybrid MPI/OpenMP parallelization scheme suitable for execution in high-performance computing cluster environments. With a modular and object-oriented program structure written in a Python/C++ layered fashion, Gator enables, in addition, time-efficient prototyping of novel scientific approaches as well as interactive notebook-driven training of students in quantum chemistry.

by Michael F. Herbst at 2021-02-26 23:30 under electronic structure theory, theoretical chemistry, adcc, algebraic-diagrammatic construction

2021-02-13

Insanity Industries

Tracking leftover packages with pacman

Automatically resolving and installing dependencies is one of the core features of package managers (and one of the most convenient). However, this can lead to packages being installed that have been pulled as a dependency for another package, but are no longer needed1. This can have slightly unfortunate side effects, such as crowding the upgrade dialog when a bunch of packages that are no longer needed receive updates in high frequency (looking at you, Haskell packages in Arch…).

To ease house cleaning, at least on distros using pacman, we can simply find these unneeded packages that used to be installed as a dependency via pacman -Qtd2. This will list all unrequired packages that were originally installed as a dependency instead of being explicitly installed by the user.

We could either periodically run this by hand, or we can simply tell pacman to just do it for us regularly. To do so, we create a new file /etc/pacman.d/hooks/unneeded-packages.hook3 containing

[Trigger]
Operation = Install
Operation = Upgrade
Operation = Remove
Type = Package
Target = *

[Action]
Description = "Checking for unneeded packages"
When = PostTransaction
Exec = /usr/bin/pacman -Qtd

This will run pacman -Qtd on every pacman-operation that can change our package state for every possible target package after all package transactions have been completed.

While this works, the output is not the most neatly arranged, so we can just modify our Exec a little to make it more beautiful, either by calling a script that does that or simply by a semi-beautiful, but compact one-liner, changing our pacman-hook to

[Trigger]
Operation = Install
Operation = Upgrade
Operation = Remove
Type = Package
Target = *

[Action]
Description = "Checking for unneeded packages"
When = PostTransaction
Exec = /usr/bin/bash -c "set -o pipefail && /usr/bin/pacman -Qtd | sed 's/^/  - /'  || /usr/bin/echo '  :: No unneeded packages found.'"

This now neatly displays superfluous packages right after every pacman -Syu, so you can clean up right away if you really don’t need those packages anymore.

Cleaning up superfluous packages

The quickest way of removing superfluous packages (and all of their dependencies that are only needed by these packages) is by running pacman -Rnsc $(pacman -Qdtq)2, which can be easily aliased to something like pacclean, so it is readily available at your fingertips afterwards without much typing.
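
Such an alias could, for example, be defined in your shell configuration as

alias pacclean='pacman -Rnsc $(pacman -Qdtq)'

The single quotes ensure that the command substitution is evaluated when the alias is used, not when it is defined.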

off-topic remark: dependency tracking

On the note of package dependencies: if you want to find out which package(s) pulled in a specific package, pactree -r specific_package_name will give you insight into that.


  1. This could happen for numerous reasons, for example because one removed the package needing it without telling the package manager to remove its dependencies as well, or because that package simply dropped the dependency. ↩︎

  2. Finding out what the single flags mean is left to the reader, if not known, the author highly recommends getting to know pacman’s tremendously helpful subcommand-helps, such as pacman -Qh. ↩︎

  3. The name of the file doesn’t matter, however, it must be located in the hooks-directory and end in .hook for pacman to read it. ↩︎

by Jonas Große Sundrup at 2021-02-13 00:00

2021-02-04

michael-herbst.com

CESMIX TST meeting: DFTK.jl: A multidisciplinary Julia code for density-functional theory development

These past two days I have participated in the Tri-Lab Support Team (TST) meeting of the CESMIX, the newly founded Center for the Exascale Simulation of Material Interfaces in Extreme Environments at the Massachusetts Institute of Technology. Within the next few years the idea of the CESMIX is to develop a multi-level simulation stack all the way up from DFT via MD to flow simulations, to be able to discover novel materials suitable for extremely high temperatures under atmospheric conditions. The prototypical application for such materials would be heat shields, for example in spacecraft returning to Earth or in supersonic planes.

One novel aspect of the project is to include progress from modern compiler techniques and programming language design when building the software stack. In particular the challenge is that multiple codes will be involved in the project, featuring a large variety of programming languages (FORTRAN, C++, Julia, Python, ...). On top of that, one goal is to keep track of simulation errors using uncertainty quantification (UQ) and use that insight to construct a multi-fidelity workflow. In such an approach the data generation for the simulation does not only employ a single accurate model, but in fact features multiple simulation layers based on cheaper and cruder models as well as more costly and accurate ones. Using the deduced knowledge of the error one can dynamically switch between these models and reach a compromise between accuracy and computational cost, but in a way that the result is still of sufficient quality to be comparable to experiments. As the employed fidelity layers the CESMIX project targets classical molecular dynamics or ab initio molecular dynamics with various kinds of density-functional theory (DFT) methods ... and this is the angle of my involvement in the project.

In particular, using our density-functional toolkit (DFTK) the idea is to be able to quickly prototype parts of the workflow. Then, building on DFTK's design as a multi-disciplinary platform (see my related blog articles), we want to start incorporating new techniques (GPU platforms, UQ, multi-fidelity) to see how they could fit. I am excited about the opportunity to contribute to a project which shares a lot of its philosophy with DFTK itself. In particular I am looking forward to seeing how DFTK will play out in a real-world research scenario for connecting the needs of the modelling side with the approaches of the computer science folks.

With respect to the TST meeting, I briefly gave an overview about DFTK to show where we are. Slides and a short demo are attached below.

Link Licence
DFTK.jl: A multidisciplinary Julia code for density-functional theory development (Slides) Creative Commons License
Benchmarks and DFTK demo (Tarball) Creative Commons License

by Michael F. Herbst at 2021-02-04 19:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, numerical analysis, Kohn-Sham, high-throughput, invited talk, DFT, solid state

2021-01-10

sECuREs website

A quick introduction to MQTT for IOT

While I had heard the abbreviation MQTT many times, I never had a closer look at what MQTT is.

Here are a few quick notes about using MQTT as Pub/Sub bus in a home IOT network.

Motivation

Once you have a few IOT devices, an obvious question is how to network them.

If all your devices are from the same vendor, the vendor takes care of it.

In my home, I have many different vendors/devices, such as (incomplete list):

  • a Nuki Opener connected to the intercom
  • an Aqara door sensor
  • IKEA TRÅDFRI Smart Lights
  • a Sonoff S26 Smart Power Plug running Tasmota

Here is how I combine these devices:

  • When I’m close to my home (geo-fencing), the Nuki Opener enables Ring To Open (RTO): when I ring the door bell, it opens the door for me.
  • When I open the apartment door, the Smart Lights in the hallway turn on.
  • When I’m home, my stereo speakers should be powered on so I can play music.

A conceptually simple way to hook this up is to connect things directly: listen to the Aqara Door Sensor and instruct the Smart Lights to turn on, for example.

But, connecting everything to an MQTT bus has a couple of advantages:

  1. Unification: everything is visible in one place, the same tools work for all devices.
  2. Your custom logic is uncoupled from vendor details: you can receive and send MQTT.
  3. Compatibility with existing software, such as Home Assistant or openHAB

Step 1. Set up an MQTT broker (server)

A broker is what relays messages between publishers and subscribers. As an optimization, the most recent value of a topic can be retained, so that e.g. a subscriber does not need to wait for the next change to obtain the current state.

The most popular choice for broker software seems to be Mosquitto, but since I like to run Go software on https://gokrazy.org/, I kept looking and found https://github.com/fhmq/hmq.

One downside of hmq might be that it does not seem to support persisting retained messages to disk. I’ll treat this as a feature for the time being, enforcing a fresh start on every daily reboot.

To restrict hmq to only listen in my local network, I’m using gokrazy’s flag file feature:

mkdir -p flags/github.com/fhmq/hmq
echo --host=10.0.0.217 > flags/github.com/fhmq/hmq/flags.txt

Note that you’ll need https://github.com/fhmq/hmq/pull/105 in case your network does not come up quickly.

MQTT broker setup: displaying/sending test messages

To display all messages going through your MQTT broker, subscribe using the Mosquitto tools:

% sudo pacman -S mosquitto
% mosquitto_sub --id "${HOST}_all" --host dr.lan --topic '#' --verbose

The # sign denotes an MQTT wildcard, meaning subscribe to all topics in this case.

Be sure to set a unique id for each mosquitto_sub command you run, so that you can see which subscribers are connected to your MQTT bus. Avoid id clashes, otherwise the subscribers will disconnect each other!

Now, when you send a test message, you should see it:

% mosquitto_pub --host dr.lan --topic 'cmnd/tasmota_68462F/Power' -m 'ON'

Tip: If you have binary data on your MQTT bus, you can display it in hex with timestamps:

% mosquitto_sub \
  --id "${HOST}_bell" \
  --host dr.lan \
  --topic 'doorbell/#' \
  -F '@Y-@m-@dT@H:@M:@S@z : %t : %x'

Step 2. Integrate with MQTT

Now that communication via the bus works, what messages do we publish on which topics?

MQTT only defines that topics are hierarchical; messages are arbitrary byte sequences.

There are a few popular conventions for what to put onto MQTT:

  • the Homie convention
  • the Home Assistant convention

If you design everything yourself, Homie seems like a good option. If you plan to use Home Assistant or similar, stick to the Home Assistant convention.

Best practices for your own structure

In case you want/need to define your own topics, keep these tips in mind:

  • devices publish their state on a single, retained topic
    • the topic name could be e.g. stat/tasmota_68462F/POWER
    • retaining the topic allows consumers to catch up after (re-)connecting to the bus
  • publish commands on a single, possibly-retained topic
    • e.g. publish ON to topic cmnd/tasmota_68462F/Power
    • publish the desired state: publish ON or OFF instead of TOGGLE
    • if you retain the topic and publish TOGGLE commands, your lights will mysteriously go off/on when they unexpectedly re-establish their MQTT connection
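
Putting these tips into practice, here is a minimal Go sketch using the Eclipse Paho client (the broker address and topics are the ones used throughout this article; treat it as an illustrative sketch rather than a finished program):

package main

import (
	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	opts := mqtt.NewClientOptions().AddBroker("tcp://dr.lan:1883")
	opts.SetClientID("example-device") // unique id, as with mosquitto_sub above
	client := mqtt.NewClient(opts)
	if token := client.Connect(); token.Wait() && token.Error() != nil {
		panic(token.Error())
	}

	// Publish the device state on a single, retained topic so that
	// subscribers can catch up after (re-)connecting to the bus:
	client.Publish("stat/tasmota_68462F/POWER", 0 /* qos */, true /* retained */, "ON")

	// Publish the desired state as a command (ON/OFF rather than TOGGLE):
	token := client.Publish("cmnd/tasmota_68462F/Power", 0, false, "ON")
	token.Wait()
}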

Integration: Shelly devices with MQTT built-in

Shelly has a number of smart devices that come with MQTT out of the box! This sounds like the easiest solution if you’re starting from scratch.

I haven’t used these devices personally, but I hear good things about them.

Integration: Zigbee2MQTT for Zigbee devices

Zigbee2MQTT supports well over 1000 Zigbee devices and exposes them on the MQTT bus.

For example, this is what you would use to connect your IKEA TRÅDFRI Smart Lights to MQTT.

Integration: ESPHome for micro controllers + sensors

The ESPHome system is a ready-made solution to connect a wide array of sensors and devices to your home network via MQTT.

If you want to use your own ESP-based micro controllers and sensors, this seems like the easiest way to get them programmed.

Integration: Mongoose OS for micro controllers

Mongoose OS is an IOT firmware development framework, taking care of device management, Over-The-Air updates, and more.

Mongoose comes with MQTT support, and with just a few lines you can build, flash and configure your device. Here’s an example for the NodeMCU (ESP8266-based):

% yay -S mos-bin
% mos clone https://github.com/mongoose-os-apps/demo-js app1
% cd app1
% mos --platform esp8266 build
% mos --platform esp8266 --port /dev/ttyUSB1 flash
% mos --port /dev/ttyUSB1 config-set mqtt.enable=true mqtt.server=dr.lan:1883

Pressing the button on the NodeMCU publishes a message to MQTT:

% mosquitto_sub --host dr.lan --topic devices/esp8266_F4B37C/events
{"ram_free":31260,"uptime":27.168680,"btnCount":2,"on":false}

Integration: Arduino for custom micro controller firmware

Arduino has an MQTT Client library. If your microcontroller is networked, e.g. an ESP32 with WiFi, you can publish MQTT messages from your Arduino sketch:

#include <WiFi.h>
#include <PubSubClient.h>

WiFiClient wificlient;
PubSubClient client(wificlient);

void callback(char* topic, byte* payload, unsigned int length) {
	Serial.print("Message arrived [");
	Serial.print(topic);
	Serial.print("] ");
	for (int i = 0; i < length; i++) {
		Serial.print((char)payload[i]);
	}
	Serial.println();

	if (strcmp(topic, "doorbell/cmd/unlock") == 0) {
		// …
	}
}

void taskmqtt(void *pvParameters) {
	for (;;) {
		if (!client.connected()) {
			client.connect("doorbell" /* clientid */);
			client.subscribe("doorbell/cmd/unlock");
		}

		// Poll PubSubClient for new messages and invoke the callback.
		// Should be called as infrequent as one is willing to delay
		// reacting to MQTT messages.
		// Should not be called too frequently to avoid strain on
		// the network hardware:
		// https://github.com/knolleary/pubsubclient/issues/756#issuecomment-654335096
		client.loop();
		vTaskDelay(pdMS_TO_TICKS(100));
	}
}

void setup() {
	connectToWiFi(); // WiFi configuration omitted for brevity

	client.setServer("dr.lan", 1883);
	client.setCallback(callback);

	xTaskCreatePinnedToCore(taskmqtt, "MQTT", 2048, NULL, 1, NULL, PRO_CPU_NUM);
}

void processEvent(void *buf, int telegramLen) {
	client.publish("doorbell/events/scs", buf, telegramLen);
}

Integration: Webhook to MQTT

The Nuki Opener doesn’t support MQTT out of the box, but the Nuki Bridge can send Webhook requests. In a few lines of Go, you can forward what the Nuki Bridge sends to MQTT:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func nukiBridge() error {
	opts := mqtt.NewClientOptions().AddBroker("tcp://dr.lan:1883")
	opts.SetClientID("nuki2mqtt")
	opts.SetConnectRetry(true)
	mqttClient := mqtt.NewClient(opts)
	if token := mqttClient.Connect(); token.Wait() && token.Error() != nil {
		return fmt.Errorf("MQTT connection failed: %v", token.Error())
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/nuki", func(w http.ResponseWriter, r *http.Request) {
		b, err := ioutil.ReadAll(r.Body)
		if err != nil {
			log.Print(err)
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		mqttClient.Publish(
			"zkj-nuki/webhook", // topic
			0, // qos
			true, // retained
			string(b)) // payload
	})

	return http.ListenAndServe(":8319", mux)
}

func main() {
	if err := nukiBridge(); err != nil {
		log.Fatal(err)
	}
}

See Nuki’s Bridge HTTP-API document for details on how to configure your bridge to send webhook callbacks.

Step 3. Express your logic

Home Assistant and Node-RED are both popular options, but also large software packages.

Personally, I find it more fun to express my logic directly in a full programming language (Go).

I call the resulting program regelwerk (“collection of rules”). The program consists of:

  1. various control loops that progress independently from each other
  2. an MQTT message dispatcher feeding these control loops
  3. a debugging web interface to visualize state

This architecture is by no means a new approach: as moquette describes it, this is to MQTT what inetd is to IP. I find moquette’s one-process-per-message model too heavyweight and clumsy to deploy to https://gokrazy.org, so regelwerk runs entirely in-process and is a single, easy-to-deploy binary, both for computers (for notifications) and for headless Raspberry Pis.

regelwerk: control loops definition

regelwerk defines a control loop as a stateful function that accepts an event (from MQTT) and returns messages to publish to MQTT, if any:

type controlLoop interface {
	sync.Locker

	StatusString() string // for human introspection

	ProcessEvent(MQTTEvent) []MQTTPublish
}

// Like mqtt.Message, but with timestamp
type MQTTEvent struct {
	Timestamp time.Time
	Topic     string
	Payload   interface{}
}

// Parameters for mqtt.Client.Publish()
type MQTTPublish struct {
	Topic    string
	Qos      byte
	Retained bool
	Payload  interface{}
}

regelwerk: MQTT dispatcher

Our MQTT message handler dispatches each incoming message to all control loops, in one goroutine per message and loop. With typical message volumes on a personal MQTT bus, this is a simple yet effective design that brings just enough isolation.

type mqttMessageHandler struct {
	dryRun bool
	loops  []controlLoop
}

func (h *mqttMessageHandler) handle(client mqtt.Client, m mqtt.Message) {
	log.Printf("received message %q on %q", m.Payload(), m.Topic())
	ev := MQTTEvent{
		Timestamp: time.Now(), // consistent for all loops
		Topic:     m.Topic(),
		Payload:   m.Payload(),
	}

	for _, l := range h.loops {
		l := l // copy
		go func() {
			// For reliability, we call each loop in its own goroutine
			// (yes, one per message), so that when one loop gets stuck,
			// the others still make progress.
			l.Lock()
			results := l.ProcessEvent(ev)
			l.Unlock()
			if len(results) == 0 {
				return
			}
			for _, r := range results {
				log.Printf("publishing: %+v", r)
				if !h.dryRun {
					client.Publish(r.Topic, r.Qos, r.Retained, r.Payload)
				}
			}
			// …input/output logging omitted for brevity…
		}()
	}
}

regelwerk: control loop example

Now that we have the definition and dispatching out of the way, let’s take a look at an actual example control loop.

This control loop looks at whether my PC is unlocked (in use) or whether my phone is home, and then turns my stereo speakers off/on accordingly.

The inputs come from runstatus and dhcp4d, the output goes to a Sonoff S26 Smart Power Plug running Tasmota.

type avrPowerLoop struct {
	statusLoop // for l.statusf() debugging

	midnaUnlocked          bool
	michaelPhoneExpiration time.Time
}

func (l *avrPowerLoop) ProcessEvent(ev MQTTEvent) []MQTTPublish {
	// Update loop state based on inputs:
	switch ev.Topic {
	case "runstatus/midna/i3lock":
		var status struct {
			Running bool `json:"running"`
		}
		if err := json.Unmarshal(ev.Payload.([]byte), &status); err != nil {
			l.statusf("unmarshaling runstatus: %v", err)
			return nil
		}
		l.midnaUnlocked = !status.Running

	case "router7/dhcp4d/lease/Michaels-iPhone":
		var lease struct {
			Expiration time.Time `json:"expiration"`
		}
		if err := json.Unmarshal(ev.Payload.([]byte), &lease); err != nil {
			l.statusf("unmarshaling router7 lease: %v", err)
			return nil
		}
		l.michaelPhoneExpiration = lease.Expiration

	default:
		return nil // event did not influence our state
	}

	// Publish desired state changes:
	now := ev.Timestamp
	phoneHome := l.michaelPhoneExpiration.After(now)
	anyoneHome := l.midnaUnlocked || (now.Hour() > 8 && phoneHome)
	l.statusf("midnaUnlocked=%v || (now.Hour=%v > 8 && phoneHome=%v)",
		l.midnaUnlocked, now.Hour(), phoneHome)

	payload := "OFF"
	if anyoneHome {
		payload = "ON"
	}
	return []MQTTPublish{
		{
			Topic:    "cmnd/tasmota_68462F/Power",
			Payload:  payload,
			Retained: true,
		},
	}
}

Conclusion

I like the Pub/Sub pattern for home automation, as it nicely uncouples all components.

It’s a shame that standards such as The Homie convention aren’t more widely supported, but it looks like software makes up for that via configuration options.

There are plenty of existing integrations that should cover most needs.

Ideally, more Smart Home and IOT vendors would add MQTT support out of the box, like Shelly.

at 2021-01-10 14:26

2020-12-26

atsutane

polkit and systemd System Units

What this post refers to as a system unit is a systemd unit whose processes run in the system.slice, which corresponds to most service units. Under systemd, users by default do not have the permission to influence the state of such units via systemd itself. It is, however, possible to allow users or groups to do this for specific commands by means of polkit.

When executing most commands, systemd checks via polkit whether the user initiating the command is allowed to do so; for commands like systemctl status or systemctl show this does not happen. For the commands that start, stop or restart a unit, this permission can be granted very specifically; for commands that are handled via a different policy, such a specific restriction is not possible, for example enabling or disabling units. For systemctl start|stop|restart the action org.freedesktop.systemd1.manage-units is used, for systemctl enable|disable it is org.freedesktop.systemd1.manage-unit-files, and in the systemd code the latter two commands are not checked against as many specific details as the first three.

An exemplary use case for such rules would be a group of users on a development server who must not operate as root in any form. The developers should nevertheless be able to deploy the software they develop to the server and run it in exactly the form in which it will later operate in a production landscape. This requires a global service unit and a polkit rule that allows the users to manage the service via systemctl start|stop|restart. Whether these are human users or the user of a continuous integration solution that deploys the build results is left open; by some magic, the deployment of course always happens absolutely correctly anyway.

The service unit foobar.service

[Unit]
Description=foobar is the most creative name for anything in the whole universe.

[Service]
Type=simple
ExecStart=/opt/foobar/executable
User=foobar
Group=foobar
# Imagine lots of other useful settings here

[Install]
WantedBy=multi-user.target

The corresponding polkit rule /etc/polkit-1/10-foobar.rules

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units" &&
        (subject.isInGroup("developer") && action.lookup("unit") == "foobar.service" &&
        (action.lookup("verb") == "start" ||
         action.lookup("verb") == "stop"  ||
         action.lookup("verb") == "restart"))) {
        return polkit.Result.YES;
    }
});

For more granular restrictions, the username is available as subject.user in addition to the group.

by Thorsten Töpper at 2020-12-26 15:07 under howto, systemd, polkit

2020-12-23

michael-herbst.com

High-throughput density-functional theory calculations: An interdisciplinary challenge

Last Thursday I was invited to give a virtual talk at the Scientific Computing Seminar of the working group of Prof. Nicolas Gauger at TU Kaiserslautern. Since the research in Prof. Gauger's group mostly concerns topics which are not directly related to electronic structure theory and density-functional theory (DFT), I chose to present my current research from a rather broad and introductory angle. The main focus of my talk was thus to hint at the interdisciplinary challenges arising in high-throughput methods in DFT simulations, followed by a summary of a few of my recent projects in the field.

I was glad for the opportunity to spread the word about the difficulties with high-throughput methods in DFT: firstly because I think it is an absolutely fascinating topic, but secondly because it is one where input from fields beyond the standard "culprits" of chemistry, physics and materials science is beneficial for solving the upcoming problems. In fact, exactly this hope of getting other fields and people with non-standard backgrounds involved was one of the driving forces behind our DFT code, the density-functional toolkit (DFTK), which I also briefly presented.

I hope that my talk got some of the audience more interested in DFT and I look forward to continuing the discussions at a later point, hopefully meeting the Gauger group in person in Kaiserslautern.

Link Licence
High-throughput density-functional theory calculations: An interdisciplinary challenge (Slides) Creative Commons License

by Michael F. Herbst at 2020-12-23 17:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, numerical analysis, Kohn-Sham, high-throughput, invited talk, DFT, solid state

2020-11-30

sECuREs website

Fixing the Nuki Opener smart intercom IOT device (on the BTicino SCS bus intercom system)

I recently bought a Nuki Opener, which “turns your existing intercom into a smart door opener”.

Unfortunately, I have had a lot of trouble getting it to work.

I finally got the device working by interjecting my own micro controller between the intercom bus and the Nuki Opener, then driving the Nuki Opener in its Analogue mode:

The rest of this article outlines how this setup works at a high level.

Prerequisites

For reliable interpretation and transmission of SCS bus data, we’ll need:

  1. SCS receive/transmit circuits. These can be prototyped on a breadboard if you have the required diodes, transistors, resistors and capacitors.

  2. A microcontroller with an Analog Comparator. If your microcontroller has one, you’ll find a corresponding section in the datasheet. This function is sometimes abbreviated to CMP or AC, or might be part of a larger Analog/Digital Converter (ADC).

  3. A UART (serial) decoder. Most microcontrollers have at least one UART, but if you don’t have one available for whichever reason, you could use a software UART implementation, too.

SCS receive circuit

An R-C network, directly connected to the SCS bus, is used for incoming signal conditioning.

The resistor values have been chosen to divide the voltage of the input signal from 28V down to approx. 2V, i.e. well within the 0-3.3V range for modern microcontroller GPIO pins.

A zener diode limits the 28V level to 3.3V, which should be safe for most microcontrollers.

Simulation: https://tinyurl.com/yxhrkejn


SCS transmit circuit

We directly connect the gate of a mosfet transistor to a GPIO pin of our microcontroller, so that when the microcontroller drives the pin high, we use the 100Ω resistor to attach a load to the SCS bus.

For comparison, the KNX bus, which is similar to the SCS bus, uses a 68Ω resistor here.

Simulation: https://tinyurl.com/y6nv4yg7


SCS lab setup

Use a lab power supply to generate 28V DC. I’m using the Velleman LABPS3005SM because it was in stock at Galaxus, but any power supply rated for at least 30V DC will do.

As the DIY home automation blog entry “A minimal KNX setup” describes, you’ll need to place a 47Ω resistor between the power line and your components.

Afterwards, just connect your components to the bus. The supply/ground line of a breadboard will work nicely.

SCS lab setup

Micro Controller choice

In this blog post, I’m using a Teensy 4 development board that is widely available for ≈20 USD:

Teensy 4

With its 600 MHz, the Teensy 4 has enough sheer clock frequency to allow for sloppier coding while still achieving high quality input/output.

The teensy tiny form factor (3.5 x 1.7 cm) works well for this project and will allow me to store the microcontroller in an existing intercom case.

The biggest downside is that NXP’s own MCUXpresso IDE cannot target the Teensy 4!

The only officially supported development environment for the Teensy 4 is Teensyduino, which is a board support package for the Arduino IDE. Having Arduino support is great, but let’s compare:

I also have NXP’s MIMXRT1060-EVK eval kit, which uses the same i.MX RT1060 micro controller family as the Teensy 4, but is much larger and comes with all the bells and whistles; notably:

  1. The MCUXpresso IDE works with the eval kit’s built-in debugger out of the box! Being able to inspect a stack trace, set breakpoints and look at register contents are invaluable tools when doing micro controller development.
  2. The MCUXpresso IDE comes with convenient graphical Pin and Clock config tools. Setting a pin’s alternate function becomes a few clicks instead of hours of fumbling around.
  3. The NXP SDK contains a number of drivers and examples that are tested on the eval kit. That makes it really easy to get started!

Each of these points is very attractive on its own, but together they make the whole experience so different!

Being able to deploy to the Teensy from MCUXpresso would be a killer feature! So many NXP SDK examples would suddenly become available, filling the Teensy community’s gaps.

Signal Setup

On a high level, this is how we are going to connect the various signals:

Step 1. We start with the SCS intercom bus signal (28V high, 22V low):

Step 2. Our SCS receive circuit takes the bus signal and divides it down to 2V:

voltage-divided SCS signal

Step 3. We convert the voltage-divided analog signal into a digital SCSRXOUT signal:

Analog Comparator output signal

Step 4. We modify our SCSRXOUT signal so that it can be sampled at 50%:

modified SCS signal

Step 5. We decode the signal using our micro controller’s UART:

Teensy 4 UART decodes SCS

Micro Controller firmware

Once I complete the next revision of the SCS interface PCB, I plan to release all design files, schematics, sources, etc. in full.

Until then, the following sections describe how the most important parts work, but skip over the implementation-specific glue code that wires everything together.

Analog Comparator

The Analog Comparator in our microcontroller lets us know whether a voltage is above or below a configured threshold voltage by raising an interrupt. A good threshold is 1.65V in my case.

In response to the voltage change, we set GPIO pin 15 to a digital high (3.3V) or low (0V) level:

volatile uint32_t cmpflags;

// ISR (Interrupt Service Routine), called by the Analog Comparator:
void acmp1_isr() {
  cmpflags = CMP1_SCR;

  { // clear interrupt status flags:
    uint8_t scr = (CMP1_SCR & ~(CMP_SCR_CFR_MASK | CMP_SCR_CFF_MASK));
    CMP1_SCR = scr | CMP_SCR_CFR_MASK | CMP_SCR_CFF_MASK;
  }

  if (cmpflags & CMP_SCR_CFR_MASK) {
    // See below! This line will be modified:
    digitalWrite(15, HIGH);
  }

  if (cmpflags & CMP_SCR_CFF_MASK) {
    digitalWrite(15, LOW);
  }
}

This signal can easily be verified by attaching an oscilloscope probe each to the voltage-divided SCSRX bus signal input and to the SCSRXOUT GPIO pin output:

Analog Comparator output signal

Analog Comparator Modification

There is one crucial difference between SCS and UART:

To transmit a 0 (or start bit):

  • SCS is low 34μs, then high 70μs
  • UART is low the entire 104μs

UART implementations typically sample at 50%, the middle of the bit period.

For SCS, we would need to sample at around 20%, because the signal returns to high so quickly: at 9600 baud one bit period lasts ≈104μs, so the usual 50% sample point (≈52μs into the bit) already falls into the high portion of a zero bit, which is only low for the first 34μs.

While setting a custom sample point is possible in e.g. sigrok’s UART decoder, neither software nor hardware serial implementations on micro controllers typically support it.

On a micro controller it is much easier to just modify the signal so that it can be sampled at 50%.

In practical terms, this means modifying the acmp1_isr function to return to high later than the Analog Comparator indicates:

volatile uint32_t cmpflags;

// ISR (Interrupt Service Routine), called by the Analog Comparator:
void acmp1_isr() {
  cmpflags = CMP1_SCR;

  { // clear interrupt status flags:
    uint8_t scr = (CMP1_SCR & ~(CMP_SCR_CFR_MASK | CMP_SCR_CFF_MASK));
    CMP1_SCR = scr | CMP_SCR_CFR_MASK | CMP_SCR_CFF_MASK;
  }

  if (cmpflags & CMP_SCR_CFR_MASK) {
    // Instead of setting our output pin high immediately,
    // we delay going up by approx. 40us,
    // turning the SCS signal into a UART signal:
    delayMicroseconds(40);
    digitalWrite(15, HIGH);
  }

  if (cmpflags & CMP_SCR_CFF_MASK) {
    digitalWrite(15, LOW);
  }
}

You can now read this signal using your laptop and a USB-to-serial adapter!

On a micro controller, we now feed this signal back into a UART decoder. For prototyping, this can literally mean a jumper wire connecting the output GPIO pin with a serial RX pin. Some micro controllers also support internal wiring of peripherals, allowing you to get rid of that cable.
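For the prototyping route, a minimal Teensyduino sketch for this step could look as follows (the pin number matches the one used in this post, but which serial port to use, and the Analog Comparator setup itself, are glue code I'm glossing over here):

void setup() {
  pinMode(15, OUTPUT);   // SCSRXOUT, driven from acmp1_isr()
  Serial1.begin(9600);   // RX1 is jumpered to pin 15; the modified signal now reads as plain 9600 8N1
}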

SCS RX (receive)

With the SCS intercom bus signal bytes now available through the UART decoder, we can design a streaming SCS decoder. The decoder self-synchronizes and skips invalid SCS telegrams by checking their checksum. We start with a ring buffer and a convenience working copy:

constexpr int telegramLen = 7;

typedef struct {
  // circular buffer for incoming bytes, indexed using cur
  uint8_t buf[telegramLen];
  int cur;

  uint8_t tbuf[telegramLen];
} scsfilter;

Each byte we receive from the UART, we store in our ring buffer:

void sf_WriteByte(scsfilter *sf, uint8_t b) {
  sf->buf[sf->cur] = b;
  sf->cur = (sf->cur + 1) % telegramLen;
}

After every byte, we can check whether the ring buffer contains a complete, valid SCS telegram, and whether it is a ring telegram for our apartment:

bool sf_completeAndValid(scsfilter *sf) {
  const uint8_t prev = sf->buf[(sf->cur+(telegramLen-1))%telegramLen];
  if (prev != 0xa3) {
    return false; // incomplete: previous byte not a telegram termination
  }

  // Copy the whole telegram into tbuf; makes working with it easier:
  for (int i = 0; i < telegramLen; i++) {
    sf->tbuf[i] = sf->buf[(sf->cur+i)%telegramLen];
  }

  const uint8_t stored = sf->tbuf[5];
  const uint8_t computed = sf->tbuf[1] ^
    sf->tbuf[2] ^
    sf->tbuf[3] ^
    sf->tbuf[4];
  if (stored != computed) {
    return false; // corrupt? checksum mismatch
  }

  return true;
}

int sf_ringForApartment(scsfilter *sf) {
  if (!sf_completeAndValid(sf)) {
    return -1;
  }

  if (sf->tbuf[3] != 0x60) {
    return -1; // not a ring command
  }

  if (sf->tbuf[1] != 0x91) {
    return -1; // not sent by the intercom house station
  }

  return (int)(sf->tbuf[2]); // apartment id
}
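
To tie the pieces together, here is a rough usage sketch, not a definitive implementation: it assumes the UART wiring from above and that apartment id 3 is ours (the destination address seen in the earlier captures):

scsfilter sf = {};  // zero-initialized ring buffer state

void loop() {
  while (Serial1.available() > 0) {
    sf_WriteByte(&sf, (uint8_t)Serial1.read());
    const int apartment = sf_ringForApartment(&sf);
    if (apartment == 3) {
      // Ring for our apartment: e.g. signal the Nuki Opener in Analogue mode.
    } else if (apartment != -1) {
      // Ring for a neighbor: ignore.
    }
  }
}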

SCS TX (send)

Conceptually, writing serial data to a GPIO output from software is what e.g. the Arduino SoftwareSerial library does; there are plenty of such implementations for different micro controllers. This technique is also sometimes called "bit banging".

I started with the Teensy SoftwareSerial::write implementation and modified it to:

  1. Invert the output to drive the SCS transmit circuit’s MOSFET gate, i.e. low on idle and high on transmitting a 0 bit.

  2. Return to idle 70μs earlier than the signal would, i.e. after ≈34μs already.

The modified write function looks like this:

#define V27 LOW
#define V22 HIGH

#define scs0() do { \
  while (ARM_DWT_CYCCNT - begin_cycle < (target-43750/*70us*/)) ; \
  digitalWriteFast(11, V27); \
} while (0)

size_t SCSSerial::write(uint8_t b)
{
  elapsedMicros elapsed;
  uint32_t target;
  uint8_t mask;
  uint32_t begin_cycle;

  ARM_DEMCR |= ARM_DEMCR_TRCENA;
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
  ARM_DWT_CYCCNT = 0;

  // start bit
  target = cycles_per_bit;
  noInterrupts();
  begin_cycle = ARM_DWT_CYCCNT;
  digitalWriteFast(11, V22);
  scs0();
  wait_for_target(begin_cycle, target);

  // 8 data bits
  for (mask = 1; mask; mask <<= 1) {
    if (b&mask) {
      digitalWriteFast(11, V27);
    } else {
      digitalWriteFast(11, V22);
    }
    target += cycles_per_bit;
    scs0();
    wait_for_target(begin_cycle, target);
  }

  // stop bit
  digitalWriteFast(11, V27);
  interrupts();
  target += cycles_per_bit;
  scs0();
  while (ARM_DWT_CYCCNT - begin_cycle < target) ; // wait
  return 1;
}

It works!

With the approach described above, I now have a micro controller that recognizes doorbell rings for my apartment and ignores doorbell rings for my neighbors. The micro controller can unlock the door, too, and both features are available through the Nuki Opener.

How is the Nuki Opener?

It took over 2 months before I saw the Nuki Opener working correctly for the first time.

I really hope the Nuki developers can work with what I described above and improve their product’s reliability for all customers with an SCS intercom system!

The device itself seems useful and usable, but time will tell how reliable it turns out to be in practice. I did notice that push notifications for door rings sometimes came in rather late (many seconds after the ring).

I’ll keep an eye on this and explore the various Nuki APIs more.

Appendix: Project Journal

  • 2020-09-26: I buy a Nuki Opener (Nuki Opener #1), but despite connecting it correctly, it never successfully opens the door. I start learning about the SCS home automation bus system that our intercom uses.
  • 2020-09-28: I publish an SCS bus decoder for sigrok and contact the Nuki Support.
  • 2020-10-15: I buy another Nuki Opener (Nuki Opener #2) to test their old firmware version, because downgrading firmware versions is impossible. Opener #2 actually opens the door, so I assume we are dealing with a firmware problem [turns out incorrect later].
  • 2020-10-16: I publish a detailed analysis of the Nuki Opener not sending the correct signal for the Nuki developers to go through.
  • 2020-11-03: I update my new Nuki Opener #2 to the latest firmware and realize that my old Nuki Opener #1 most likely just has some sort of hardware defect. However, Opener #2 has trouble detecting the ring signal: either it doesn’t detect any rings at all, or it detects all rings, including those for my neighbors!
  • 2020-11-16: In their 13th (!) email reply, Nuki Support confirms that the Opener firmware is capturing and matching the incoming ring signal, if I understand their developers correctly.
  • 2020-11-18: I suggest to Nuki developers (via Nuki Support) to decode the SCS signal with a UART decoder instead of comparing waveforms. This should be a lot more reliable!
  • 2020-11-23: My self-designed SCS receiver/transmitter/power supply PCB arrives. The schematics are based on existing SCS DIY work, but I created my own KiCad files because I was only interested in the SCS bus interface, not the PIC microcontroller they used.
  • 2020-11-25: Working on the intercom, I assume some wire touched an unlucky spot, and my BTicino intercom went up in smoke. We enabled the Nuki Opener’s ring sound and started using it as our main door bell. This meant we now started hearing the ring sound for (some) of our neighbors as well.
  • 2020-11-26: My Teensy 4 microcontroller successfully decodes the SCS bus signal with its Analog Comparator and UART decoder.
  • 2020-11-28: My Teensy 4 microcontroller is deployed to filter the SCS bus ring signal and drive the Nuki Opener in analogue mode.

at 2020-11-30 07:12

2020-11-14

RaumZeitLabor

Hooray, we stick around! A sticker campaign for the Remote Chaos Experience (rc3)

Even though quite a bit of the Congress feeling will surely be missing without the clinking of toppled Mate bottles, the scratchy edge of the entry wristband, the smell of fried-noodle-stand grease steam in your clothes and the echo of the trade fair's glass halls, the experience of discovering (sticker) treasure should not fall by the wayside.

For this we have launched the campaign "Schicke Sticker von den Sticker-Schickern" ("fancy stickers from the sticker senders").

If you are in the mood for a sticky surprise, you can send an addressed, stamped return envelope (ideally C5) to our new address by December 4, 2020, and we will send you stickers and other goodies in return.

Apart from the envelope, the campaign is free of charge. Of course, we are still happy about general donations to our bank account or a contribution via PayPal, to cover all costs and to invest whatever is left into our new space.

If you have any questions, you can reach us on Twitter and Mastodon.

at 2020-11-14 00:00

2020-11-13

RaumZeitLabor

Achievement unlocked, move accomplished

One day before our moving deadline, we swept the very last remnants out of the old premises and handed the keys over to our former landlord. The Boveristraße era is thus finally history.

Our electronics lab, the wood workshop, the FabLab and our kitchen have found a new home in the Weinheimer Straße. Our Olymp will also live on in a different form at the Breidenbach Studios in Heidelberg.

We are looking forward to the next 10(0) years of RaumZeitLabor! To finished and abandoned ideas and projects, talks, celebrations, food and themed parties, Nerd am Herd, Bits & Bites and embroidery afternoons. To 3D printing and laser cutting sessions, programming workshops, Agenda Aktion, GnoPN, Large Hackerspace Conventions, movie and board game nights, Mario Kart tournaments, game jams, excursions and museum visits, trips to events, and above all lots and lots of popcorn!

by flederrattie at 2020-11-13 00:00

2020-10-21

michael-herbst.com

Challenges and prospects of a posteriori error estimation in density-functional theory

On Wednesday last week I was invited to give a talk at the group seminar of the AG Christoph Jacob at TU Braunschweig, Germany. Christoph was especially interested in our recent publication on a posteriori error estimation in Kohn-Sham problems and so I decided to use the opportunity to give a broad introduction into the topic. Since the main audience for my talk were chemists I motivated our work from the context of high-throughput density-functional theory calculations, which are becoming more and more of interest in practice. In the second part of my talk I tried to lay out the basic ideas and challenges of a posteriori error estimation and what the key aspects are to understand and overcome if one wants to provide a useful error bound for a particular problem (see also this article for more details). In the last part of the talk I turned my attention specifically to our recent contribution, discussing error estimates in the context of simple Kohn-Sham problems. Many of the delicate details of our work I did not touch upon, but nevertheless this last section of my talk gives a very good overview of our approach, our main underlying assumptions and our results. While we have focused solely on plane-wave basis sets in our work, I briefly hinted at whether extensions to other basis sets could be achieved with similar methods.

Unfortunately the whole "visit" to Braunschweig was completely virtual, which really was a shame as I would have loved to return to the city and spend a day with the group. Despite these circumstances, though, the discussions with Christoph and his group did not really fall short. During the day I had many video call sessions where I had ample opportunity to discuss with PhD students about their projects and in this way get a good idea of the interesting research going on in Braunschweig. As that's typically my favourite part about visiting a group I'm very happy this worked out so well (thanks to everyone who I managed to talk to!) and I already look forward to properly meeting everyone in person when the usual conference schedules are back in place.

The slides from my talk are attached below.

Link Licence
Challenges and prospects of a posteriori error estimation in density-functional theory (Slides) Creative Commons License

by Michael F. Herbst at 2020-10-21 16:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, error estimates, numerical analysis, Kohn-Sham, high-throughput, invited talk

2020-09-30

michael-herbst.com

Moansi: Inhomogeneous preconditioning for density-functional theory

Last week (24th and 25th September) I attended the 4th annual meeting of the Moansi (Modelling, analysis and simulation of Molecular Systems) work group of the GAMM. I already attended the Moansi meeting last year, where I very much enjoyed both the broad range of talks at the interdisciplinary border of maths, chemistry and physics as well as the familial atmosphere with lively and stimulating discussions. In the same spirit I was looking forward to this year's edition, which, in light of the global pandemic, however, had to be held virtually. Nevertheless I was able to take away many stimulating impressions from the two days of the meeting and I already look forward to next year, where we hopefully meet again in person.

During the meeting I myself presented our recently published work on an LDOS-based SCF preconditioner, which is especially suitable for DFT calculations in inhomogeneous systems (metallic slabs or clusters). See this blog article and my slides for details.

Link Licence
Black-box inhomogeneous preconditioning for density-functional theory (Slides) Creative Commons License

by Michael F. Herbst at 2020-09-30 20:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, SCF, high-throughput

2020-09-28

sECuREs website

Nuki Opener with an SCS bus intercom (bTicino 344212)

I have long been looking for a way to make my intercom a little more pleasant.

Recently, a friend made me aware of the Nuki Opener, which promises to make existing intercom systems smart, and claims to be compatible with the specific intercom I have!

So I got one and tried setting it up, but could not get it to work.

This post documents how I have analyzed what goes over the intercom’s SCS bus. Perhaps the technique is interesting, or perhaps you want to learn more about SCS :)

Note that I have not yet used the Nuki Opener, so I can’t say anything about it yet. What I have seen so far makes a good impression, but it just does not seem to work at all with my intercom. I will update this article after working with the Nuki support to fix this.

Connecting the Nuki Opener to the bTicino 344212

First, I identified which wires are used for the bus: the internet tells me that I should expect to measure ≈27V between BUS- and BUS+, and indeed a multimeter shows:

BTicino multimeter

I then connected the Nuki Opener as described in “Connect the Nuki Opener to an unknown intercom”, Page 8, Bus intercoms → Basic setup without doorbell suppression:

Nuki wire Intercom Signal
black BUS- GND
red BUS+ SCS (+27V)
orange BUS+ SCS (+27V)
BTicino wiring

I had previously tried the enhanced setup with doorbell suppression, as the Nuki app recommends, but switched to the simplest setup possible when capturing the signal.

Configuring the Nuki Opener

With the Nuki app, I configured the Opener either as:

  • bTicino → 344212
  • Generic → Bus (SCS)
  • Unknown intercom

Unfortunately, with all configurations:

  1. The app says it learned the door open signal successfully.
  2. The device/app does react to door rings.
  3. The device never successfully opens the door.

Capturing the SCS bus with sigrok

The logic analyzer that I have at home only works with signals under 5V. As the SCS bus is running at 27V, I’m capturing the signal with my Hantek 6022BE USB oscilloscope.

sigrok is a portable, cross-platform, free open source signal analysis software suite and supports the Hantek 6022BE out of the box, provided you have at least version 0.1.4 of the sigrok fx2lafw package installed.

Check out sigrok’s “Getting started with a logic analyzer” if you’re new to sigrok!

The Nuki Opener has 3 different pin headers you can use, depending on where you want to attach it on your wall. These are connected straight through, so I used them to conveniently grab BUS+ and BUS- just like the Nuki sees it:

BTicino capture

I set the oscilloscope probe head to its 10X divider setting, so that I had the full value range available, then started sampling 5M samples at 500 kHz:

sigrok PulseView screenshot

You can see 10s worth of signal. The three bursts are transmissions on the SCS bus.

The labeling didn’t quite match for me: it shows e.g. 3.2V instead of 27V, but as long as the signal comes in clearly, it doesn’t matter if it is offset or scaled.

SCS bus decoding with sigrok: voltage levels

Let’s tell sigrok what voltage level corresponds to a low or high signal:

  1. left-click on channel CH1
  2. set “conversion” to “to logic via threshold”
  3. set “conversion threshold” to 3.0V

Now you’ll see not only the captured signal, but also the logical signal below in green:

sigrok PulseView screenshot

SCS bus decoding with sigrok: SCS decoder

Now that we have obtained a logical/digital signal (low/high), we can write a sigrok decoder for the SCS bus. See sigrok’s Protocol decoder HOWTO for an introduction.

In general, I strongly recommend investing in tooling, in particular when decoding protocols. Spending a few minutes to an hour at this stage will minimize mistakes and save lots of time later, and—when you contribute your tooling—enable others to do more interesting work!

I found it easy to write a sigrok decoder, having never used their API before. It was quick to get something onto the screen, mistakes were easy to correct, and the whole process was nicely iterative.

Until it is merged and released with a new version of libsigrokdecode, you can find my SCS decoder on GitHub.

The decoder looks at every layer of an SCS telegram: the start/stop bits, the data bits, the value and the value’s logical position/function in the SCS telegram.

SCS full

Our SCS decoder displays the 3 bursts on the SCS bus when we ring the doorbell:

SCS bus door ring SCS bus door ring SCS bus door ring

Only the middle burst sets a destination address of 0x3, the configured number of my intercom system. I am not sure what the first and last burst indicate!

The SCS bus activity when opening the door seems more clear:

SCS bus door open SCS bus door open

These 2 bursts are sent one second apart, and only differ in the request parameter field: my guess is that 0xa4 means “start buzzing the door open” and 0xa0 means “stop buzzing the door open”.

I’m not sure why all these bursts repeat their SCS telegrams 3 times. My understanding was that SCS telegrams are repeated only when they are not acknowledged, and I indeed see no acknowledgement telegrams in my captures. Does that mean something is wrong with our intercom and it only works due to retransmissions?

SCS bus decoding with sigrok git: UART+SCS decoder

As Gerhard Sittig pointed out, in the git version of libsigrokdecode, one can use the existing UART decoder to decode SCS:

  1. Set Baud rate to 9600
  2. Set Sample point to 20%

This seems a little more robust than my cobbled-together SCS decoder from above :)

In addition to the UART decoder, we can still use a custom SCS decoder to label individual bytes within an SCS telegram according to their function, and do CRC checks.

Captured SCS telegrams

You can find my most recent captures in 2020-09-27-rohdaten-klingel-rev2.zip:

  • 2020-09-27-anlern-01-open-PUR-filtered.srzip is the door buzzer
  • 2020-09-27-anlern-02-klingel-PUR-filtered.srzip is the bell ringing

To extract the interesting parts from the sigrok files, I:

  1. Click the Show Cursors icon in PulseView’s toolbar.
  2. Position the left and right cursor edges such that the signal of interest is selected.
  3. Click the drop-down next to the Save icon and select Save Selected Range As.

Further reading

I used the following sources; please let me know of any others!

at 2020-09-28 06:43

2020-09-15

michael-herbst.com

Faraday Discussions: New horizons in density functional theory

Following the submission of our paper on a posteriori error estimation in the Kohn-Sham equations a few months ago I was recently invited to present our work at the Faraday Discussions on New horizons in density functional theory. Being amongst speakers such as Kieron Burke, Andreas Savin or Weitao Yang this was truly a great honour.

Even though the conference had to be virtual, I enjoyed it very much, especially because of its very unusual format. Unlike most other conferences, where the presenting author typically does their thing for like 30 minutes, followed by just a few questions, the situation is completely reversed for the Faraday discussions. Since we had to submit our paper already months in advance (and this was shared with the other participants) the content of my talk was already known to the audience. The main chunk of the time at the conference was therefore allocated to the discussion and not to the presentation. My few slides therefore only briefly recap our work and hint at the general motivation and outlook. As I had hoped, our work did indeed stimulate an intense discussion with interesting and stimulating questions, which I really appreciated (thanks to everyone who asked or commented). As far as I understand, both the paper as well as a transcript of the discussion will be part of the official Faraday Discussions conference proceedings, which will be published by the Royal Society of Chemistry soon. As per usual my slides are attached below.

Link Licence
A posteriori error estimation for the non-self-consistent Kohn-Sham equations (Slides) Creative Commons License

by Michael F. Herbst at 2020-09-15 13:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, error estimates, numerical analysis, Kohn-Sham, high-throughput

2020-09-06

RaumZeitLabor

Käfertal remains Käfertal remains Käfertal

Hard to believe, but true: the search for a new location is over! The RaumZeitLabor stays in Käfertal and moves 1.3 kilometres (as the crow flies) to Weinheimer Straße 58–60.

There is still plenty to do here. Walls need to be moved or put up, the basement needs to be renovated and made fit for use as a workshop, and the new layout of all the rooms has to be thought through. But the "old" RZL and the actual planning of the move should not be neglected amid all the anticipation for the new location either.

We continue to welcome participation of all kinds. You will find out how things proceed over the coming weeks on the mailing list and surely also via Twitter/Mastodon.

by flederrattie at 2020-09-06 00:00

2020-09-03

michael-herbst.com

Black-box inhomogeneous preconditioning for self-consistent field iterations in density-functional theory

For the past half a year or so Antoine Levitt and I have been looking at a particularly tricky business for solid-state density-functional theory (DFT) calculations, namely how to design efficient self-consistent field (SCF) schemes for large inhomogeneous systems. I have already previously reported on this matter in a short talk at the seminar of our interdisciplinary working group, but now our results have reached a stage suitable for publication.

The underlying problem we are tackling in our work is that for large systems, meaning increased sizes of the unit cell, the SCF iterations become harder and harder to solve. Mathematically speaking the (spectral) condition number of the fixed point iterations underlying the SCF procedure increases rather drastically in such cases, leading to very slow convergence. For example in aluminium the number of iterations required to converge an SCF with a damped iteration scheme (the simplest one) increases quadratically with the system size. This quickly makes calculations intractable and multiple more sophisticated approaches have therefore been developed over the years. As is detailed in our work there are mainly two orthogonal directions of attack. The first is to black-box "accelerate" the convergence by using the so-called Anderson (or Pulay or DIIS) scheme. This reduces the growth of iterations with system size from quadratic to linear (in the aluminium example), which is a good start. The second approach is to use a carefully designed preconditioner for the SCF in order to tame the SCF iterations. Figuratively speaking this approach makes use of known physics to prevent the SCF from looking in the wrong direction for the solution. If done right, meaning that the physics modelled by the preconditioner fits the system at hand, this allows the SCF iteration count to become independent of system size. This latter approach is clearly the more important route to cure the problem, but both approaches are orthogonal and are therefore typically combined in order to get the fastest convergence.

Now what does it mean for the preconditioner to fit the system? As we detail in the paper, the convergence of an SCF is intimately linked to the dielectric behaviour of the material one models with the SCF. For homogeneous cases (i.e. bulk insulators, metals and semiconductors) people have devised very good models for their dielectric behaviour and have used them to construct preconditioners. As is well known (and confirmed by our study) these models show exactly the desirable property of a size-independent iteration count. The caveat is only that metals, insulators and semiconductors have differing dielectric properties, meaning that each of these calls for a different preconditioning strategy. In turn this means that heterogeneous cases where multiple of these materials are combined are difficult to treat in practice because none of the bulk recipes fully fit.

The main aim of our work was therefore to design a preconditioner which automatically and locally adapts to the system at hand, meaning that for heterogeneous cases it treats metallic regions like metals, insulating regions like insulators and so on. As we demonstrate with a number of test cases our preconditioner is able to do this completely black-box and parameter-free and performs well also for large heterogeneous systems. This is in contrast to previous approaches to tackle this problem, which were not as general as our approach and sometimes required complex hand-tuning of the involved parameters.

While our preconditioner solves the problem of efficiently treating cases like metallic slabs, metal clusters and basically any combination of metallic parts, insulators and vacuum, it is not fully capable of distinguishing insulators and semiconductors. We show that this can be cured at the expense of introducing another parameter to our algorithm. This works, but is not completely satisfactory to us. Part of our ongoing work is therefore to extend our scheme to treat mixed systems involving semiconductors as well. Another aspect we have so far neglected is spin, which is a constant annoyance for converging SCFs. Having a solid dielectric model, as we propose it, also opens the way to adapting preconditioning to each spin component differently. We hope to use this in the future to tackle convergence issues with spin in a hopefully more rigorous way than this is done to date.

The full abstract of our paper reads

We propose a new preconditioner for computing the self-consistent problem in Kohn-Sham density functional theory, based on the local density of states. This preconditioner is inexpensive and able to cure the long-range charge sloshing known to hamper convergence in large, inhomogeneous systems such as clusters and surfaces. It is based on a parameter-free and physically motivated approximation to the independent-particle susceptibility operator, appropriate for both metals and insulators. It can be extended to semiconductors by using the macroscopic electronic dielectric constant as a parameter in the model. We test our preconditioner successfully on inhomogeneous systems containing metals, insulators, semiconductors and vacuum.

by Michael F. Herbst at 2020-09-03 22:30 under electronic structure theory, theoretical chemistry, DFTK, Julia, DFT, numerical analysis, Kohn-Sham

2020-08-09

sECuREs website

Adding a fiber link to my home network

Motivation

Despite using a FTTH internet connection since 2014, aside from the one fiber uplink, I had always used network gear with 1 Gbit/s links over regular old rj45 cat5(e) cables.


I liked the simplicity and uniformity of that setup, but decided it’s time to add at least one fiber connection, to get rid of a temporary ethernet cable that connected my kitchen with the rest of my network that is largely in the living room and office.

The temporary ethernet cable was an experiment to verify that running a server or two in my kitchen actually works (it does!). I used a flat ethernet cable, which is great for test setups like that, as you can often tape it onto the walls and still close the doors.

So, we will replace one ethernet cable with one fiber cable and converters at each end:

0.9mm thin fiber cables

Why is it good to switch from copper ethernet cables to fiber in this case? Fiber cables are smaller and hence easier to fit into existing cable ducts. While regular ethernet cable is way too thick to fit into any of the existing ducts in my flat, I was hoping that fiber might fit!

When I actually received the cables, I was surprised how much thinner fiber cables actually can be: there are 0.9mm cables, which are so thin, they can be hidden in plain sight! I had only ever seen 2mm fiber cables before, and the 0.9mm cables are incredibly light, flexible and thin! Even pasta is typically thicker:

Preparing a delicious pot of glass noodles ;)



The cable shown above comes from the fiber store FS.COM, which different people have praised on multiple occasions, so naturally I was curious to give them a shot myself.

Also, for the longest time, it was my understanding that fiber connectors can only be put onto fiber cables using expensive (≫2000 CHF) machines. A while ago I heard about field assembly connectors so I wanted to verify that those indeed work.


Aside from practical reasons, playing around with fiber networking also makes for a good hobby during a pandemic :)

Hardware Selection

I ordered all my fiber equipment at FS.COM: everything they have is very affordable, and products in stock at their German warehouse arrive in Switzerland (and presumably other European countries) within the same week.

If you are in the luxurious position to have enough physical space and agility to pull through an entire fiber cable, without having to remove any connectors, you can make a new network connection with just a few parts:

amt price total article note
2x 36 CHF 72 CHF #17237 1 Gbit/s media converter RJ45/SFP
1x 8.5 CHF 8.5 CHF #39135 1 Gbit/s BiDi SFP 1310nm-TX/1550nm-RX
1x 11 CHF 11 CHF #39138 1 Gbit/s BiDi SFP 1550nm-TX/1310nm-RX
1x 2.3 CHF 2.3 CHF #12285 fiber cable, 0.9mm LC UPC/LC UPC simplex

I recommend buying an extra fiber cable or two so that you can accidentally damage a cable and still have enough spares.

Total cost thus far: just under 100 CHF. If you have existing switches with a free SFP slot, you can use those instead of the media converters and save most of the cost.


If you need to temporarily remove one or both of the fiber cable connector(s), you also need field assembly connectors and a few tools in addition:

amt price total article note
2x 4 CHF 8 CHF #35165 LC/UPC 0.9mm pre-polished field assembly connector
1x 110 CHF 110 CHF #14341 High Precision Fibre Optic Cleaver FS-08C
1x 26 CHF 26 CHF #14346 Fibre Optic Kevlar Cutter
1x 14 CHF 14 CHF #72812 Fibre Optical Stripper

I recommend buying twice the number of field assembly connectors, for practicing.

Personally, I screwed up two connectors before figuring out how the process goes.

Total cost: about 160 CHF for the field assembly equipment, so 260 CHF in total.


To boost your confidence in the resulting fiber, the following items are nice to have, but you can get by without, if you’re on a budget.

price article note
18 CHF #35388 FVFL-204 Visual Fault Locator
9.40 CHF #82730 2.5mm to 1.25mm adapter for Visual Fault Locator
4.10 CHF #14010 1.25mm fiber clean swabs (100pcs)

With the visual fault locator, you can shine a light through your fiber. You can verify correct connector assembly by looking at how the light comes out of the connector.

The fiber cleaning swabs are good to have in general, but for the field assembly connector, you need to use alcohol-soaked wipes anyway (which FS.COM does not stock).

The total cost for everything is just under 300 CHF.

Hardware Selection Process

The large selection at FS.COM can be overwhelming to navigate at first. My selection process went something like this:

My first constraint is using bi-directional (BiDi) fiber optics modules so that I only need to lay a single fiber cable, as opposed to two fiber cables.

The second constraint is to use field assembly connectors.

If possible, I wanted to use bend-insensitive fiber so that I wouldn’t need to pay so much attention to the bend radius and have more flexibility in where and how I can lay fiber.

With these constraints, there aren’t too many products left to combine. An obvious and good choice is 0.9mm fiber cable with LC/UPC connectors.

FS.COM details

As of 2020-08-05, FS.COM states they have 5 warehouses in 4 locations:

  • Delaware (US)
  • Munich (Germany)
  • Melbourne (Australia)
  • Shenzhen (China)

They recently built another, bigger (7 km²) warehouse in Shenzhen, and now produce inventory for the whole year.

By 2019, FS.COM had over 300,000 registered corporate customers, reaching nearly 200 million USD yearly sales.

Delivery times

As mentioned before, delivery times are quick when the products are in stock at FS.COM’s German warehouse.

In my case, I put in my order on 2020-Jun-26.

The items that shipped from the German warehouse arrived on 2020-Jul-01.

Some items had to be manufactured and/or shipped from Asia. Those items arrived after 3 more weeks, on 2020-Jul-24.

Unfortunately, FS.COM doesn’t stock any 0.9mm fiber cables in their German warehouse right now, so be prepared for a few weeks of waiting time.

Laying The Fiber

Use a cable puller to pull the fiber through existing cable ducts where possible.

  • In general, buy the thinnest one you can find. I have this 4mm diameter cable puller, but a 3mm or even 2mm one would work in more situations.

  • I found it worthwhile to buy a brand-name one. It is distinctly nicer to handle (less stiff, i.e. more flexible) than the cheap one I got, and thinner, too, which is always good.

In my experience, it generally did not work well to push the fiber into an existing duct or alongside an existing cable. I really needed a cable puller.

If you’re lucky and have enough space in your duct(s), you can leave the existing connectors on the fiber. I have successfully just used a piece of tape to fix the fiber connector on the cable puller, pushing down the nose temporarily:

fiber cable taped to cable puller

Where there are no existing ducts, you may need to lay the fiber on top of the wall. Obviously, this is tricky as soon as you need to make a connection going through a wall: whereas copper ethernet cables can be bent and squeezed into door frames, you quickly risk breaking fiber cables.

Luckily, the fiber is very light, so it’s very easy to fix to the wall with a piece of tape:

fiber cables on the wall

You can see the upstream internet fiber in the top right corner, which is rather thick in comparison to my 0.9mm yellow fiber that’s barely visible in the middle of the picture.

Note how the fiber entirely disappears behind the existing duct atop the door!

Above, you can see the flat ethernet cable I have been using as a temporary experiment.


Where there is an existing cable that you can temporarily remove, it might be possible to remove it, put the fiber in, and put the old cable back in, too. This is possible because the 0.9mm fiber cable is so thin!

I’m using this technique to cross another wall where the existing cable duct is too full, but there is a cable that can be removed and put back after pulling the fiber through:

fiber cable next to existing cable

…and on the other side of the wall:

fiber cable next to existing socket

Note how the fiber is thin enough to fit between the socket and duct!


Note: despite measuring how long a fiber cable I would need, my cable turned out too short! While the cable was just as long as I had measured, with distances exceeding 10m, it is a good idea to add a few meters spare on each side of the connection.

Field assembly connectors

To give you an overview, these are the required steps at a high level:

  1. Cut the fiber with the Fibre Optic Kevlar Cutter
  2. Strip the fiber with the Fibre Optical Stripper
  3. Put the field assembly jacket onto the fiber
  4. Cut the stripped fiber cleanly with the High Precision Fibre Optic Cleaver FS-08C
  5. Put the field assembly connector onto the fiber

I thought the following resources were useful:

  1. Pictograms: PDF: FS.COM LC UPC field assembly connectors quick start guide
  2. Pictures: Installation Procedure on FS.COM
  3. Video: YouTube: Terminate Fiber in 5 Minutes: this video shows a different product, but I found it helpful to see any field assembly connector on video, and this is one of the better videos I could find.

Beware: the little paper booklet that comes with the field assembly connector contains measurements which are not to scale. I have suggested to FS.COM that they fix this, but until then, you’ll need to use e.g. a tape measure.


For establishing an intuition of their different sizes, here are the different connectors:

fiber cable connectors

From left to right:

  • 2.0mm fiber cable
  • cat6 ethernet cable
  • 0.9mm fiber cable (LC/UPC factory)
  • 0.9mm fiber cable (LC/UPC field assembly connector)

The 0.9mm fiber cables come with smaller connectors than the 2.0mm fiber cables, and that alone might be a reason to prefer them in some situations.

The field assembly connectors are pretty bulky in comparison, but since you can attach them yourself after pulling only the cable through the walls and/or ducts, you usually don’t care too much about their size.

Conclusion

Modern fiber cables available at FS.COM are:

  • thinner than I expected
  • more robust than I expected
  • cheaper than I expected
  • survive tighter bend radiuses than I expected

Replacing this particular connection with a fiber connection was a smooth process overall, and I would recommend it in other situations as well.


I would claim that it is totally feasible for anyone with an hour of patience to learn how to put a field assembly connector onto a fiber cable.

If labor cost is expensive in your country or you just like doing things yourself, I can definitely recommend this approach. In case you mess the connector up and don’t want to fix it yourself, you can always call an electrician!


Stay tuned for the next part, where I upgrade the 1G link to a 10G link!

at 2020-08-09 12:53

2020-07-31

michael-herbst.com

DFTK: A Julian approach for simulating electrons in solids

Since last Friday I have been attending JuliaCon, the annual conference for the Julia language. Naturally, given the current situation, the event did not take place "on location", but was instead converted into a virtual event. Despite the different feel compared to a real-life conference, the organisers did a very good job of maintaining the social component of the event. Talks were pre-recorded and speakers were available in a chat room to discuss in written form during and after the presentation. Birds-of-a-feather brainstorming sessions took place as audio discussions, and at the end of every day there was a Gather Town virtual social, where one could video-chat with fellow attendees by meeting up in a beautifully animated world in which each attendee was represented by a tiny avatar.

Apart from attending and listening to the great talks about the Julia language and its many applications, I also had the chance to actively participate by giving a lecture about our package DFTK.jl. While I have presented on DFTK a few times before in front of expert audiences of the field, it was really the first time I presented DFTK as a released package to the broader Julia audience. That meant that I could, for once, give up on my usual storyline where I try to convince people to use Julia and instead focus on providing insight into the fascinating challenges of electronic-structure theory and how DFTK and Julia are ideal tools to tackle these.

In my talk I start easy with a general introduction to electronic-structure theory, illustrating why an exact solution for electronic structures in molecules or solids is just not possible in realistic timeframes. Therefore one needs to live with approximate models, one example being density-functional theory (DFT), which we use in DFTK. As I detail in the talk, an almost immediate consequence of the complexity of the problem is that advances in electronic-structure theory can typically only be realised if multiple disciplines join forces. An interdisciplinary project, however, brings some practical problems, quite frankly because different fields have different approaches when tackling a problem. Being able to support such multidisciplinary efforts in a common software platform for DFT is one of the key aims of DFTK.

Related to this point we wanted DFTK to have a low entrance barrier for new researchers. As time and money in research are tight, programs should be easy to use and code simple and self-explanatory, so that new PhD students or researchers from other fields do not have a tough time getting started. In my talk I mention a few recent projects (an undergrad internship and a master project), where a noteworthy result could be achieved even though the students had little prior experience with either Julia or electronic-structure theory. A similar success story emphasising our ability to rapidly realise novel ideas in DFTK is our recently published Faraday paper, where it only took 10 weeks from starting the project to submitting the paper.

Lastly, I discussed challenges arising from the so-called high-throughput screening methods, which are recently gaining popularity in computational materials design. In this particular research direction algorithms need to be particularly robust and tunable to find a sweet spot between accuracy and computational cost. This demands extremely stable and reliable algorithms, which poses interesting mathematical problems in numerical analysis and e.g. with respect to designing estimators for discretisation error. Especially in this area of application-oriented mathematical research we expect DFTK to be a handy tool in the future.

If you are interested in the full story a recording of the talk is available on youtube.

Link Licence
DFTK: A Julian approach for simulating electrons in solids (Slides) Creative Commons License
Youtube recording of the talk

by Michael F. Herbst at 2020-07-31 20:00 under talk, electronic structure theory, Julia, HPC, DFTK, theoretical chemistry, SCF, high-throughput

2020-07-20

Mero’s Blog

Parametric context

tl;dr: Go's Context.Value is controversial because of a lack of type-safety. I design a solution for that based on the new generics design draft.

If you are following what's happening with Go, you are aware that recently an updated design draft for generics has dropped. What makes this particularly notable is that it comes with an actual prototype implementation of the draft, including a playground. This means for the first time, people get to actually try out how a Go with generics might feel, once they get in. It is a good opportunity to look at common Go code lacking type-safety and evaluate if and how generics can help address them.

One area I'd like to look at here is Context.Value. It is often criticized for not being explicit enough about the dependencies a function has and some people even go so far as to discourage its use altogether. On the other hand, I'm on record saying that it is too useful to ignore. Generics might be a way to bring together these viewpoints.

We want to be able to declare dependency on a functionality in context.Context via a function's signature and make it impossible to call it without providing that functionality, while also preserving the ability to pass it through APIs that don't know anything about it. As an example of such functionality, I will use logging. Let's start by creating a fictional little library to do that (the names are not ideal, but let's not worry about that):

package logctx

import (
    "context"
    "log"
)

type LogContext interface {
    // We embed a context.Context, to say that we are augmenting it with
    // additional functionality.
    context.Context

    // Logf logs the given values in the given format.
    Logf(format string, values ...interface{})
}

func WithLog(ctx context.Context, l *log.Logger) LogContext {
    return logContext{ctx, l}
}

// logContext is unexported, to ensure it can't be modified.
type logContext struct {
    context.Context
    l *log.Logger
}

func (ctx logContext) Logf(format string, values ...interface{}) {
    ctx.l.Printf(format, values...)
}

You might notice that we are not actually using Value() here. This is fundamental to the idea of getting compiler-checks - we need some compiler-known way to "tag" functionality and that can't be Value. However, we provide the same functionality, by essentially adding an optional interface to context.Context.

If we want to use this, we could write

func Foo(ctx logctx.LogContext, v int) {
    ctx.Logf("Foo(%v)", v)
}

func main() {
    ctx := logctx.WithLog(context.Background(), log.New(os.Stderr, "", log.LstdFlags))
    Foo(ctx, 42)
}

However, this has a huge problem: What if we want more than one functionality (each not knowing about the other)? We might try the same trick, say

package tracectx

import (
    "context"

    "github.com/opentracing/opentracing-go"
)

type TraceContext interface {
    context.Context
    Tracer() opentracing.Tracer
}

func WithTracer(ctx context.Context, t opentracing.Tracer) TraceContext {
    return traceContext{ctx, t}
}

type traceContext struct {
    context.Context
    t opentracing.Tracer
}

func (ctx traceContext) Tracer() opentracing.Tracer {
    return ctx.t
}

But because a context.Context is embedded, only those methods explicitly mentioned in that interface are added to traceContext. The Logf method is erased. After all, that is the trouble with optional interfaces.

This is where generics come in. We can change our wrapper-types and -functions like this:

type LogContext(type parent context.Context) struct {
    // the type-parameter is lower case, so the field is not exported.
    parent
    l *log.Logger
}

func WithLog(type Parent context.Context) (ctx Parent, l *log.Logger) LogContext(Parent) {
    return LogContext(Parent){ctx, l}
}

By adding a type-parameter and embedding it, we actually get all methods of the parent context on LogContext. We are no longer erasing them. After giving the tracectx package the same treatment, we can use them like this:

// FooContext encapsulates all the dependencies of Foo in a context.Context.
type FooContext interface {
    context.Context
    Logf(format string, values ...interface{})
    Tracer() opentracing.Tracer
}

func Foo(ctx FooContext, v int) {
    span := ctx.Tracer().StartSpan("Foo")
    defer span.Finish()

    ctx.Logf("Foo(%v)", v)
}

func main() {
    l := log.New(os.Stderr, "", log.LstdFlags)
    t := opentracing.GlobalTracer()
    // ctx has type TraceContext(LogContext(context.Context)),
    //    which embeds a LogContext(context.Context),
    //    which embeds a context.Context
    // So it has all the required methods
    ctx := tracectx.WithTracer(logctx.WithLog(context.Background(), l), t)
    Foo(ctx, 42)
}

Foo has now fully declared its dependencies on a logger and a tracer, without requiring any type-assertions or runtime-checks. The logging- and tracing-libraries don't know about each other and yet are able to wrap each other without loss of type-information. Constructing the context is not particularly ergonomic though. We require a long chained function call, because the values returned by the functions no longer have a unified type context.Context (so the ctx variable can't be re-used).

Another thing to note is that we exported LogContext as a struct, instead of an interface. This is necessary, because we can't embed type-parameters into interfaces, but we can embed them as struct-fields. So this is the only way we can express that the returned type has all the methods the parameter type has. The downside is that we are making this a concrete type, which isn't always what we want¹.

We have now succeeded in annotating context.Context with dependencies, but this alone is not super useful of course. We also need to be able to pass it through agnostic APIs (the fundamental problem Context.Value solves). However, this is easy enough to do.

First, let's change the context API to use the same form of generic wrappers. This isn't backwards compatible, of course, but this entire blog post is a thought experiment, so we are ignoring that. I don't provide the full code here, for brevity's sake, but the basic API would change into this:

package context

// CancelContext is the generic version of the currently unexported cancelCtx.
type CancelContext(type parent context.Context) struct {
    parent
    // other fields
}

func WithCancel(type Parent context.Context) (parent Parent) (ctx CancelContext(Parent), cancel CancelFunc) {
    // ...
}

This change is necessary to enable WithCancel to also preserve methods of the parent context. We can now use this in an API that passes through a parametric context. For example, say we want to have an errgroup package, that passes the context through to the argument to (*Group).Go, instead of returning it from WithContext:

// Derived from the current errgroup code.

// A Group is a collection of goroutines working on subtasks that are part of the same overall task.
//
// A zero Group is invalid (as opposed to the original errgroup).
type Group(type Context context.Context) struct {
    ctx    Context
    cancel func()

    wg sync.WaitGroup

    errOnce sync.Once
    err     error
}

func WithContext(type C context.Context) (ctx C) *Group(C) {
    ctx, cancel := context.WithCancel(ctx)
    return &Group(C){ctx: ctx, cancel: cancel}
}

func (g *Group(Context)) Wait() error {
    g.wg.Wait()
    return g.err
}

func (g *Group(Context)) Go(f func(Context) error) {
    g.wg.Add(1)

    go func() {
        defer g.wg.Done()

        if err := f(g.ctx); err != nil {
            g.errOnce.Do(func() {
                g.err = err
            })
        }
    }()
}

Note that the code here has barely changed. It can be used as

func Foo(ctx FooContext) error {
    span := ctx.Tracer().StartSpan("Foo")
    defer span.Finish()
    ctx.Logf("Foo was called")
    return nil
}

func main() {
    var ctx FooContext = newFooContext()
    eg := errgroup.WithContext(ctx)
    for i := 0; i < 20; i++ {
        eg.Go(Foo)
    }
    if err := eg.Wait(); err != nil {
        log.Fatal(err)
    }
}

After playing around with this for a couple of days, I feel pretty confident that these patterns make it possible to get a fully type-safe version of context.Context, while preserving the ability to have APIs that pass it through untouched or augmented.

A completely different question, of course, is whether all of this is a good idea. Personally, I am on the fence about it. It is definitely valuable, to have a type-safe version of context.Context. And I think it is impressive how small the impact of it is on the users of APIs written this way. The type-argument can almost always be inferred and writing code to make use of this is very natural - you just declare a suitable context-interface and take it as an argument. You can also freely pass it to functions taking a pure context.Context unimpeded.

On the other hand, I am not completely convinced the cost is worth it. As soon as you do non-trivial things with a context, it becomes a pretty "infectious" change. For example, I played around with a mock gRPC API to allow interceptors to take a parametric context and it requires almost all types and functions involved to take a type-parameter. And this doesn't even touch on the fact that gRPC itself might want to add annotations to the context, which adds even more types. I am not sure if the additional machinery is really worth the benefit of some type-safety - especially as it's not always super intuitive and easily understandable. And even more so, if it needs to be combined with other type-parameters, to achieve other goals.

I think this is an example of what I tend to dislike about generics and powerful type-systems in general. They tempt you to write a lot of extra machinery and types in a way that isn't necessarily semantically meaningful, but only used to encode some invariant in a way the compiler understands.


[1] One upside, however, is that this could actually address the other criticism of context.Value: its performance. If we consistently embed the parent context as a value in a struct field, the final context will be a flat struct. The interface table of all the extra methods we add will point at the concrete implementations. There's no longer any need for a linear search to find a context value.

I don't actually think there is much of a performance problem with context.Value in practice, but if there is, this could solve that.
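
To make the footnote concrete, here is a small illustrative sketch (with hypothetical Tracer and Span types that are not taken from this post) of what embedding the parent in a struct buys: the added capabilities live in ordinary fields behind ordinary methods, so retrieving them is a direct method call rather than the linear key search that context.Value performs.

package ctxsketch

import "context"

type Span interface{ Finish() }
type Tracer interface{ StartSpan(name string) Span }

// fooContext wraps its parent context in a single struct. The parent's
// methods (Done, Deadline, Err, Value) are promoted via embedding, and the
// added capabilities are plain fields exposed through plain methods.
type fooContext struct {
	context.Context
	tracer Tracer
	logf   func(format string, args ...interface{})
}

func (c fooContext) Tracer() Tracer { return c.tracer }

func (c fooContext) Logf(format string, args ...interface{}) { c.logf(format, args...) }

// With context.WithValue, tracer and logf would instead live in a chain of
// wrapper nodes, and every ctx.Value(key) lookup would walk that chain.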

at 2020-07-20 00:00

2020-07-09

sECuREs website

Introducing the kinT kinesis keyboard controller

Kinesis Advantage ergonomic keyboard

Back in 2013, I published a replacement controller for the Kinesis Advantage ergonomic keyboard. In the community, it is often referred to simply as the “stapelberg”, and became quite popular.

Many people like to use the feature-rich QMK firmware, which supports my replacement controller out of the box.

kinesis pcb mounted

On eBay, you can frequently find complete stapelberg kits or even already-modified Kinesis keyboards including the stapelberg board for sale.

In 2017, Kinesis released the Kinesis Advantage 2, which uses a different connector (an FPC connector) for connecting the two thumb pad PCBs to the controller PCB, instead of the soldered cable the older Kinesis Advantage used. Aside from the change in connector and cable type, the newer keyboard uses the same pinout as the old one.

I wanted to at least update my project to support the Kinesis Advantage 2. While doing so, I decided to also make a bunch of improvements to make the project more approachable and usable for beginners. Among many other improvements, the project switched from Eagle to KiCad, which is FOSS and means no more costly license fees!

kinT (T for Teensy!)

I am hereby announcing the kinT kinesis keyboard controller: a replacement keyboard controller for your Kinesis Advantage or Advantage 2 ergonomic keyboards.

kinT keyboard controller

The Teensy footprint looks a bit odd, but it’s a combined footprint so that you can use the same board with many different Teensy microcontrollers, giving you full flexibility regarding cost and features. See “Compatibility: which Teensy to use?” for more details.


I originally replaced the controller of my Kinesis Advantage to work around a bug, but these days I do most of it just because I enjoy tinkering with keyboards.

You might consider replacing your keyboard controller, for example…

Building your own kinT keyboard controller

  1. Follow “Buying the board and components (Bill of materials)”. When ordering from OSH Park (board) and Digi-Key (components), you’ll get the minimum quantity of 3 boards for 72 USD (24 USD per board), and one set of components for 49 USD.

    • If you have any special requirements regarding which Teensy microcontroller to use, this is the step where you would replace the Teensy 3.6 with your choice.
  2. Wait for the components to arrive. When ordering from big shops like Digi-Key or Mouser, this typically takes 2 days to many places in the world.

  3. Wait for the boards to arrive. This takes 6 days in the best case when ordering from OSH Park with their Super Swift Service option. In general, the longer you are willing to wait, the cheaper it is going to get.

  4. Follow the soldering guide. This will take about an hour.

  5. Install the firmware.

Improvements over the older replacement board

In case you’re familiar with the older replacement board and are wondering what changed, here is a complete list:

  • The kinT supports both the older Kinesis Advantage (KB500) and the newer Kinesis Advantage 2 (KB600) keyboards. They differ in how the thumb pads are connected. See the soldering instructions below.

  • The kinT is made for the newer Teensy 3.x and 4.x series, which will remain widely available for years to come, whereas the future of the Teensy++ 2.0 is not as certain.

  • The kinT is a smaller PCB (4.25 x 3.39 inches, or 108.0 x 86.1 mm), which makes it:

    • more compact: can be inserted/removed without having to unscrew a key well.

    • cheaper: 72 USD for 3 boards at oshpark, instead of 81 USD.

  • The kinT silkscreen (front, back) and schematic are much much clearer, making assembly a breeze.

  • The kinT is a good starting point for your own project:

    • kinT was designed in the open source KiCad program, meaning you do not need any license subscriptions.

    • The clear silkscreen and schematic make development and debugging easier.

  • On the kinT, the Teensy no longer has to be soldered onto the board upside down.

  • On the kinT, the FPC connectors have been moved for less strain on the cables.

  • The kinT makes possible lower-cost builds: if you don’t need the scroll lock, num lock and keypad LEDs, you can use a Teensy LC for merely 11 USD.

Conclusion

I’m very excited to release this new keyboard controller, and I can’t wait to see all the custom builds and modifications!

By the way, there is also a (4-hour!) stream recording in case you are interested in some more history and context, and want to see me solder a kinT controller live on stream!

at 2020-07-09 07:25

2020-07-01

RaumZeitLabor

Will hack for space

The year 2020 and its bad news just do not let up. To make matters worse, the lease for our premises has now been terminated effective 31 October 2020.

So that we can celebrate a housewarming party just in time for Halloween, we need your help finding a new home for the club and its machine park.

What we would like for a future RaumZeitLabor:

  • A combination of office and workshop/hall space (we again need a working area and room for our workshops)
  • At least 200 m², ideally a bit more, or an option to expand later
  • A kitchen, or the connections needed to install one
  • Sanitary facilities and heating
  • Reasonably good public transport connections, parking options nearby
  • Internet, fast

Bonus:

  • no broker's commission
  • few neighbours, or neighbours who are only around during the day
  • largely barrier-free access

If you have a lead on such a property, let us know and feel free to write to us at vorstand@raumzeitlabor.de.

For the coming four months the motto is “Ready, set, pack!”. Please drop by, take your personal belongings home, and help us pack everything up safely for the move.

Thank you!

by flederrattie at 2020-07-01 00:00

2020-06-28

Insanity Industries

Socket activating arbitrary services

Socket activation is the idea of activating a daemon or service not by manually starting it, but by merely pre-exposing the socket that is used for communication with that service. This has several advantages:

  • system boot speeds up as fewer things actually need to be started at boot time
  • system resource usage is reduced as fewer services actually run1
  • you can restart services behind the socket invisibly to the client (if the service gracefully takes over the connection on the socket again) without losing messages2

The two requirements for this are:

  • a daemon that manages these sockets and activates the corresponding process once communication happens
  • the socket-activated daemon being able to work with a preestablished socket-connection

For the first part, systemd has us covered. The second part must be implemented in the daemon code, but with a little help we can nevertheless activate any socket-based service, even if it does not implement socket activation itself. Let's take a look at how things work, in this order.

Systemd socket activation: The principle

This will be a brief rundown on how systemd’s socket activation works (see here for all the details):

To activate a service via its socket, we need to define two units: the service and the socket. A socket can be a network socket, a unix socket, or one of several other connection types. In this post, however, we will focus on network sockets as an example. We place the service file of our socket-activatable service (we will come back to what this means in a moment) as /etc/systemd/system/myservice.service:

[Unit]
Description=Socket-activatable example service

[Service]
ExecStart=/usr/bin/myservice
NonBlocking=True

and we place the corresponding socket unit (in this case a TCP-network-socket) as /etc/systemd/system/myservice.socket:

[Unit]
Description=Socket for example service

[Socket]
# In this example we only listen on localhost
ListenStream=127.0.0.1:1234
NoDelay=true

[Install]
WantedBy=sockets.target

Note that the service does not require an install section, only the socket does3. We then enable our socket via systemctl enable --now myservice.socket.

Now we have an inactive service unit, but an active socket which is set up by systemd. The moment any activity occurs on this socket, systemd will start myservice.service and hand over the socket, buffering what is already written to it until the service takes over.
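
On the daemon side, "being able to work with a pre-established socket" boils down to accepting the file descriptor that systemd passes in. As a rough illustration (not part of this post), here is a minimal Go sketch of the sd_listen_fds convention: systemd sets the LISTEN_PID and LISTEN_FDS environment variables and hands over inherited sockets starting at file descriptor 3.

package main

import (
	"fmt"
	"log"
	"net"
	"os"
	"strconv"
)

// listener returns the socket handed over by systemd, or binds one itself
// when the program is started without socket activation.
func listener() (net.Listener, error) {
	if os.Getenv("LISTEN_PID") == strconv.Itoa(os.Getpid()) &&
		os.Getenv("LISTEN_FDS") == "1" {
		// The first inherited socket is file descriptor 3.
		return net.FileListener(os.NewFile(3, "from-systemd"))
	}
	return net.Listen("tcp", "127.0.0.1:1234")
}

func main() {
	ln, err := listener()
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		fmt.Fprintln(conn, "hello from a socket-activated service")
		conn.Close()
	}
}

A daemon written this way can be started directly or through the socket unit above; in the latter case it simply reuses the already-bound socket.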

Leveraging socket activation in practice

In practice, we can distinguish three different cases:

1. Socket activation natively supported already

If the service in question already natively supports socket activation, simply activate the corresponding socket unit (or create one if it doesn’t exist yet). See systemctl list-units | grep socket for socket-units already available on the system.

2. Fault tolerance and boot time optimization desired only

If our service is intended to be started at boot and socket activation is intended only to provide transparent restarts and boot parallelization, we can simply use systemd’s systemd-socket-proxyd. See man systemd-socket-proxyd or here for details and usage examples.

Make sure you add an [Install] section to the service file and enable the service itself as well, as systemd-socket-proxyd merely decouples the service unit from its socket, but does not start it automatically.

3. On-demand starting and stopping of arbitrary services

If our service does not support socket activation natively and we want it to start not at boot time, but on-demand, we can use the tool socket-activate. For this, configure actual.service to listen on localhost on a different port such as 127.0.0.1:12345 and create two units: socket-activate-actual.service containing

[Unit]
Description=Socket-activate proxy for example service

[Service]
ExecStart=/usr/bin/socket-activate -u "actual.service" -a "127.0.0.1:12345"
NonBlocking=True

as well as a corresponding socket-activate-actual.socket as noted above, listening on the actual desired port. On the first connection to the socket, socket-activate-actual.service will be started and in turn start actual.service, proxying all traffic to it.

If socket-activate is invoked with an additional -t <timeout>, then both socket-activate-actual.service as well as actual.service are stopped again when no activity is detected for the specified timeout.
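
To illustrate what such a proxy boils down to, here is a rough, hypothetical Go sketch (not the actual socket-activate implementation) of the core loop: start the real unit on the first incoming connection, then shuttle bytes between the client and the service's localhost port. The unit name and ports are the ones from the example above.

package main

import (
	"io"
	"log"
	"net"
	"os/exec"
	"sync"
)

func main() {
	// In the real setup this listener would itself be handed over by systemd
	// via the socket unit; we bind directly only to keep the sketch short.
	ln, err := net.Listen("tcp", "127.0.0.1:1234")
	if err != nil {
		log.Fatal(err)
	}
	var once sync.Once
	for {
		client, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// Start the real service on the first incoming connection.
		once.Do(func() {
			if err := exec.Command("systemctl", "start", "actual.service").Run(); err != nil {
				log.Fatal(err)
			}
		})
		go func(client net.Conn) {
			defer client.Close()
			backend, err := net.Dial("tcp", "127.0.0.1:12345")
			if err != nil {
				log.Print(err)
				return
			}
			defer backend.Close()
			// Proxy traffic in both directions until either side closes.
			go io.Copy(backend, client)
			io.Copy(client, backend)
		}(client)
	}
}

The real tool additionally has to wait until the service is actually listening and to track idle time for the optional -t timeout; the sketch leaves both out.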


  1. Especially as socket-activated services can terminate once their job is done, as they simply get reactivated next time someone connects to their communication socket again. ↩︎

  2. This includes crashes or upgrades of the service, no further data written to the socket will be lost, only the data the service has already read from it before it crashed or was restarted. ↩︎

  3. The service may still have an [Install] section so that it can be started conventionally, or it may already be running even though no one has connected to its socket yet. In the latter case socket activation primarily serves as restart resilience and boot parallelization. ↩︎

by Jonas Große Sundrup at 2020-06-28 15:23

2020-06-26

michael-herbst.com

SCF preconditioning for mixed systems

Fortunately the general lockdown due to the Corona pandemic is slowly starting to ease around Paris as well. While basically all seminars are still virtual only, it is good to see old procedures and habits slowly return. From my end, I gave my first talk after the forced break today in the EMC2 group meeting.

Since it was the first time DFTK was discussed in the EMC2 synergy group, I decided to talk about it from the angle of tackling an actual research problem. After briefly presenting DFT methods and DFTK in the first half of the talk, I therefore focused on one of my ongoing projects with Antoine Levitt, namely constructing better preconditioners for the self-consistent field (SCF) iterations in mixed systems. Mixed systems are systems with locally differing dielectric properties, i.e. where some parts of the material are insulating while others may be metallic or semiconducting.

Since the dielectric properties are closely related to the spectrum of the SCF fixed-point map, they also control the convergence of SCF procedures. For metals and (to a lesser extent) semiconductors, simple SCF procedures, where one just applies the SCF cycle over and over, require extremely small step sizes (i.e. small damping values). As a result the SCF converges only very slowly. The remedy is to precondition the spectrum of the SCF map itself using so-called mixing techniques. State-of-the-art approaches are usually material-specific, i.e. different mixings are used for insulators, metals or semiconductors. This is fine for bulk materials, but fails for mixed systems, since one has to globally select a single approach. Our recent work has been to investigate the spectrum of the SCF map and to construct a preconditioner which adapts locally and is therefore able to properly treat mixed systems as well. The results I presented today are, however, not yet final, and more investigation is needed before our approach works as reliably as we want.

Link Licence
SCF preconditioning for mixed systems: A DFTK case study (Slides) Creative Commons License
A few DFTK examples (Jupyter notebook) GNU GPL v3
SCF preconditioners in 1D (Jupyter notebook) GNU GPL v3

by Michael F. Herbst at 2020-06-26 16:00 under talk, electronic structure theory, Julia, DFTK, theoretical chemistry, SCF

2020-06-06

sECuREs website

Using the iPhone camera as a Linux webcam with v4l2loopback

iPhone camera setup

For my programming stream at twitch.tv/stapelberg, I wanted to add an additional camera to show test devices, electronics projects, etc. I couldn’t find my old webcam, and new ones are hard to come by currently, so I figured I would try to include a phone camera somehow.

The setup that I ended up with is:

iPhone camera
→ Instant Webcam
→ WiFi
→ gstreamer
→ v4l2loopback
→ OBS

Disclaimer: I was only interested in a video stream! I don’t think this setup would be helpful for video conferencing, due to lacking audio/video synchronization.

iPhone Software: Instant Webcam app

I’m using the PhobosLab Instant Webcam (install from the Apple App Store) app on an old iPhone 8 that I bought used.

There are three interesting related blog posts by app author Dominic Szablewski:

  1. MPEG1 Video Decoder in JavaScript (2013-May)
  2. HTML5 Live Video Streaming via WebSockets (2013-Sep)
  3. Decode it like it’s 1999 (2017-Feb)

As hinted at in the blog posts, the way the app works is by streaming MPEG1 video from the iPhone (presumably via ffmpeg?) to the jsmpeg JavaScript library via WebSockets.

After some git archeology, I figured out that jsmpeg was rewritten in commit 7bf420fd just after v0.2. You can browse the old version on GitHub.

Notably, the Instant Webcam app seems to still use the older v0.2 version, which starts WebSocket streams with a custom 8-byte header that we need to strip.
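
If you prefer doing the header stripping in code rather than with dd, the same step can be sketched in a few lines of Go. This is an illustrative sketch only, using the third-party gorilla/websocket package (not mentioned in this post) and assuming, like the dd invocation in the Streaming section below, that the 8 extra bytes appear exactly once at the very start of the stream.

package main

import (
	"log"
	"os"

	"github.com/gorilla/websocket"
)

func main() {
	// Connect to the Instant Webcam stream (same URL as in the websocat
	// example in the Streaming section).
	conn, _, err := websocket.DefaultDialer.Dial("ws://iPhone.lan/ws", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	skipped := 0
	for {
		_, msg, err := conn.ReadMessage()
		if err != nil {
			log.Fatal(err)
		}
		// Drop the first 8 bytes of the overall stream (the custom header),
		// then pass the raw MPEG1 data through to stdout.
		if skipped < 8 {
			n := 8 - skipped
			if n > len(msg) {
				n = len(msg)
			}
			msg = msg[n:]
			skipped += n
		}
		if _, err := os.Stdout.Write(msg); err != nil {
			log.Fatal(err)
		}
	}
}

Piping its stdout into the gst-launch-1.0 command below would replace the websocat and dd stages.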

Linux Software

Install the v4l2loopback kernel module, e.g. community/v4l2loopback-dkms on Arch Linux or v4l2loopback-dkms on Debian. I used version 0.12.5-1 at the time of writing.

Then, install gstreamer and the required plugins; I used version 1.16.2 for all of them.

Lastly, install either websocat or wsta for accessing WebSockets. I successfully tested with websocat 1.5.0 and wsta 0.5.0.

Streaming

First, load the v4l2loopback kernel module:

% sudo modprobe v4l2loopback video_nr=10 card_label=v4l2-iphone

Then, we’re going to use gstreamer to decode the WebSocket MPEG1 stream (after stripping the custom 8-byte header) and send it into the /dev/video10 V4L2 device, to the v4l2loopback kernel module:

% websocat --binary ws://iPhone.lan/ws | \
  dd bs=8 skip=1 | \
  gst-launch-1.0 \
    fdsrc \
    ! queue \
    ! mpegvideoparse \
    ! avdec_mpeg2video \
    ! videoconvert \
    ! videorate \
    ! 'video/x-raw, format=YUY2, framerate=30/1' \
    ! v4l2sink device=/dev/video10 sync=false

Here are a couple of notes about individual parts of this pipeline:

  • You must set websocat (or the alternative wsta) into binary mode, otherwise they will garble the output stream with newline characters, resulting in a stream that seems to work at first glance but just displays garbage. Ask me how I know.

  • The queue element uncouples decoding from reading from the network socket, which should help in case the network has intermittent troubles.

  • Without enforcing framerate=30/1, you cannot cancel and restart the gstreamer pipeline: subsequent invocations will fail with streaming stopped, reason not-negotiated (-4)

  • Setting format YUY2 allows ffmpeg-based decoders to play the stream. Without this setting, e.g. ffplay will fail with [ffmpeg/demuxer] video4linux2,v4l2: Dequeued v4l2 buffer contains 462848 bytes, but 460800 were expected. Flags: 0x00000001.

  • The sync=false property on v4l2sink plays frames as quickly as possible without trying to do any synchronization.

Now, consumers such as OBS (Open Broadcaster Software), ffplay or mpv can capture from /dev/video10:

% ffplay /dev/video10
% mpv av://v4l2:/dev/video10 --profile=low-latency

Debugging

Hopefully the instructions above just work for you, but in case things go wrong, maybe the following notes are helpful.

To debug issues, I used the GST_DEBUG_DUMP_DOT_DIR environment variable as described on Debugging tools: Getting pipeline graphs. In these graphs, you can quickly see which pipeline elements negotiate which caps.

I also used the PL_MPEG example program to play the supplied MPEG test file. PL_MPEG is written by Dominic Szablewski as well, and you can read more about it in Dominic’s blog post MPEG1 Single file C library. I figured the codec and parameters might be similar between the different projects of the same author and used this to gain more confidence into the stream parameters.

I also used Wireshark to look at the stream traffic to discover that websocat and wsta garble the stream output by default unless the --binary flag is used.

at 2020-06-06 09:18

2020-06-05

michael-herbst.com

First release of the density-functional toolkit (DFTK)

After we released a preliminary snapshot of DFTK last year our focus in the first half of this year was on using it for some new science. Recently, however, we got back into polishing our code base and our documentation in order to get DFTK ready for a wider audience. Today I am proud to announce that DFTK 0.1.0 has been accepted into the Julia Package repository, such that the package can now be readily installed within the Julia ecosystem.

Let me take this opportunity to recapitulate on DFTK: I started the code with Antoine Levitt about a year ago when I moved to Paris. What we had in mind was to create a simple platform for methodological developments in density functional theory (DFT). Clearly our code should support the interdisciplinary requirements of the field, where advances are often the result from devising chemically and physically sound models, using mathematical insight for suggesting stable algorithms and then scaling them up to the high-performance regime. This means that we would need both (a) the flexibility to mix and match models and numerical approaches by keeping the code high-level and similar to a scripting language and (b) access to the usual tricks (vectorisation, GPUs, threading, distributed computing) to tweak performance down to the metal.

In Julia we found a language which suits these aims perfectly. This is illustrated by the fact that after only a good year of development we already support a sizable number of features in only about 5k lines of source code. Right now the focus of DFTK is on DFT ground-state simulations for solids (LDA/GGA in a plane-wave basis with GTH pseudopotentials) with more to come. Special care is taken to have a simple and clean codebase, well-commented and suitable for teaching or extensions (other models, basis, etc.). DFT is not hard-coded, and other similar models can be computed with DFTK (for instance, the 2D Gross-Pitaevskii equation with a magnetic field). Nevertheless, the performance is comparable with that of established plane-wave DFT codes, usually within a factor of 2. DFTK is fully multithreaded, although not distributed (yet). We also include interfaces with various codes (ASE, pymatgen, abipy...) for easy workflows and to integrate to the world beyond the Julia ecosystem. See for example the asedftk python package, which integrates DFTK into the atomistic simulation environment.

The code is of course fully open source and installation is easy. Since it is intended as a platform for multidisciplinary collaboration, we welcome any question, suggestion or addition. Feel free to get in touch by opening an issue at any time.

by Michael F. Herbst at 2020-06-05 08:00 under programming and scripting, DFTK, DFT, electronic structure theory, Julia

2020-06-03

michael-herbst.com

Recent developments in adcc

Since the publication of the adcc paper a few months back, there are a few updates to report briefly:

  • adcc can now be interactively tried in the browser at try.adc-connect.org using the infrastructure from the binder project.
  • Binary installation of adcc is now available via conda for Linux and MacOS, see the adcc documentation.
  • Calculation of rotatory strengths at all ADC levels.
  • adcc is now fully integrated into the Psi4 quantum chemistry package. This means that ADC calculations in adcc can now be directly started from within the Psi4 ecosystem, including Psi4's python frontend and input files. This effectively equips Psi4 with all ADC capabilities adcc offers. See some details in the recent Psi4 paper.
  • Tensor evaluations in adcc are now lazy, which means that complex tensor evaluation expressions can be coded up in python without being evaluated. Only once results are needed the complete expression is evaluated in a batch using the underlying linear algebra frameworks.

by Michael F. Herbst at 2020-06-03 17:00 under electronic structure theory, theoretical chemistry, adcc, algebraic-diagrammatic construction

2020-05-23

sECuREs website

stapelberg uses this: my 2020 desk setup

Desk setup

I generally enjoy reading the uses this blog, and recently people have been talking about desk setups in my bubble (and on my Twitch stream), so I figured I’d write a post about my current setup!

Desk setup

I’m using a desk I bought at IKEA well over 10 years ago. I’m not using a standing desk: while I have one at work, I never change its height. Just never could get into the habit.

I was using an IKEA chair as well for many years.

Currently, I’m using a Haworth Comforto 89 chair that I bought second-hand. Unfortunately, the arm rests are literally crumbling apart and the lumbar back support and back rest in general are not as comfortable as I would like.

Hence, I recently ordered a Vitra ID Mesh chair, which I have used for a couple of years at the office before moving office buildings. It will take a few weeks before the chair arrives.

Full Vitra ID Mesh chair configuration details
  • ID Mesh
  • Chair type: office swivel chair
  • Backrest: ID Mesh
  • Colours and materials
  • - Cover material: seat and backrest Silk Mesh
  • - Colour of back cover: dim grey/ like frame colour
  • - Colour of seat cover: dim grey
  • - Frame colour: soft grey
  • Armrests: 2D armrests
  • Base: five-star base, polished aluminium
  • Base on: castors hard, braked for carpet
  • Ergonomics
  • Seat and seat depth adjustment: seat with seat depth adjustment
  • Forward tilt: with forward tilt

The most important aspect of the desk/chair setup for me are the arm rests. I align them with the desk height so that I can place my arms at a 90 degree angle, eliminating strain.

Peripherals

Note: all of my peripherals are Plug & Play under Linux and generally work with standard drivers across Windows, macOS and Linux.

Monitor: Dell 8K4K monitor (UP3218K)

The most important peripheral of a computer is the monitor: you stare at it all the time. Even when you’re not using your keyboard or mouse, you’re still looking at your monitor.

Ever since I first used a MacBook Pro with Retina display back in 2013, I’ve been madly in love with hi-DPI displays, and have gradually replaced all displays in my day-to-day with hi-DPI displays.

My current monitor is the Dell UP3218K, an 8K4K monitor (blog post).

Dell introduced the UP3218K in January 2017. It is the world’s first available 8K monitor, meaning it has a resolution of 7680x4320 pixels at a refresh rate of 60 Hz. The display’s dimensions are 698.1mm by 392.7mm (80cm diagonal, or 31.5 inches), meaning the display shows 280 dpi.

I run it in 300% scaling mode (Xft.dpi: 288), resulting in incredibly crisp text.

Years ago, I used multiple monitors (sometimes 3, usually 2). I stopped doing that in 2011/2012, when I lived in Dublin for half a year and decided to get only one external monitor for practical and cost reasons.

I found that using only one monitor allows me to focus more on what I’m doing, and I don’t miss anything about a multi-monitor setup.

Keyboard: Kinesis advantage keyboard

Kinesis advantage keyboard

The Kinesis is my preferred commercially available ergonomic keyboard. I like its matrix layout, ergonomic key bowls, thumb pads and split hands.

I find typing on it much more comfortable than regular keyboards, and I value the Kinesis enough to usually carry one with me when I travel. When I need to use a laptop keyboard for longer periods of time, my hands and arms get tired.

I bought my first one in 2008 for ≈250 EUR, but have since cleaned up and repaired two more Kinesis keyboards that were about to be trashed. Now I have one for home, one for work, and one for traveling (or keyboard development).

Over the years, I have modified my Kinesis keyboards in various ways:

The first modification I did was to put in Cherry MX blue key switches (tactile and audible), replacing the default Cherry MX browns. I like the quick feedback of the blues better, possibly because I was used to them from my previous keyboards. Without tons of patience and good equipment, it’s virtually impossible to unsolder the key switches, so I reached out to Kinesis, and they agreed to send me unpopulated PCBs into which I could solder my preferred key switches! Thank you, Kinesis.

I later replaced the keyboard controller to address a stuck modifier bug. The PCB I made for this remains popular in the Kinesis modification community to this day.

In 2018, I got interested in keyboard input latency and developed kinX, a new version of my replacement keyboard controller. With this controller, the keyboard has an input latency of merely 0.225ms in the worst case.

Aside from the keyboard hardware itself, I’m using the NEO Ergonomically Optimized keyboard layout. It’s optimized for German, English, Programming and Math, in that order. Especially its upper layers are really useful: hover over “Ebene 3” to see.

I used to remap keys in hardware, but that doesn’t cover the upper layers, so nowadays I prefer just enabling the NEO layout included in operating systems.

Pointing device: Logitech MX Ergo

During my student years (2008 to 2013), I carried a ThinkPad X200 and used its TrackPoint (“red dot”) in combination with trying to use lots of keyboard shortcuts.

The concept of relative inputs for mouse movement made sense to me, so I switched from a mouse to a trackball on the desktop, specifically the Logitech Trackball M570.

I was using the M570 for many years, but have switched to the Logitech MX Ergo a few months ago. It is more comfortable to me, so I replaced all 3 trackballs (home, office, travel) with the MX Ergo.

In terms of precision, a trackball will not be as good as a mouse can be. To me, it more than makes up for that by reducing the strain on my hands and wrists.

For comparison: a few years ago, I was playing a shooter with a regular mouse for one evening (mostly due to nostalgia), and I could feel pain from that for weeks afterwards.

Microphone: RØDE Podcaster

To record screencasts for the i3 window manager with decent audio, I bought a RØDE Podcaster USB Broadcast Mic in 2012 and have been using it ever since.

The big plus is that the setup couldn’t be easier: you connect it via USB, and it is Plug & Play on Linux. This is much easier than getting a working setup with XLR audio gear.

The audio quality is good: much better than headsets or cheap mics, but probably not quite as good as a more expensive studio mic. For my usage, this is fine: I don’t record radio broadcasts regularly, so I don’t need the absolutely highest quality, and for video conferences or the occasional podcast, the RØDE Podcaster is superb.

Webcam: Logitech C920

In the past, I have upgraded my webcam every so often because higher resolutions at higher frame rates became available for a reasonably low price.

I’m currently using the Logitech HD Pro Webcam C920, and I’m pretty happy with it. The device is Plug & Play under Linux, and the picture quality is good out of the box. No fumbling with UVC parameters or drivers required :-)

Note: to capture at 30 fps at the highest resolution, you may need to specify the pixel format.

Headphones: Sony WH-1000XM3

At work, I have been using the Bose QuietComfort 15 Noise Cancelling headphones for many years, as they were considered the gold standard for noise cancelling headphones.

I decided to do some research and give bluetooth headphones a try, in the hope that the technology has matured enough.

I went with the Sony WH-1000XM3 bluetooth headphones, and am overall quite happy with them. The lack of a cable is very convenient indeed, and the audio quality and noise cancellation are both superb. A single charge lasts me for multiple days.

Switching devices is a bit cumbersome: when I have the headphones connected to my phone and want to switch to my computer, I need to explicitly disconnect on my phone, then explicitly connect on my computer. I guess this is just how bluetooth works.

One issue I ran into is that when the headphones re-connected to my computer, they would not select the high-quality audio profile until you explicitly disconnect and re-connect again. This was fixed in BlueZ 5.51, so make sure you run at least that version.

USB memory stick: Sandisk Extreme PRO SSD USB 3.1

USB memory sticks are useful for all sorts of tasks, but I mostly use them to boot Linux distributions on my laptop or computer, for development, recovery, updates, etc.

A year ago, I was annoyed by my USB memory sticks being slow, and I found the Sandisk Extreme PRO SSD USB 3.1 which is essentially a little SSD in USB memory stick form factor. It is spec'd at ≈400 MB/s read and write speed, and I do reach about ≈350 MB/s in practice, which is a welcome upgrade from the < 10 MB/s my previous sticks did.

A quick USB memory stick lowers the hurdle for testing distri images on real hardware.

Audio: teufel sound system

My computer is connected to a Teufel Motiv 2 stereo sound system I bought in 2009.

The audio quality is superb, and when I tried to replace them with the Q Acoustics 3020 Speakers (Pair) I ended up selling the Q Acoustics and going back to the Teufel. Maybe I’m just very used to its sound at this point :-)

Physical paper notebook for sketches

I also keep a paper notebook on my desk, but don’t use it a lot. It is good to have it for ordering my thoughts when the subject at hand is more visual rather than textual. For example, my analysis of the TurboPFor integer compression scheme started out on a bunch of notebook pages.

I don’t get much out of hand writing into a notebook (e.g. for task lists), so I tend to do that in Emacs Org mode files instead (1 per project). I’m only a very light Org mode user.

Laptop: TBD

I’m writing a separate article about my current laptop and will reference the post here once published.

I will say that I mostly use laptops for traveling (to conferences or events) these days, and there is not much travel happening right now due to COVID-19.

Having a separate computer is handy for some debugging activities, e.g. single-stepping X11 applications in a debugger, which needs to be done via SSH.

Internet router and WiFi: router7 and UniFi AP HD

Mostly for fun, I decided to write router7, a highly reliable, automatically updating internet router entirely in Go, primarily targeting the fiber7 internet service.

While the router could go underneath my desk, I currently keep it on top of my desk. Originally, I placed it in reach to lower the hurdle for debugging, but after the initial development phase, I never had to physically power cycle it.

These days, I only keep it on top of my desk because I like the physical reminder of what I accomplished :-)

For WiFi, I use a UniFi AP HD access point from Ubiquiti. My apartment is small enough that this single access point covers all corners with great WiFi. I’m configuring the access point with the mobile app so that I don’t need to run the controller app somewhere.

In general, I try to connect most devices via ethernet to remove WiFi-related issues from the picture entirely, and reduce load on the WiFi.

Switching peripherals between home and work computer

Like many, I am currently working from home due to COVID-19.

Because I only have space for one 32" monitor and peripherals on my desk, I decided to share them between my personal computer and my work computer.

To make this easy, I got an active Anker 10-port USB3 hub and two USB 3 cables for it: one connected to my personal computer, one to my work computer. Whenever I need to switch, I just re-plug the one cable.

Software setup

Linux

I have been using Linux as my primary operating system since 2005. The first Linux distribution that I installed in 2005 was Ubuntu-based. Later, I switched to Gentoo, then to Debian, which I used and contributed to until quitting the project in March 2019.

I had briefly tried Fedora before, and decided to give Arch Linux a shot now, so that’s what I’m running on my desktop computer right now. My servers remain on Flatcar Container Linux (the successor to CoreOS) or Debian, depending on their purpose.

For me, all Linux package managers are too slow, which is why I started distri: a Linux distribution to research fast package management. I’m testing distri on my laptop, and I’m using distri for a number of development tasks. I don’t want to run it on my desktop computer, though, because of its experimental nature.

Window Manager: i3

It won’t be a surprise that I am using the i3 tiling window manager, which I created in 2009 and still maintain.

My i3 configuration file is pretty close to the i3 default config, with only two major modifications: I use workspace_layout stacked and usually arrange two stacked containers next to each other on every workspace. Also, I configured a volume mode which allows for easily changing the default sink’s volume.

One way in which my usage might be a little unusual is that I always have at least 10 workspaces open.

Go

Over time, I have moved all new development work to Go, which is by far my favorite programming language. See the article for details, but in summary, Go’s values align well with my own: the tooling is quick and high-quality, the language well thought-out and operating at roughly my preferred level of abstraction vs. clarity.

Here is a quick description of a few notable Go projects I started:

Debian Code Search is a regular expression source code search engine covering all software available in Debian.

RobustIRC is an IRC network without netsplits, based on the Raft consensus algorithm.

gokrazy is a pure-Go userland for your Raspberry Pi 3 appliances. It allows you to overwrite an SD card with a Linux kernel, Raspberry Pi firmware and Go programs of your choosing with just one command.

router7 is a pure-Go small home internet router.

debiman generates a static manpage HTML repository out of a Debian archive and powers manpages.debian.org.

The distri research linux distribution project was started in 2019 to research whether a few architectural changes could enable drastically faster package management. While the package managers in common Linux distributions (e.g. apt, dnf, …) top out at data rates of only a few MB/s, distri effortlessly saturates 1 Gbit, 10 Gbit and even 40 Gbit connections, resulting in superior installation and update speeds.

Editor: Emacs

In my social circle, everyone used Vim, so that’s what I learnt. I used it for many years, but eventually gave Emacs a shot so that I could try the best notmuch frontend.

Emacs didn’t immediately click, and I haven’t used notmuch in many years, but it got me curious enough that I tried getting into the habit of using Emacs a few years ago, and now I prefer it over Vim and other editors.

Here is a non-exhaustive list of things I like about Emacs:

  1. Emacs is not a modal editor. You don’t need to switch into insert mode before you can modify the text. This might sound like a small thing, but I feel more of a direct connection to the text this way.

  2. I like Emacs’s built-in buffer management. I could never get used to using multiple tabs or otherwise arranging my Vim editor window, but with Emacs, juggling multiple things at the same time feels very natural.
    I make heavy use of Emacs’s compile mode (similar to Vim’s quick fix window): I will compile not only programs, but also config files (e.g. M-x compile i3 reload) or grep commands, allowing me to go through matches via M-g M-n.

  3. The Magit package is by far my favorite Git user interface. Staging individual lines or words comes very naturally, and many operations are much quicker to accomplish compared to using Git in a terminal.

  4. The eglot package is a good LSP client, making available tons of powerful cross-referencing and refactoring features.

  5. The possibilities for customization are impressive, including the development experience: Emacs’s built-in help system is really good, and allows jumping to the definition of variables or functions out of the box. Emacs is the only place in my day-to-day where I get a little glimpse into what it must have been like to use a Lisp machine.

Of course, not everything is great about Emacs. Here are a few annoyances:

  1. The Emacs default configuration is very old, and a number of settings need to be changed to make it more modern. I have been tweaking my Emacs config since 2012 and still feel like I’m barely scratching the surface. Many beginners find their way into Emacs by using a pre-configured version of it such as Doom Emacs or Spacemacs.

  2. Even after going to great lengths to keep startup fast, Emacs definitely starts much more slowly than e.g. Vim. This makes it not a great fit for trivial editing tasks, such as commenting out a line of configuration on a server via SSH.

For consistency, I eventually switched my shell and readline config from vi key bindings to the default Emacs key bindings. This turned out to be a great move: the Emacs key bindings are generally better tested and more closely resemble the behavior of the editor. With vi key bindings, sooner or later I always ran into frustrating feature gaps (e.g. zsh didn’t support the delete-until-next-x-character Vim command) or similar.

Hardware setup: desktop computer

I should probably publish a separate blog post with PC hardware recommendation, so let me focus on the most important points here only:

I’m using an Intel i9-9900K CPU. I briefly switched to an AMD Ryzen 3900X based on tech news sites declaring it faster. I eventually found out that the Intel i9-9900K actually benchmarks better in browser performance and incremental Go compilation, so I switched back.

To be able to drive the Dell 8K4K monitor, I’m using an nVidia GeForce RTX 2070. I don’t care about its 3D performance, but more video RAM and memory bandwidth make a noticeable difference in how many Chrome tabs I can work with.

To avoid running out of memory, I usually max out memory based on mainboard support and what is priced reasonably. Currently, I’m using 64 GB of Corsair RAM.

For storage, I currently use a Phison Force MP600 PCIe 4 NVMe disk, back from when I tried the Ryzen 3900X. When I’m not trying out PCIe 4, I usually go with the latest Samsung Consumer SSD PRO, e.g. the Samsung SSD 970 PRO. Having a lot of bandwidth and IOPS available is great in general, but especially valuable when e.g. re-generating all manpages or compiling a new distri version from scratch.

I’m a fan of Fractal Design’s Define case series (e.g. the Define R6) and have been using them for many years in many different builds. They are great to work with: no sharp edges, convenient screws and mechanisms, and they result in a quiet computer.

For fans, my choice is Noctua. Specifically, their NH-U14S makes for a great CPU fan, and their NF-A12x25 are great case fans. They cool well and are super quiet!

Network storage

For redundancy, I am backing up my computers to 2 separate network storage devices.

My devices are built from PC Hardware and run Flatcar Linux (previously CoreOS) for automated updates. I put in one hard disk per device for maximum redundancy: any hardware component can fail and I can just use the other device.

The software setup is intentionally kept very simple: I use rsync (with hardlinks) over SSH for backups, and serve files using Samba. That way, backups are just files, immediately available, and accessible from another computer if everything else fails.
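
To illustrate the hardlink approach (a generic sketch with made-up host and path names, not the actual backup commands used here): each run writes into a fresh dated directory, and --link-dest hardlinks every file that is unchanged relative to the previous snapshot, so full snapshots cost little additional space.

% rsync -a --link-dest=/srv/backup/latest \
    mycomputer:/home/ /srv/backup/2020-05-23/
% ln -sfn 2020-05-23 /srv/backup/latest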

Conclusion

I hope this was interesting! If you have any detail questions, feel free to reach out via email or twitter.

If you’re looking for more product recommendations (tech or otherwise), one of my favorite places is the wirecutter.

at 2020-05-23 13:22

2020-05-16

sECuREs website

a new distri linux (fast package management) release

I just released a new version of distri.

The focus of this release lies on:

  • a better developer experience, allowing users to debug any installed package without extra setup steps

  • performance improvements in all areas (starting programs, building distri packages, generating distri images)

  • better tooling for keeping track of upstream versions

See the release notes for more details.

The distri research linux distribution project was started in 2019 to research whether a few architectural changes could enable drastically faster package management.

While the package managers in common Linux distributions (e.g. apt, dnf, …) top out at data rates of only a few MB/s, distri effortlessly saturates 1 Gbit, 10 Gbit and even 40 Gbit connections, resulting in fast installation and update speeds.

at 2020-05-16 07:13