Managing the impact of growing CPU register state on the user ABI

Refereed Presentation
Scheduled: Friday, September 15, 2017 from 2:50 – 3:35pm in Platinum D

One Line Summary

Extending the user/kernel ABI to cope with increasingly large and numerous CPU registers turns out to be non-trivial, yet CPU architectures are already evolving to require it. How do we minimise ABI breakage?

Abstract

There is an ongoing trend to increase the number and size of user-accessible CPU registers as architectures evolve. ARM SVE [1] and Intel AVX-512 [2] are recent examples.

Many parts of the user/kernel interface that describe CPU register state are straightforward to extend, but there are some pain points, in particular the signal frame and related APIs (sigaltstack, threading APIs, ucontext, etc.).

To size stacks correctly, userspace needs to know how much space the kernel will use to push the signal frame when delivering a signal. This becomes particularly important in many-threaded environments where allocating a giant slab of VA space for each thread stack may not be acceptable, or when using sigaltstack or the ucontext API.
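As an illustration of where the sizing question shows up in practice, here is a minimal sketch of installing an alternate signal stack. The helper name install_alt_stack is hypothetical and not from the talk; it only wraps the standard sigaltstack(2) call, and the caller has to pick the size.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* expose sigaltstack() under strict -std= modes */
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate `size` bytes and register them as the
 * alternate signal stack for the calling thread.
 * Returns the base of the new stack, or NULL on failure.
 */
static void *install_alt_stack(size_t size)
{
    stack_t ss;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = malloc(size);
    if (ss.ss_sp == NULL)
        return NULL;
    ss.ss_size = size;
    ss.ss_flags = 0;

    /* The kernel rejects stacks smaller than its minimum with ENOMEM. */
    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return NULL;
    }
    return ss.ss_sp;
}
```

A caller would typically write install_alt_stack(SIGSTKSZ). The kernel only enforces a lower bound on the size; it cannot tell the caller whether the chosen value will actually be enough once the signal frame grows.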

struct sigcontext (which defines the signal frame layout) and struct ucontext (which typically embeds sigcontext) are also part of the ABI and are not trivially extensible.

There is a traditional POSIX answer (or two) for the size of the signal frame: the MINSIGSTKSZ and SIGSTKSZ #defines [3]. Because they are #defines, these get baked into all userspace programs at compile time, and form part of the ABI.

Per-architecture guesses for MINSIGSTKSZ seemed generous initially, but can turn out to be unlucky. For example, SVE massively grows the SIMD vector registers but leaves the exact number of vector lanes as a hardware implementation choice, so saving the SIMD vector registers in the signal frame may require a lot more space in the maximum case than in the typical case. As it happens, the arm64 signal frame overflows well before the maximum SVE vector size is reached. Worse, the architecture also leaves open the possibility of growing the maximum vector size further in the future.
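To put rough numbers on this, the sketch below adds up the raw SVE register state for a given vector length. It assumes the SVE register file as architecturally defined (32 vector registers, 16 predicate registers and one first-fault register, with predicates one eighth the size of a vector, and vector lengths from 128 to 2048 bits in 128-bit steps). The constant 5120 is the value assumed here for arm64's MINSIGSTKSZ on Linux, and the function names are hypothetical. The result is only a lower bound on the space needed, since a real signal frame also holds the general-purpose registers and other records.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: MINSIGSTKSZ as defined for arm64 Linux. */
#define ARM64_MINSIGSTKSZ 5120

/* Raw bytes of SVE register state for a vector length of vl_bytes. */
static size_t sve_state_bytes(size_t vl_bytes)
{
    assert(vl_bytes % 16 == 0);         /* VL is a multiple of 128 bits */

    size_t z   = 32 * vl_bytes;         /* Z0-Z31 vector registers */
    size_t p   = 16 * (vl_bytes / 8);   /* P0-P15 predicate registers */
    size_t ffr = vl_bytes / 8;          /* first-fault register */

    return z + p + ffr;
}

/*
 * Smallest vector length (in bytes) at which the SVE registers alone
 * exceed ARM64_MINSIGSTKSZ, or 0 if that never happens up to 2048 bits.
 */
static size_t first_overflowing_vl(void)
{
    for (size_t vl = 16; vl <= 256; vl += 16)
        if (sve_state_bytes(vl) > ARM64_MINSIGSTKSZ)
            return vl;
    return 0;
}
```

Under these assumptions, sve_state_bytes(16) is 546 bytes for the smallest vector length and sve_state_bytes(256) is 8736 bytes for the largest, and the register state alone passes 5120 bytes at a vector length of 160 bytes (1280 bits).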

MINSIGSTKSZ could be increased, but this is still an ABI break, and it is unclear what it should be increased to in this case. This also doesn’t solve the underlying extensibility problem.

This presentation will outline the problem, and I will describe the current approach proposed for SVE and arm64 to mitigate the ABI impact of extending the signal frame. However, this approach may miss issues that need to be taken into account in order to be usable across multiple architectures.

Ideally, the solution should not be wholly arch-specific. If we can come up with a common approach, this also gives more force to adoption of a better interface in POSIX/libc land.


[1] ARM Scalable Vector Extension

https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture

[2] Intel AVX-512

https://software.intel.com/en-us/blogs/2013/avx-512-instructions

[3] sigaltstack(2)

Tags

kernel, signal, ABI, vector, simd

Presentation Materials

slides

Speaker

  • Biography

    Dave Martin is a Staff Engineer at ARM, where he has worked on low-level hacking in various FOSS projects since 2009. This has ranged from contributing to Ubuntu’s migration to modern ARM platforms, through working via Linaro since its inception on various ARM userspace plumbing, to the kernel, where he now contributes as part of ARM’s kernel team.

    For the last couple of years he has been enabling new corners of the arm64 architecture including adding upstream support for ARM’s next generation of vector extensions (which turn out to be far from boring in their ABI implications).