The Scheduler microconference focuses on the part of the kernel that decides which process gets to run, when, and for how long. With different topologies and workloads, giving the user the best possible experience is no easy task. Scheduling is one of the most discussed topics on the Linux Kernel Mailing List, but many of these topics need further discussion in a conference format. Indeed, the scheduler microconference has helped many topics make progress.
Title: Scheduler Microconference
The scheduler is an important component of the Linux kernel, deciding which process gets to run, when, and for how long. With different topologies and workloads, giving the user the best possible experience is no easy task. Scheduling is one of the most discussed topics on the Linux Kernel Mailing List, but many of these topics need further discussion...
The Linux scheduler shuffles tasks around on the various CPUs all the time, as mandated by the implementation of a combination of policies and heuristics. In general, this works well for many different workloads and lets us achieve a more than acceptable compromise among often conflicting goals, such as maximum throughput, minimum latency, reasonable energy consumption, etc. Furthermore, for...
On the CFS wakeup path, wake_wide() does not always behave well in interrupt-heavy workloads. We have systems configured with static IRQ bindings because IRQs are served faster on certain CPUs due to hardware topology. On these systems, we noticed that wakeups kept pulling tasks to the socket serving network IRQs while leaving the other socket nearly idle on a read-only workload...
AMD and ARM server architectures further complicate the issue of wake_wide() pulling tasks overeagerly (see the other abstract).
An LLC domain can span a whole socket on an Intel server but is significantly smaller on AMD ZEN due to its CCXs. For example, on ZEN 2, each CCX has only 4 cores. When binding network IRQs to such a CCX, we can consistently reproduce a scenario in which over 50 iperf...
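
To make the role of the LLC size concrete, the following is a rough Python model of the wake_wide() heuristic as found in mainline kernel/sched/fair.c, where the decision compares the wakee_flips counters of the waker and the wakee against the number of CPUs sharing the LLC. The kernel code is C and reads these values from task and per-CPU state; the function signature and the flip counts used here are hypothetical, chosen only for illustration.

```python
def wake_wide(waker_flips: int, wakee_flips: int, llc_size: int) -> bool:
    """Simplified model: True means the wakeup should spread beyond the waker's LLC.

    waker_flips / wakee_flips stand in for the per-task wakee_flips counters,
    llc_size for the number of CPUs sharing the last-level cache.
    """
    # Order the two counters so that master >= slave, as the kernel does.
    master = max(waker_flips, wakee_flips)
    slave = min(waker_flips, wakee_flips)
    # Stay "narrow" unless both flip counts are large relative to the LLC size.
    if slave < llc_size or master < slave * llc_size:
        return False
    return True


# Hypothetical flip counts, evaluated for a small and a large LLC domain.
for llc_size in (4, 32):
    print(f"llc_size={llc_size}: wake_wide={wake_wide(200, 10, llc_size)}")
```

With identical flip counts, the result changes with the LLC size alone, so the same workload can see different wakeup placement on topologies with small and large LLC domains.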
Several proposals have tried to change the wakeup path's policy for selecting an idle CPU in the scheduler:
- Consider new topology levels
- Speed up and optimize the selection of idle cores and/or CPUs
- Better estimate how much effort is worth spending to look for an idle CPU
- and others
This talk will summarize the ongoing proposals and discuss the best way...
CPU-intensive kthreads are generally not accounted for in the CPU controller, so they escape weight and bandwidth settings when they do work on behalf of a task group.
This is a problem in at least three places in the kernel. Padata multithreaded jobs ([link1], [link2], [link3]) may be started by a user task, so helper threads should be bound by the task's task group controls. Async...
Sugov implements a rather simplistic concept of boosting I/O-bound
tasks, through tracking I/O wakeups reported on each CPU and adjusting a
synthetic boost value to potentially influence upcoming frequency changes.
The actual boost value depends on a number of different conditions, like
timings of the task wake-ups and CPU frequency update requests or the
One of the most significant metrics for good user experience on a mobile
device is how quickly the system can react to load changes.
Util_est is used in mainline to create a more stable signal for per-task
demand, which is the maximum of the task util_est and PELT utilization
(known as the task utilization).
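
The combination described above can be written as a one-line model. This is a sketch, not the kernel implementation: the function name mirrors the kernel's task_util_est() helper, while the numeric values are hypothetical and assume the usual 0 to 1024 capacity scale.

```python
def task_util_est(pelt_util: int, util_est: int) -> int:
    """Per-task demand: the larger of PELT utilization and util_est."""
    return max(pelt_util, util_est)


# Hypothetical values: PELT has decayed during a sleep, util_est holds the demand.
demand_after_sleep = task_util_est(pelt_util=150, util_est=400)
# Hypothetical values: PELT has ramped above util_est and now drives the demand.
demand_ramping = task_util_est(pelt_util=600, util_est=400)
```
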
In case PELT utilization becomes higher than util_est, the behaviour of