Decoding Those Inscrutable RCU CPU Stall Warnings

This proposal has been rejected.


One Line Summary

This presentation will help you determine why stall warnings happen and what you can do about them.

Abstract

You are minding your own business when suddenly one of your systems splats out something like “INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies)”. Whatever does this RCU CPU stall warning mean, and what can you do about it? That is, other than simply beating your head against Documentation/RCU/stallwarn.txt?

This talk will look at a few representative RCU CPU stall warning messages and show how they can be decoded into real information that can help you find otherwise silent hangs the easy way. Or at least an easier way!
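As a first taste of that decoding, here is a minimal sketch in Python that pulls the fields out of the example message quoted above. It assumes the exact message layout shown in this abstract; other kernel versions print different layouts, and the function name and dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Matches only the message layout quoted in the abstract (an assumption);
# other kernel versions format their stall warnings differently.
STALL_RE = re.compile(
    r"INFO: (?P<flavor>\w+) detected stalls on CPUs/tasks: "
    r"\{(?P<stalled>[^}]*)\} "
    r"\(detected by (?P<detector>\d+), (?P<jiffies>\d+) jiffies\)"
)


def parse_stall_warning(line):
    """Return the fields of a stall warning as a dict, or None if no match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {
        "flavor": m.group("flavor"),           # RCU state named in the message
        "stalled": m.group("stalled").split(),  # kept as strings: CPUs or tasks
        "detected_by": int(m.group("detector")),
        "jiffies": int(m.group("jiffies")),
    }


example = ("INFO: rcu_bh_state detected stalls on CPUs/tasks: "
           "{ 3 5 } (detected by 2, 2502 jiffies)")
print(parse_stall_warning(example))
```

For the example message this reports that entries 3 and 5 were listed as stalled, that CPU 2 did the detecting, and that the message carried a count of 2502 jiffies. What those fields imply about the underlying hang is the subject of the talk itself.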

Tags

kernel, bugs, concurrency, hang detection

Speaker

  • Paul McKenney

    IBM Linux Technology Center

    Biography

    Paul E. McKenney has been coding for almost four decades, more than half of that on parallel hardware, where his work has earned him a reputation among some as a flaming heretic. Over the past decade, Paul has been an IBM Distinguished Engineer at the IBM Linux Technology Center. Paul maintains the RCU implementation within the Linux kernel, where the variety of workloads presents highly entertaining performance, scalability, real-time response, and energy-efficiency challenges. Prior to that, he worked on the DYNIX/ptx kernel at Sequent, and prior to that on packet-radio and Internet protocols (but long before it was polite to mention Internet at cocktail parties), system administration, business applications, and real-time systems. His hobbies include what passes for running at his age along with the usual house-wife-and-kids habit.