Record and vPlay: Debugging Container-App-Crashes with "Partial Checkpoints"

One Line Summary

Loosely based on Dinesh Subhraveti's PhD thesis, the vPlay system captures the minimal runtime state of a container such that, when restored, the application retraces its execution for a specified time interval. The key observation is that during the last moments before an application crash, where the root cause typically lies, the application accesses only a small subset of its address space, so only those pages need to be saved. This technique, dubbed partial checkpointing, is combined with logging to support debugging. Because all interactions of the application with the kernel are logged, the execution can be natively replayed even on BSD or Windows.

Abstract

Loosely based on Dinesh Subhraveti’s PhD thesis, the vPlay system captures the minimal runtime state of a container such that, when restored, the application retraces its execution for a specified time interval. The key observation is that during the last moments before a crash, where the root cause typically lies, the application accesses only a small subset of its address space, so only those pages need to be saved. This technique, dubbed partial checkpointing, is combined with logging to support debugging. Because all interactions of the application with the kernel are logged, the execution can be natively replayed even on BSD or Windows.

The details of the kernel and user space implementations of the mechanism along with integration with GDB can be found in Dinesh’s thesis.
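
As a rough illustration of the two ideas in the abstract, the following Python sketch models a partial checkpoint as "only the pages touched during the replay interval" and models logging as a list of recorded kernel results that are fed back during replay. This is a toy model under stated assumptions, not vPlay's implementation: the page size, the dictionary-based address space, and the names `partial_checkpoint` and `KernelLog` are all hypothetical and chosen only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def partial_checkpoint(memory_at_start, accessed_addresses):
    """Keep only the pages touched during the replay interval.

    memory_at_start: dict mapping page number -> page contents as they were
                     at the beginning of the interval (hypothetical model).
    accessed_addresses: addresses the application touched in the interval.
    Every accessed address is assumed to fall in a mapped page.
    """
    touched = {addr // PAGE_SIZE for addr in accessed_addresses}
    return {page: memory_at_start[page] for page in sorted(touched)}


class KernelLog:
    """Record results of kernel interactions, then feed them back on replay."""

    def __init__(self):
        self.entries = []  # (call name, result) pairs in recorded order
        self.cursor = 0    # index of the next entry to replay

    def record(self, name, result):
        self.entries.append((name, result))
        return result

    def replay(self, name):
        logged_name, result = self.entries[self.cursor]
        if logged_name != name:
            raise RuntimeError(
                f"replay diverged: expected {logged_name}, got {name}"
            )
        self.cursor += 1
        return result


# A 1024-page address space, of which only three pages are touched.
memory = {page: bytes([page % 256]) * PAGE_SIZE for page in range(1024)}
accesses = [0x1000, 0x1008, 0x2FF0, 0x5000, 0x1004]
checkpoint = partial_checkpoint(memory, accesses)
print(sorted(checkpoint))  # [1, 2, 5]

# Results recorded on the original run are returned verbatim on replay,
# so the replayed application never has to ask the real kernel.
log = KernelLog()
log.record("read", b"hello")
log.record("gettimeofday", 1234.5)
print(log.replay("read"))  # b'hello'
```

In this model the checkpoint holds 3 of 1024 pages, which is the kind of saving the abstract attributes to partial checkpointing. Replaying results from the log rather than re-issuing calls is also what would make replay independent of the host kernel.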

Presentation Materials

slides

Speaker

  • Subhraveti

    Biography

    Dinesh is the CTO and cofounder of AppOrbit. As part of his PhD work, published in OSDI’02, he developed the core principles that underlie the container abstraction and showed for the first time that enterprise applications could be live-migrated using that abstraction. He drove the development of the industry’s first container live-migration product at Meiosys, the company behind LXC that IBM acquired in 2005. Dinesh has authored numerous research papers on containers, checkpoint-restart, and record-replay. He holds a PhD in computer science from Columbia University.