http://summit.ubuntu.com/lpc-2012/ Nautilus 5

Wednesday, 09:50 - 10:35 PDT
Room Unavailable
Nautilus 5
Wednesday, 10:45 - 11:30 PDT
CoDel and Queue Limits ( Networking )
Networking topics:
1. CoDel and FQ CoDel
2. Byte Queue Limits revisited

=== CoDel and FQ CoDel ===

Kathleen Nichols and Van Jacobson have done major work to help address the bufferbloat problem, which has gained attention in recent years following Jim Gettys's and Dave Taht's publications. The result of this work is presented in the following article [1].

[1] http://queue.acm.org/detail.cfm?id=2209336

The idea behind CoDel is that buffers or large queues are not bad per se (as the term "bufferbloat" might imply); it is the way queues are handled that is critical. AQMs are quite complex to deploy because RED (the best-known AQM) is hard to tune and has some flaws. CoDel's intent is to use the delay packets experience in the queue as the main input, rather than the queue length (in bytes or packets). It is designed to be a "no knobs" AQM for routers, and its implementation uses a fairly simple algorithm intended for silicon integration.

Since Kathleen's and Van's ideas were very close to the various ideas I had last year to improve Linux AQM, I co-implemented CoDel for Linux (with Dave Taht). I also implemented fq_codel, an SFQRED replacement that combines fair queueing with a CoDel-managed queue per flow. I will present various experimental results.

Topic Lead: Eric Dumazet <email address hidden>
Eric is currently working for Google. He is a Linux networking developer who has recently worked on packet schedulers: SFQ improvements and the CoDel / fq_codel implementations.

=== Byte Queue Limits revisited ===

Byte queue limits (BQL) is an algorithm that manages the size of network-card queues in bytes rather than packets. The algorithm estimates how many bytes the card is able to transmit, sizes the queue accordingly, and adapts to changing load. Properly sized (shorter) queues push queuing to upper layers of the stack, the queuing disciplines (qdisc), which reduces the time between when a packet is scheduled for transmission and when it hits the wire. This reduces latency and allows better scheduling decisions in software.

BQL has been in the kernel for about a year and seems to work well, judging by the lack of major complaints. In this talk we will present experimental data on how BQL actually behaves and what effects BQL buffer management has on the rest of the stack. We will show that BQL does not need any knobs to select good queue sizes. We will also discuss and explain some limitations of the algorithm and some corner cases of its deployment, due to its dependency on outside events that pace its execution.

Topic Lead: Tomas Hruby
Tomas is a PhD candidate at the Free University in Amsterdam, in the MINIX group of Prof. Andy Tanenbaum, exploring how to take advantage of multicore processors for designing reliable systems. He has worked on intrusion detection, filesystems and the L4 microkernel. He is currently an intern in the Linux networking team at Google.
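The control law at the heart of CoDel is compact enough to sketch. The following is a simplified, hypothetical userspace rendering of the algorithm described in [1] (the actual kernel code lives in net/sched/sch_codel.c and include/net/codel.h and differs in detail): it tracks each packet's sojourn time in the queue and, once the delay has stayed above a small target for a full interval, starts dropping packets at intervals that shrink with the inverse square root of the drop count.

/* Simplified sketch of the CoDel control law from [1]; illustration only. */
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

#define TARGET_US    5000ULL      /* 5 ms acceptable standing delay   */
#define INTERVAL_US  100000ULL    /* 100 ms sliding window            */

struct codel_state {
    bool     dropping;       /* are we in the dropping state?         */
    uint32_t count;          /* drops since entering dropping state   */
    uint64_t first_above;    /* when sojourn first exceeded TARGET_US */
    uint64_t drop_next;      /* time of the next scheduled drop       */
};

/* Next drop time shrinks as interval / sqrt(count). */
static uint64_t control_law(uint64_t t, uint32_t count)
{
    return t + (uint64_t)(INTERVAL_US / sqrt((double)count));
}

/* Called on every dequeue with the packet's sojourn time (now minus its
 * enqueue timestamp).  Returns true if this packet should be dropped. */
static bool codel_should_drop(struct codel_state *s, uint64_t now,
                              uint64_t sojourn_us)
{
    if (sojourn_us < TARGET_US) {
        /* Below target: leave the dropping state entirely. */
        s->first_above = 0;
        s->dropping = false;
        return false;
    }

    if (!s->dropping) {
        if (s->first_above == 0) {
            /* Start the clock: delay must stay high for a full interval. */
            s->first_above = now + INTERVAL_US;
        } else if (now >= s->first_above) {
            /* Persistently above target: start dropping. */
            s->dropping = true;
            s->count = 1;
            s->drop_next = control_law(now, s->count);
            return true;
        }
        return false;
    }

    if (now >= s->drop_next) {
        /* Still bad: drop again, and drop more often next time. */
        s->count++;
        s->drop_next = control_law(s->drop_next, s->count);
        return true;
    }
    return false;
}

The real implementation adds refinements (for example, the drop count is not reset to 1 when re-entering the dropping state shortly after leaving it), but the sketch captures the "no knobs" idea: only a target delay and an interval, no queue-length thresholds.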

Participants:
attending therbert (Tom Herbert)

Tracks:
  • Networking
Nautilus 5
Wednesday, 11:40 - 12:25 PDT
Data Direct I/O and Ethernet AVB ( Networking )
Networking topics:
1) Data Direct I/O
2) Ethernet Audio/Video Bridging

=== Data Direct I/O Significantly Boosts Networking Performance and Reduces Power ===

This presentation calls out the new Data Direct I/O (DDIO) platform technology, which enables I/O data transfers that require far fewer trips to memory (nearly zero in the most optimal scenarios). In doing so, DDIO significantly boosts performance (higher throughput, lower CPU usage, and lower latency) and lowers power consumption. The Intel Xeon processor architecture was updated to remove the inefficiencies of the classic model by enabling direct communication between Ethernet controllers and adapters and the host processor cache. By avoiding the frequent reads from and writes to main memory present in the classic model, DDIO reduces latency, increases system I/O bandwidth and scalability, and reduces power consumption. Intel DDIO is enabled by default on all Intel Xeon processor E5 based servers and workstation platforms.

This presentation will explain the technology in detail, as well as how it is currently used. Performance numbers from our Ethernet controllers will be included that clearly show the benefits of the technology. All performance gains will be examined and explained, including the reduction in power consumption alongside the increases in bandwidth and reductions in latency.

=== Ethernet Audio/Video Bridging (AVB) - a Proof-of-Concept ===

Using our latest gigabit Ethernet controller, we designed and implemented a proof-of-concept Audio/Video Bridging device using the IEEE 802.1Qav standard. The project was implemented using a modified Linux igb driver with a userspace component that passes the AVB frames to the controller while also maintaining a normal network connection. This presentation will go through the details of the project, explain the challenges, and end with a demo of the working implementation.

AVB is now being used to pass audio and video to many different types of A/V devices over Ethernet cables instead of having to run large, heavy analog A/V cables to the devices. Not only is the analog cabling gone, but performance is far superior, and all audio and video can easily be controlled from a single workstation.

Topic Lead: John Ronciak
John is a SW architect working for Intel in the LAN Access Division (LAD). John has 30 years of experience writing device drivers for various operating systems and is currently one of the leads in the Open Source driver group responsible for six Linux kernel drivers.
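For context, IEEE 802.1Qav is built around a per-traffic-class credit-based shaper. The sketch below is a simplified, hypothetical model of that shaper (the names and parameters are illustrative and are not taken from the igb driver or the PoC described above): credit accrues at idleSlope while frames are waiting, is consumed at sendSlope while a frame is on the wire, and a class may start transmitting only while its credit is non-negative.

/* Hypothetical, simplified model of an 802.1Qav credit-based shaper.
 * Illustration only; the real PoC programs the shaper into NIC hardware. */
#include <stdbool.h>
#include <stdint.h>

struct cbs_queue {
    int64_t credit;        /* current credit, in bit-times              */
    int64_t idle_slope;    /* credit gained per ns while waiting        */
    int64_t send_slope;    /* credit lost per ns while sending (< 0)    */
    int64_t hi_credit;     /* upper clamp */
    int64_t lo_credit;     /* lower clamp */
};

/* Accrue credit while frames are queued but the class is not sending. */
static void cbs_wait(struct cbs_queue *q, uint64_t ns)
{
    q->credit += q->idle_slope * (int64_t)ns;
    if (q->credit > q->hi_credit)
        q->credit = q->hi_credit;
}

/* Consume credit while a frame from this class is on the wire. */
static void cbs_transmit(struct cbs_queue *q, uint64_t ns)
{
    q->credit += q->send_slope * (int64_t)ns;   /* send_slope is negative */
    if (q->credit < q->lo_credit)
        q->credit = q->lo_credit;
}

/* A class may start a new frame only when its credit is non-negative. */
static bool cbs_may_send(const struct cbs_queue *q)
{
    return q->credit >= 0;
}

The effect is that reserved A/V streams get a bounded share of the link with bounded burstiness, which is what makes the latency guarantees of AVB possible.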

Participants:
attending therbert (Tom Herbert)

Tracks:
  • Networking
Nautilus 5
Wednesday, 14:00 - 14:45 PDT
Concurrency Kit: Towards accessible non-blocking technology for C ( Scaling )
Despite more than 20 years of active research and development, non-blocking technologies remain inaccessible to many students, engineers and open-source projects. This is especially true in the context of an unmanaged language such as C, despite its popularity in highly complex concurrent systems. Even in light of attractive performance properties, small to medium-sized corporations are extremely hesitant to adopt patent-free technology because of the technology lock-in associated with the interfaces of existing concurrency libraries. To top it off, when introducing engineers to this area, many are overwhelmed by the literature and the sparsity of performance data.

This topic will walk the audience through the story of the struggles Samy and his peers have faced over the last couple of years in developing sufficient working knowledge to (efficiently) leverage existing non-blocking data structures, as well as to design, implement and verify new algorithms for use by mission-critical systems. It will highlight the gaps in existing open-source projects tackling the concurrency problem for the C programming language, and in the literature associated with much of the existing technology. The culmination of these frustrations led to the development of Concurrency Kit, a library designed to aid in the design and implementation of high-performance concurrent systems. It is designed to minimize dependencies on operating-system-specific interfaces, and most of the interface relies only on a strict subset of the standard library and the more popular compiler extensions.

Topic Lead: Samy Bahra <email address hidden>
Samy is an engineer focused on developing a leading real-time, low-latency online advertising platform. Before moving to New York, Samy played a crucial role on the engineering team behind a leading high-performance messaging platform. Prior to that, he was an active member of a high-performance computing laboratory, primarily involved with Unified Parallel C and the performance modeling and analysis of shared-memory multiprocessor systems. He has been involved with several open-source projects.
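To give a flavour of what "non-blocking data structures in C" means in practice, here is a minimal lock-free (Treiber) stack push using C11 atomics and a compare-and-swap retry loop. This is a generic illustration, not Concurrency Kit's own API; note that a matching lock-free pop also needs safe memory reclamation (hazard pointers or epochs), which is exactly the kind of machinery such a library has to provide.

/* Generic lock-free (Treiber) stack push using C11 atomics.
 * Illustration only -- not Concurrency Kit's API. */
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
    void *value;
};

struct stack {
    _Atomic(struct node *) head;
};

/* Push is lock-free: some thread always makes progress, even if this
 * particular CAS loop has to retry under contention. */
static void stack_push(struct stack *s, struct node *n)
{
    struct node *old = atomic_load_explicit(&s->head, memory_order_relaxed);

    do {
        n->next = old;   /* 'old' is refreshed by a failed CAS below */
    } while (!atomic_compare_exchange_weak_explicit(&s->head, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

Even this tiny example hints at the accessibility problem the talk describes: correctness depends on subtle memory-ordering and reclamation decisions that are easy to get wrong without good library support.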

Participants:
attending mathieu-desnoyers (Mathieu Desnoyers)
attending paulmck (Paul McKenney)

Tracks:
  • Scaling
Nautilus 5
Wednesday, 14:50 - 15:35 PDT
big.LITTLE && Tegra30 ( Scheduler )
=== Task placement for asymmetric cores ===

Traditional SMP scheduling basically aims for equal load distribution across all CPUs. To take full advantage of the big.LITTLE MP power/performance heterogeneity, task affinity is crucial. I have experimented with modifications to the Linux scheduler which attempt to minimize power consumption by selecting task affinity appropriately for each task.

Topic Lead: Morten Rasmussen

=== Scheduling and the big.LITTLE Architecture ===

ARM's big.LITTLE architecture is an example of asymmetric multiprocessing where all CPUs are instruction-set compatible, but where different CPUs have very different performance and energy-efficiency characteristics. In the case of big.LITTLE, the big CPUs are Cortex-A15 CPUs with deep pipelines and numerous functional units, providing maximal performance. In contrast, the LITTLE CPUs are Cortex-A7 CPUs with short pipelines and few functional units, optimized for energy efficiency. Linaro is working on two methods of supporting big.LITTLE systems. One way to configure big.LITTLE systems is in an MP configuration, where both the big and LITTLE CPUs are present. Traditionally, most SMP operating systems have assumed that all CPUs are identical, but this is emphatically not the case for big.LITTLE; therefore, changes for big.LITTLE are required. This talk will give an overview of the progress towards the goal of big.LITTLE support in the Linux plumbing.

Topic Lead: Paul E. McKenney

=== cpuquiet: Dynamic CPU core management ===

NVIDIA Tegra30 has CPU clusters with different capabilities: a fast cluster with four Cortex-A9 cores and a low-power cluster with a single Cortex-A9 core. Only one cluster can be active at a time, which means the number of cores available to the kernel changes at runtime. Currently we use CPU hotplug to make cores unavailable so we can initiate a switch to the low-power cluster. This has a number of problems, such as long latencies when switching between the clusters. Therefore, a new mechanism in which CPUs are unavailable yet not completely removed from the system would be useful. CPUs in this quiet state would not run any userspace or kernelspace code until they are explicitly made available again, but the kernel data structures associated with each CPU would be preserved, so transitions can be low-latency operations. The policy can be encapsulated in a governor, like the cpufreq and cpuidle governors we already have.

Topic Lead: Peter De Schrijver
Peter is an NVIDIA Tegra Linux kernel engineer and a Debian developer who previously worked on power management in Maemo for Nokia.
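For reference, the heavyweight mechanism that cpuquiet aims to replace is ordinary CPU hotplug through sysfs. A minimal userspace sketch of taking a core offline and back online (the sysfs path is the standard kernel interface; error handling is kept minimal):

/* Offline and re-online a CPU through the standard sysfs hotplug
 * interface -- the high-latency mechanism cpuquiet aims to avoid. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int set_cpu_online(int cpu, int online)
{
    char path[64];
    int fd;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/online", cpu);
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    if (write(fd, online ? "1" : "0", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

int main(void)
{
    /* Take CPU 3 out of service, then bring it back. */
    if (set_cpu_online(3, 0) < 0)
        perror("offline cpu3");
    if (set_cpu_online(3, 1) < 0)
        perror("online cpu3");
    return 0;
}

Every such transition tears down and rebuilds per-CPU kernel state, which is why a lighter "quiet" state with preserved data structures is attractive for fast cluster switches.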

Participants:
attending apm (Antti P Miettinen)
attending paulmck (Paul McKenney)
attending srwarren (Stephen Warren)
attending vincent-guittot (Vincent Guittot)

Tracks:
  • Scheduler
Nautilus 5
Wednesday, 15:45 - 16:30 PDT
big.LITTLE && Tegra30 ( Scheduler )
Second of two consecutive slots for this session; see the full description under the 14:50 slot above.

Participants:
attending apm (Antti P Miettinen)
attending paulmck (Paul McKenney)
attending srwarren (Stephen Warren)
attending vincent-guittot (Vincent Guittot)

Tracks:
  • Scheduler
Nautilus 5
Thursday, 08:30 - 09:25 PDT
Breakfast
Nautilus 5
Thursday, 09:30 - 10:15 PDT
UEFI Basics Tutorial ( Core OS )
Covers:
- UEFI basics: what appears in UEFI 2.3.1-compliant systems running UEFI operating systems
- Requirements on both the OS and the firmware for UEFI 2.3.1 secure boot
- PXE network boot services (IPv4 and IPv6)
Abstract will be available soon.
Instructor: Harry Hsiung (Intel)
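As a primer for the tutorial, the smallest possible UEFI application looks roughly like the following. This sketch assumes the gnu-efi headers and build environment; it is illustrative only and not part of the tutorial material.

/* Minimal UEFI application, assuming the gnu-efi toolchain. */
#include <efi.h>
#include <efilib.h>

EFI_STATUS
efi_main(EFI_HANDLE image, EFI_SYSTEM_TABLE *systab)
{
    InitializeLib(image, systab);        /* set up gnu-efi helpers      */
    Print(L"Hello from UEFI\n");         /* print via the console out   */
    return EFI_SUCCESS;
}

The firmware hands every UEFI application an image handle and a pointer to the system table; all boot services (including the PXE and secure boot machinery covered in the tutorial) are reached through that table.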

Participants:
attending srwarren (Stephen Warren)

Tracks:
  • Core OS
Nautilus 5
Thursday, 10:25 - 11:10 PDT
UEFI Basics Tutorial ( Core OS )
Continuation of the UEFI Basics Tutorial; see the description under the 09:30 slot above.

Participants:
attending srwarren (Stephen Warren)

Tracks:
  • Core OS
Nautilus 5
Thursday, 11:20 - 12:05 PDT
UEFI Basics Tutorial ( Core OS )
Continuation of the UEFI Basics Tutorial; see the description under the 09:30 slot above.

Participants:
attending srwarren (Stephen Warren)

Tracks:
  • Core OS
Nautilus 5
Thursday, 13:30 - 14:15 PDT
LLVM and Clang: Advancing Compiler Technology ( LLVM )
This talk introduces LLVM, giving a brief sense of its library-based design. It then dives into Clang to describe the end-user benefits of LLVM compiler technology and the status of the Clang Static Analyzer, LLDB, libc++ and the LLVM MC projects.

Topic Lead: Chris Lattner
Chris Lattner is best known as the primary author of the LLVM project and related projects such as the Clang compiler. He currently works at Apple Inc. as the Director of Low Level Tools and chief architect of the Compiler Group.
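To make the "library-based design" point concrete, Clang's functionality is also exposed to ordinary C programs through libclang. The following minimal example (a sketch using libclang's public C API; the build line is illustrative) parses a source file and prints the spelling of each top-level cursor in its AST:

/* Parse a file with libclang and print the names of top-level cursors.
 * Build (illustrative): cc tour.c -lclang */
#include <clang-c/Index.h>
#include <stdio.h>

static enum CXChildVisitResult print_name(CXCursor c, CXCursor parent,
                                          CXClientData data)
{
    CXString name = clang_getCursorSpelling(c);

    (void)parent;
    (void)data;
    printf("%s\n", clang_getCString(name));
    clang_disposeString(name);
    return CXChildVisit_Continue;   /* visit siblings, do not recurse */
}

int main(int argc, const char **argv)
{
    if (argc < 2)
        return 1;

    CXIndex index = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(
        index, argv[1], NULL, 0, NULL, 0, CXTranslationUnit_None);

    if (tu) {
        clang_visitChildren(clang_getTranslationUnitCursor(tu),
                            print_name, NULL);
        clang_disposeTranslationUnit(tu);
    }
    clang_disposeIndex(index);
    return 0;
}

Tools such as the static analyzer, LLDB expression evaluation and IDE integrations reuse the same libraries rather than re-implementing a parser.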

Tracks:
  • LLVM
Nautilus 5
Thursday, 14:25 - 15:10 PDT
LLVM Toolchain - Update and State of Building Linux with LLVM ( LLVM )
LLVM is a new toolchain that is becoming increasingly common in Linux environments and is already included in millions of Linux devices, today primarily as the JIT compiler for Renderscript in Android Ice Cream Sandwich, but its use is rapidly expanding into other areas of Linux systems. This session will provide an update on the status of LLVM, some of the new uses LLVM will be put to, and the state of building Linux with LLVM.

Topic Lead: Mark Charlebois
Mark Charlebois is Director of Open Source Software Strategy at QuIC. In his 13 years at Qualcomm he has led diverse technical investigations for various R&D divisions, and he has worked on Unix-based systems since 1988 and embedded systems since 1990. Currently, Mark is a Linux evangelist, responsible for helping shape QuIC's open source SW strategy, and has been working on building Linux with LLVM. Mark has a bachelor's degree in Systems Design Engineering from the University of Waterloo in Canada and a master's degree in Engineering Science from Simon Fraser University in Canada.

Topic Lead: Behan Webster
Since graduating with a bachelor's degree in Computer Engineering from the University of Waterloo, Behan has spent the past two decades in diverse tech industries such as telecom, datacom, optical, and automotive. Throughout his career his work has most often involved kernel-level programming, drivers, embedded software, board bring-ups, and build systems built on or for Linux (since 1996), and with UNIX before that. Currently Behan is the founder of Converse in Code and an embedded Linux engineer working on the LLVMLinux project, as well as a trainer for the Linux Foundation. Behan is under the delusion he can fix most things with a "tiny little script".

Tracks:
  • LLVM
Nautilus 5
Thursday, 15:20 - 16:05 PDT
LLVM/Clang x86 Kernel Build ( LLVM )
LLVM, as a new toolchain, is used in an increasing number of projects: RenderScript, Gallium3D, MINIX, FreeBSD and others. Usage of LLVM and Clang is also on the rise in the Linux ecosystem. This talk provides an overview of the current state of efforts to compile the Linux kernel with LLVM on x86, the issues that remain, and a discussion of the challenges and how they can be solved.

Topic Lead: Jan-Simon Moeller
Jan-Simon Möller is an engineer and consultant familiar with embedded Linux, build systems and the kernel, and he contributes to open source projects. He works for the Linux Foundation as a trainer for embedded Linux and device driver classes.

Tracks:
  • LLVM
Nautilus 5
Thursday, 16:30 - 17:15 PDT
PM Constraints: OMAP ( Constraint Framework )
Power Management Constraints - OMAP
1) OMAP SoC Thermal Containment
2) Power Management in Linux with a coprocessor
3) New Model for System and Device Latency

=== OMAP SoC Thermal Containment ===

The thermal challenge is to design a high-performance end product while keeping the junction temperature of the IC components used in the product within their limits, and without causing thermal discomfort for the user. The OMAP4/OMAP5 Systems on Chip, operating at their highest Operating Performance Points (OPPs), are powerful mobile applications processors. However, operating at higher voltage and higher frequency in a sustained manner may cause thermal limits to be exceeded, both for the silicon and for user comfort. We propose extensions to existing frameworks to model per-device power constraints, for containment of thermal limits across the major heat sources of an end-product device, e.g. LCD, CPU, charging, etc. The framework shall facilitate the power and thermal management performed by governors and policies, depending on device context and use-case knowledge.

Topic Lead: Eduardo Valentin
Eduardo is currently a System Software Engineer at Texas Instruments, working on the OMAP Linux kernel. He has been involved with embedded Linux for several years, contributing to products of companies such as Texas Instruments, Nokia, Motorola and Samsung. His main areas of interest include (but are not limited to) power management, real time, performance, scheduling, system and software optimization, and, recently, thermal management.

=== Power Management in Linux with a coprocessor ===

Shutting down the main processor of an SoC in the idle and standby states results in significant power savings. However, doing so requires that the responsibility of reviving the system be passed to an independent entity in the system. Texas Instruments has taken the lead in introducing a novel approach to system power management in its AM335x processor family, involving a Cortex-M3 that assists the main processor. In the future there will be other devices from Texas Instruments, and possibly other silicon vendors, that adopt this technique.

Integrating a coprocessor that is not running Linux with the PM framework comes with a new set of challenges. The coprocessor needs to interact with the host processor for idle as well as standby power management. How do we communicate with a coprocessor without significant overhead in the idle thread? If the coprocessor stops responding, what should the recovery mechanism in the PM framework be? What should the mechanism be for exporting core details, such as the configured wakeup sources, to the coprocessor? This session will focus on these challenges and other issues surrounding the use of a coprocessor for power management in Linux.

Topic Lead: Vaibhav Bedia
Vaibhav Bedia, Software Systems Engineer, Texas Instruments, works on Linux kernel development for Sitara ARM microprocessors.

=== New Model for System and Device Latency ===

Due to the nature of the new SoC architectures, power management needs a new model for the various system latencies. The session discusses:
- Concepts of system, device, wake-up and resume latencies
- Recent changes in the device framework for latency, and why and how to make it generic
- Links with the other PM QoS frameworks: thermal, cpuidle
- Recent changes in the ARM/OMAP platform code for system latency
- Problems encountered while modelling and measuring the various latencies
- A proposed model and how to implement it
- Planned changes in the device framework, the platform code and the APIs

This session is oriented towards Linux power management developers. The goal is to agree on a framework implementation and the interfaces within the kernel and with user space.

Topic Lead: Jean Pihet <email address hidden>
Jean has been working with embedded Linux for many years, for companies such as Texas Instruments, MontaVista, Motorola and Philips. He recently founded NewOldBits.com to provide high-quality consulting services. His main areas of work are OMAP power management and tracing and profiling tools (perf, ftrace, OProfile, ...) for recent ARM cores.
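The latency constraints discussed in this session build on the kernel's existing PM QoS interface. As a point of reference, userspace can already express a CPU/DMA wake-up latency requirement through the /dev/cpu_dma_latency misc device; the request holds for as long as the file descriptor stays open. A minimal sketch (the 20 microsecond value is purely illustrative):

/* Hold a CPU/DMA latency constraint via the PM QoS userspace interface.
 * The request stays active for as long as the file descriptor is open.
 * The 20 us value is purely illustrative. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int32_t max_latency_us = 20;
    int fd = open("/dev/cpu_dma_latency", O_WRONLY);

    if (fd < 0) {
        perror("open /dev/cpu_dma_latency");
        return 1;
    }
    if (write(fd, &max_latency_us, sizeof(max_latency_us)) < 0)
        perror("write");

    sleep(10);      /* constraint is honoured while we hold the fd */
    close(fd);      /* dropping the fd removes the request         */
    return 0;
}

The proposals above extend this kind of constraint from a single global class to per-device wake-up and resume latencies.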

Participants:
attending apm (Antti P Miettinen)
attending mark97229 (Mark Gross)
attending srwarren (Stephen Warren)

Tracks:
  • Constraint Framework
Nautilus 5
Thursday, 17:25 - 18:10 PDT
ARM Virtualization ( Virtualization )
Virtualization topics:
1. Xen on ARM Cortex A15
2. Porting KVM to the ARM Architecture

=== Xen on ARM Cortex A15 ===

During the last few months of 2011, the Xen community started an effort to port Xen to ARMv7 with virtualization extensions, using the Cortex-A15 processor as the reference platform. The new Xen port exploits this set of hardware capabilities to run guest VMs in the most efficient way possible while keeping the ARM-specific changes to the hypervisor and the Linux kernel to a minimum. While developing the new port we took the chance to remove legacy concepts like PV or HVM guests and to support only a single kind of guest, comparable to "PV on HVM" in the Xen x86 world. This talk will explain the reasons behind this and other design choices made during the early development process, and it will go through the main technical challenges that we had to solve to accomplish our goal. Notable examples are the way Linux guests issue hypercalls and receive event channel notifications from Xen. Is there anything that we could have done better? Is the architecture that we laid down in the Linux kernel generic enough to be reused by other hypervisors?

Topic Lead: Stefano Stabellini
Stefano is a Senior Software Engineer at Citrix, working on the open source Xen Platform team. He has been working on Xen since 2007, focusing on several different projects spanning from QEMU to the Linux kernel. He currently maintains libxenlight, Xen support in QEMU, and PV on HVM in the Linux kernel. Before joining Citrix he was a researcher at the Institute for Human and Machine Cognition, working on mobile ad hoc networks.

=== Porting KVM to the ARM Architecture ===

With the introduction of the Virtualization Extensions to the ARM architecture (as implemented in the Cortex-A7 and A15 processors), it is possible to implement a hardware-assisted hypervisor. The KVM port to the ARM architecture, started by Christoffer Dall (Columbia University), is an example of such a hypervisor. Our proposal is to describe the current state of the project, explain how the various virtualization extensions (hypervisor mode, second-stage translation, virtual interrupt controller, timers) are used, how the KVM implementation on ARM differs from other architectures, and what our plans are for upstreaming the code.

Topic Lead: Marc Zyngier <email address hidden>
Marc has been toying with the Linux kernel since 1993. He has been involved over time with the RAID subsystem (MD) and all kinds of ancient architectures (by maintaining the EISA bus), has messed with consumer electronics, and now focuses on the ARM architecture.
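For readers new to KVM, the userspace side of the hypervisor is driven through a small set of ioctls on /dev/kvm, and the ARM port plugs into that same architecture-neutral interface. A minimal sketch of the generic flow (error handling trimmed; everything ARM-specific happens after this point):

/* Minimal, architecture-neutral use of the KVM userspace API:
 * create a VM and a single VCPU.  Error handling trimmed. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) {
        perror("open /dev/kvm");
        return 1;
    }

    printf("KVM API version: %d\n", ioctl(kvm, KVM_GET_API_VERSION, 0));

    int vm = ioctl(kvm, KVM_CREATE_VM, 0);       /* one fd per VM   */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);    /* one fd per VCPU */

    /* Architecture-specific setup (guest memory, registers, and on ARM
     * the virtual interrupt controller and timers) would follow before
     * the KVM_RUN loop. */
    close(vcpu);
    close(vm);
    close(kvm);
    return 0;
}

The port's work is largely in what sits behind these fds: trapping into hypervisor mode, second-stage translation, and virtual GIC/timer emulation.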

Participants:
attending amitshah (Amit Shah)
attending eblake (Eric Blake)
attending lpc-virt-lead (LPC Virtualization Lead)
attending marc-zyngier (Marc Zyngier)
attending srwarren (Stephen Warren)
attending stefano-stabellini (Stefano Stabellini)

Tracks:
  • Virtualization
Nautilus 5
Friday, 09:10 - 09:55 PDT
Classification/Shaping && HW Rate Limiting && Open-vswitch ( Networking )
Networking bufferbloat topics:
1. Linux Traffic Classification and Shaping
2. TC Interface to Hardware Rate Limiting
3. Harmonizing Multiqueue, VMDq, virtio-net, macvtap with Open vSwitch

=== Linux Traffic Classification and Shaping ===

Linux provides advanced mechanisms for traffic classification and shaping, and central to this role is the queuing discipline. Recently we have done work allowing hardware to offload some of these traditionally CPU-intensive tasks and have experimented with mechanisms to improve performance on many-core systems. Here we would like to highlight work such as the 'mqprio' queuing scheduler that has recently been accepted upstream, as well as share results from experimental work running lockless queuing disciplines and classifiers on many-core systems and fat pipes (10 Gbps and greater).

Topic Lead: Tom Herbert

=== tc(?) interface to hardware transmit rate limiting ===

Intel 10 Gigabit hardware (and others) can provide transmit rate limiting. This presentation will discuss the development of a new, simple qdisc that can either provide all-software transmit rate limiting or, when installed over hardware that supports the capability, directly configure the hardware's rate limiting. One problem that will need discussion is that the Intel hardware's rate limiting is per queue. Another option besides a qdisc that could be discussed is direct ethtool control over the rate limiting.

Topic Lead: Jesse Brandeburg
Jesse is a senior Linux developer in the Intel LAN Access Division (Intel Ethernet). He has been with Intel since 1994 and has worked on the Linux e100, e1000, e1000e, igb, ixgb and ixgbe drivers since 2002. His time is split between solving customer issues, performance-tuning Intel's drivers, and working on bleeding-edge development for the Linux networking stack.

=== Harmonizing Multiqueue, VMDq, virtio-net, macvtap with Open vSwitch ===

Multiqueue virtio-net, macvtap and QEMU are being worked on by Jason Wang and Krishna Kumar. Inspired by their work, I would like to take it a step further and discuss introducing Open vSwitch based flows for multiqueue-aware virtio-net queuing. This requires plumbing in Open vSwitch to use Linux tc to instantiate QoS flows per queue, in addition to the virtio-net multiqueue work. Open vSwitch also needs to incorporate support for opening tap fds multiple times so it can create as many queues; to this end, Open vSwitch might want to become macvtap-aware. There is a need to understand and discuss the gaps in realizing Open vSwitch use cases in synchronization with features already implemented in macvtap and Linux tc. For instance, features like VEPA and VEB are implemented only in the macvtap/macvlan driver but are useful for Open vSwitch based flows too. I would like to discuss the features and gaps that require plumbing in these subsystems and related work.

Required attendees (if present): developers such as Jason Wang, Krishna Kumar, Michael Tsirkin, Arnd Bergmann, Stephen Hemminger, Dave Miller, open-vswitch developers, netdev developers, libvirt developers, qemu developers.

Topic Lead: Shyam Iyer <email address hidden>
Shyam Iyer is a senior software engineer in Dell's Operating Systems Advanced Engineering Group focused on Linux, with over 8 years of experience developing Linux-based solutions. Apart from enabling Dell PowerEdge servers and storage for enterprise Linux operating systems, he focuses on bridging new hardware technology use cases with emerging Linux technologies. His interests encompass server hardware architecture, Linux kernel debugging, server platform bring-up, efficient storage, networking, virtualization architectures and performance tuning.
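One way applications feed the classification machinery described above is by tagging their own traffic: qdiscs such as mqprio map the skb priority to a traffic class and hence to a group of hardware queues. A small sketch of setting that priority from userspace with SO_PRIORITY (the value 4 is arbitrary and assumes a matching priority-to-TC map has been configured with tc):

/* Tag a socket's traffic with a priority that qdiscs such as mqprio
 * map to a traffic class (and hence a hardware queue group).
 * The priority value 4 is arbitrary and assumes a matching
 * prio-to-TC map has been configured with tc. */
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int prio = 4;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0) {
        perror("socket");
        return 1;
    }
    if (setsockopt(sock, SOL_SOCKET, SO_PRIORITY,
                   &prio, sizeof(prio)) < 0)
        perror("setsockopt(SO_PRIORITY)");

    /* connect()/send() as usual; outgoing skbs carry skb->priority 4,
     * which the classifier/qdisc stack uses to pick a queue. */
    close(sock);
    return 0;
}

Whether the resulting per-class rate limit is enforced in software or offloaded to hardware is exactly the question the proposed qdisc and ethtool options aim to answer.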

Participants:
attending therbert (Tom Herbert)

Tracks:
  • Networking
Nautilus 5
Friday, 10:05 - 10:50 PDT
Classification/Shaping && HW Rate Limiting && Open-vswitch ( Networking )
Second of two consecutive slots for this session; see the full description under the 09:10 slot above.

Participants:
attending therbert (Tom Herbert)

Tracks:
  • Networking
Nautilus 5
Friday, 11:00 - 11:45 PDT
Network Virtualization and Lightning Talks ( Virtualization )
1. VFIO - Are We There Yet?
2. KVM Network Performance and Scalability
3. Enabling Overlays for Network Scaling
4. Marrying Live Migration and Device Assignment
5. Lightning Talks
   - QEMU disaggregation - Stefano Stabellini
   - Xenner - Alexander Graf
   - From Server to Mobile: Different Requirements/Different Solution - Eddie Dong, Jun Nakajima

=== VFIO - Are We There Yet? ===

VFIO is a new userspace driver interface intended to generically enable assignment of devices to QEMU virtual machines. VFIO has had a bumpy road upstream and is currently in its second redesign. In this talk we'll look at the new design, the status of the code, how to make use of it, and where it's going. We'll also look back at some of the previous designs to show how we got here. This talk is intended for developers and users interested in the evolution of device assignment in QEMU and KVM, as well as those interested in userspace drivers.

Topic Lead: Alex Williamson
Alex has been working on virtualization for over 5 years and concentrates on the I/O side of virtualization, especially assignment of physical devices to virtual machines. He is a member of the Red Hat virtualization team.

=== KVM Network Performance and Scalability ===

In this presentation we will discuss ongoing work to improve KVM networking I/O performance and scalability. We will share performance numbers taken using both vertical (multiple interfaces) and horizontal (many VMs) scaling to highlight existing bottlenecks in the KVM stack, as well as improvements observed with pending changes. These experiments have shown that impressive gains can be obtained by using per-CPU vhost threads and leveraging hardware offloads, including flow steering and interrupt affinity. This presentation intends to highlight ongoing research from various groups working on the Linux kernel, KVM, and the upper-layer stack. Finally, we will propose a path to include these changes in the upstream projects. This should be of interest to KVM developers, kernel developers, and anyone using a virtualized environment.

Topic Lead: John Fastabend <email address hidden>
Required attendees: Vivek Kashyap, Shyam Iyer

=== Enabling Overlays for Network Scaling ===

Server virtualization in the data center has increased the density of networking endpoints in a network. Together with the need to migrate VMs anywhere in the data center, this has surfaced network scalability limitations (layer 2, cross-IP-subnet migrations, network renumbering). The industry has turned its attention towards overlay networks to solve these network scalability problems. The overlay network concept defines a domain connecting virtual machines belonging to a single tenant or organization. This virtual network may be built across the server hypervisors, which are connected over an arbitrary topology. This talk will give an overview of the problems sought to be solved through the use of overlay networks and discuss active proposals such as VXLAN, NVGRE, and DOVE networks. We will further delve into options for implementing these solutions on Linux.

Topic Lead: Vivek Kashyap <email address hidden>
Vivek works in IBM's Linux Technology Center. He has worked on Linux resource management, delay accounting, and energy and hardware management; authored InfiniBand and IPoIB networking protocols; and worked on standardizing and implementing the IEEE 802.1Qbg protocol for network switching.

=== Marrying Live Migration and Device Assignment ===

Device assignment has been around for quite some time now in virtualization. It's a nice technique to squeeze as much performance out of your hardware as possible, and with the advent of SR-IOV it's even possible to pass a "virtualized" fraction of your real hardware to a VM rather than the whole card. The problem, however, is that you lose a pretty substantial piece of functionality: live migration.

The most common approach used to counter this for networking is to pass two NICs to the VM: one emulated in software and one that is the actual assigned device. It's the guest's responsibility to treat the two as a union, and the host needs to be configured in a way that allows packets to flow the same way through both paths. When migrating, the assigned device gets hot-unplugged and a new one goes back in on the new host. However, that means we're exposing crucial implementation details of the VM to the guest: it knows when it gets migrated.

Another approach is to do the above, but combine everything in a single guest driver, so it ends up invisible to the guest OS. That quickly becomes a nightmare too, because you need to reimplement network drivers for your specific guest driver infrastructure, at which point you're most likely violating the GPL anyway.

So what if we restrict ourselves to a single NIC type? We could pass an emulated version of that NIC into our guest, or pass through an assigned device; they would behave the same. That also means that during live migration we could switch between emulated and assigned modes without the guest even realizing it. But maybe others have more ideas on how to improve the situation? The less guest-intrusive it is, the better the solution usually becomes. And if it extends to storage, it's even better.

Required attendees: Peter Waskiewicz, Alex Williamson

Topic Lead: Alexander Graf <email address hidden>
Alexander has been a steady, long-time contributor to the QEMU and KVM projects. He maintains the PowerPC and s390x parts of QEMU as well as the PowerPC port of KVM. He tends to become active whenever areas seem weird enough that nobody else will touch them, such as nested virtualization, Mac OS virtualization or AHCI. Recently he has also been involved in kicking off openSUSE for ARM. His motto is なんとかなる ("it will work out somehow").
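The VFIO userspace flow referenced in the first talk above is worth sketching: a container fd provides the IOMMU context, a group fd represents an IOMMU group, and device fds are obtained through the group. A compressed sketch (the group number and PCI address are placeholders; error handling is omitted for brevity):

/* Compressed sketch of the VFIO userspace flow: container -> group ->
 * device.  The group number and PCI address are placeholders. */
#include <fcntl.h>
#include <linux/vfio.h>
#include <stdio.h>
#include <sys/ioctl.h>

int main(void)
{
    int container = open("/dev/vfio/vfio", O_RDWR);
    printf("VFIO API version: %d\n",
           ioctl(container, VFIO_GET_API_VERSION));

    /* The IOMMU group the assigned device belongs to (placeholder). */
    int group = open("/dev/vfio/26", O_RDWR);

    struct vfio_group_status status = { .argsz = sizeof(status) };
    ioctl(group, VFIO_GROUP_GET_STATUS, &status);
    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        fprintf(stderr, "group not viable: bind all its devices to vfio\n");

    /* Attach the group to the container and pick an IOMMU model. */
    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    /* Finally, get a device fd for the assigned device (placeholder). */
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
    printf("device fd: %d\n", device);
    return 0;
}

Grouping by IOMMU group, rather than handing out raw device fds, is the main difference from the earlier designs the talk looks back on: it keeps DMA isolation decisions visible to userspace.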

Participants:
attending alex-l-williamson (Alex Williamson)
attending amitshah (Amit Shah)
attending eblake (Eric Blake)
attending lpc-virt-lead (LPC Virtualization Lead)

Tracks:
  • Virtualization
Nautilus 5
Friday, 11:55 - 12:40 PDT
Network Virtualization and Lightning Talks ( Virtualization )
Second of two consecutive slots for this session; see the full description under the 11:00 slot above.

Participants:
attending alex-l-williamson (Alex Williamson)
attending amitshah (Amit Shah)
attending eblake (Eric Blake)
attending lpc-virt-lead (LPC Virtualization Lead)

Tracks:
  • Virtualization
Nautilus 5
Friday, 15:05 - 15:50 PDT
Room Unavailable
Nautilus 5

PLEASE NOTE: The Linux Plumbers Conference 2012 schedule is still in draft form and is subject to change at any time.