Among the various ways of using eBPF, OVS has been exploring the power
of eBPF in three: (1) attaching eBPF to TC, (2) offloading a subset of
processing to XDP, and (3) by-passing the kernel using AF_XDP.
Unfortunately, as of today, none of the three approaches satisfies the
requirements of OVS. In this presentation, we’d like to share the
challenges we faced, experience learned, and seek for feedbacks from
the community for future direction.
Attaching eBPF to TC started first with the most aggressive goal: we
planned to re-implement the entire features of OVS kernel datapath
under net/openvswitch/* into eBPF code. We worked around a couple of
limitations, for example, the lack of TLV support led us to redefine a
binary kernel-user API using a fixed-length array; and without a
dedicated way to execute a packet, we created a dedicated device for
user to kernel packet transmission, with a different BPF program
attached to handle packet execute logic. Currently, we are working on
connection tracking. Although a simple eBPF map can achieve basic
operations of conntrack table lookup and commit, how to handle NAT,
(de)fragmentation, and ALG are still under discussion.
Moving one layer below TC is called XDP (eXpress Data Path), a much
faster layer for packet processing, but with almost no extra packet
metadata and limited BPF helpers support. Depending on the complexity
of flows, OVS can offload a subset of its flow processing to XDP when
feasible. However, the fact that XDP has fewer helper function support
implies that either 1) only very limited number of flows are eligible
for offload, or 2) more flow processing logic needed to be done in
AF_XDP represents another form of XDP, with a socket interface for
control plane and a shared memory API for accessing packets from
userspace applications. OVS today has another full-fledged datapath
implementation in userspace, called dpif-netdev, used by DPDK
community. By treating the AF_XDP as a fast packet-I/O channel, the
OVS dpif-netdev can satisfy almost all existing features. We are
working on building the prototype and evaluating its performance.
OVS eBPF datapath.