lists.openwall.net | Open Source and information security mailing list archives
Date: Fri, 24 Jan 2020 06:56:36 -0800
From: Luigi Rizzo <lrizzo@...gle.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: netdev@...r.kernel.org, Jesper Dangaard Brouer <hawk@...nel.org>,
	"David S. Miller" <davem@...emloft.net>, sameehj@...zon.com,
	Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH net-next] v2 net-xdp: netdev attribute to control xdpgeneric skb linearization

On Fri, Jan 24, 2020 at 1:55 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 1/24/20 12:20 AM, Luigi Rizzo wrote:
> > Add a netdevice flag to control skb linearization in generic xdp mode.
> > Among the various mechanisms to control the flag, the sysfs
> > interface seems sufficiently simple and self-contained.
> > The attribute can be modified through
> >     /sys/class/net/<DEVICE>/xdp_linearize
> > The default is 1 (on).
> >
> > On a kernel instrumented to grab timestamps around the linearization
> > code in netif_receive_generic_xdp, and heavy netperf traffic with 1500b
> > mtu, I see the following times (nanoseconds/pkt).
> >
> > The receiver generally sees larger packets so the difference is more
> > significant.
> >
> > ns/pkt             RECEIVER                  SENDER
> >                    p50    p90     p99       p50    p90    p99
> > LINEARIZATION:     600ns  1090ns  4900ns    149ns  249ns  460ns
> > NO LINEARIZATION:   40ns    59ns    90ns     40ns   50ns  100ns
> >
> > Signed-off-by: Luigi Rizzo <lrizzo@...gle.com>
> > ---
> >  include/linux/netdevice.h |  3 ++-
> >  net/core/dev.c            |  5 +++--
> >  net/core/net-sysfs.c      | 15 +++++++++++++++
> >  3 files changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 5ec3537fbdb1..b182f3cb0bf0 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1959,7 +1959,8 @@ struct net_device {
> >
> >  	struct netdev_rx_queue	*_rx;
> >  	unsigned int		num_rx_queues;
> > -	unsigned int		real_num_rx_queues;
> > +	unsigned int		real_num_rx_queues:31;
> > +	unsigned int		xdp_linearize:1;
> >
> >  	struct bpf_prog __rcu	*xdp_prog;
> >  	unsigned long		gro_flush_timeout;
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 4dcc1b390667..13a671e45b61 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4484,8 +4484,8 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
> >  	 * of XDP_PACKET_HEADROOM bytes. This is the guarantee that also
> >  	 * native XDP provides, thus we need to do it here as well.
> >  	 */
> > -	if (skb_is_nonlinear(skb) ||
> > -	    skb_headroom(skb) < XDP_PACKET_HEADROOM) {
> > +	if (skb->dev->xdp_linearize && (skb_is_nonlinear(skb) ||
> > +	    skb_headroom(skb) < XDP_PACKET_HEADROOM)) {
> >  		int hroom = XDP_PACKET_HEADROOM - skb_headroom(skb);
> >  		int troom = skb->tail + skb->data_len - skb->end;
>
> I still think in order for this knob to be generally useful, we would need to
> provide an equivalent of bpf_skb_pull_data() helper, which in generic XDP would then
> pull in more data from non-linear section, and in native XDP would be a "no-op" since
> the frame is already linear.
> Otherwise, as mentioned in the previous thread, users would
> have no chance to examine headers if they are not pre-pulled by the driver.

I agree that eventually we should get there. But to be completely general, we
need to remain compatible with older programs that are not aware of the new
mode of operation. I see three possible ways:

1. make the pull-up transparent (triggered in the interpreter, or by the
   bounds check emitted by the JIT code);
2. provide a mechanism for programs to specify requirements at load time
   (defaulting to "full packet, standard headroom");
3. provide a mechanism to identify older programs and always linearize in
   those cases.

If we have #2, we can actually live without the pull-up helper, so that seems
to be the simpler course of action. This particular patch basically defers #2
to the operator.

cheers
luigi