lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAO3-PbpB-Wqji-9vFifCTExah-ctRkPSpz60EQvsA=oYdPpQZQ@mail.gmail.com>
Date: Fri, 21 Jun 2024 10:40:44 -0500
From: Yan Zhai <yan@...udflare.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	Willem de Bruijn <willemb@...gle.com>, Simon Horman <horms@...nel.org>, Florian Westphal <fw@...len.de>, 
	Mina Almasry <almasrymina@...gle.com>, Abhishek Chauhan <quic_abchauha@...cinc.com>, 
	David Howells <dhowells@...hat.com>, David Ahern <dsahern@...nel.org>, 
	Richard Gobert <richardbgobert@...il.com>, Antoine Tenart <atenart@...nel.org>, 
	Felix Fietkau <nbd@....name>, Soheil Hassas Yeganeh <soheil@...gle.com>, 
	Pavel Begunkov <asml.silence@...il.com>, Lorenzo Bianconi <lorenzo@...nel.org>, 
	Thomas Weißschuh <linux@...ssschuh.net>, 
	linux-kernel@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [RFC net-next 1/9] skb: introduce gro_disabled bit

On Fri, Jun 21, 2024 at 4:13 AM Alexander Lobakin
<aleksander.lobakin@...el.com> wrote:
>
> From: Yan Zhai <yan@...udflare.com>
> Date: Thu, 20 Jun 2024 15:19:10 -0700
>
> > Software GRO is currently controlled by a single switch, i.e.
> >
> >   ethtool -K dev gro on|off
> >
> > However, this is not always desired. When GRO is enabled, even if the
> > kernel cannot GRO certain traffic, it has to run through the GRO receive
> > handlers with no benefit.
> >
> > There are also scenarios that turning off GRO is a requirement. For
> > example, our production environment has a scenario that a TC egress hook
> > may add multiple encapsulation headers to forwarded skbs for load
> > balancing and isolation purpose. The encapsulation is implemented via
> > BPF. But the problem arises then: there is no way to properly offload a
> > double-encapsulated packet, since skb only has network_header and
> > inner_network_header to track one layer of encapsulation, but not two.
>
> Implement it in the kernel then? :D
>
It would be a big commitment that I dare not make :) Out of curiosity,
is it something that devices can handle today?

> > On the other hand, not all the traffic through this device needs double
> > encapsulation. But we have to turn off GRO completely for any ingress
> > device as a result.
> >
> > Introduce a bit on skb so that GRO engine can be notified to skip GRO on
> > this skb, rather than having to be 0-or-1 for all traffic.
> >
> > Signed-off-by: Yan Zhai <yan@...udflare.com>
> > ---
> >  include/linux/netdevice.h |  9 +++++++--
> >  include/linux/skbuff.h    | 10 ++++++++++
> >  net/Kconfig               | 10 ++++++++++
> >  net/core/gro.c            |  2 +-
> >  net/core/gro_cells.c      |  2 +-
> >  net/core/skbuff.c         |  4 ++++
> >  6 files changed, 33 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index c83b390191d4..2ca0870b1221 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2415,11 +2415,16 @@ struct net_device {
> >       ((dev)->devlink_port = (port));                         \
> >  })
> >
> > -static inline bool netif_elide_gro(const struct net_device *dev)
> > +static inline bool netif_elide_gro(const struct sk_buff *skb)
> >  {
> > -     if (!(dev->features & NETIF_F_GRO) || dev->xdp_prog)
> > +     if (!(skb->dev->features & NETIF_F_GRO) || skb->dev->xdp_prog)
> >               return true;
> > +
> > +#ifdef CONFIG_SKB_GRO_CONTROL
> > +     return skb->gro_disabled;
> > +#else
> >       return false;
> > +#endif
> >  }
> >
> >  #define      NETDEV_ALIGN            32
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index f4cda3fbdb75..48b10ece95b5 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -1008,6 +1008,9 @@ struct sk_buff {
> >  #if IS_ENABLED(CONFIG_IP_SCTP)
> >       __u8                    csum_not_inet:1;
> >  #endif
> > +#ifdef CONFIG_SKB_GRO_CONTROL
> > +     __u8                    gro_disabled:1;
> > +#endif
> >
> >  #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
> >       __u16                   tc_index;       /* traffic control index */
> > @@ -1215,6 +1218,13 @@ static inline bool skb_wifi_acked_valid(const struct sk_buff *skb)
> >  #endif
> >  }
> >
> > +static inline void skb_disable_gro(struct sk_buff *skb)
> > +{
> > +#ifdef CONFIG_SKB_GRO_CONTROL
> > +     skb->gro_disabled = 1;
> > +#endif
> > +}
> > +
> >  /**
> >   * skb_unref - decrement the skb's reference count
> >   * @skb: buffer
> > diff --git a/net/Kconfig b/net/Kconfig
> > index 9fe65fa26e48..47d1ee92df15 100644
> > --- a/net/Kconfig
> > +++ b/net/Kconfig
> > @@ -289,6 +289,16 @@ config MAX_SKB_FRAGS
> >         and in drivers using build_skb().
> >         If unsure, say 17.
> >
> > +config SKB_GRO_CONTROL
> > +     bool "allow disable GRO on per-packet basis"
> > +     default y
> > +     help
> > +       By default GRO can only be enabled or disabled per network device.
> > +       This can be cumbersome for certain scenarios.
> > +       Toggling this option will allow disabling GRO for selected packets,
> > +       e.g. by XDP programs, so that it is more flexibile.
> > +       Extra overhead should be minimal.
>
> I don't think we need a Kconfig option for that. Can't it be
> unconditional? Is there any real eye-visible overhead?

Normally if it is a single branch I would not worry about it. But I
know I am touching a hot potato here so I just want to be cautious :)

best
Yan

>
> Thanks,
> Olek

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ