Date:   Thu, 22 Jul 2021 09:10:51 +0200
From:   Paolo Abeni <>
To:     Casey Schaufler <>,
Cc:     "David S. Miller" <>,
        Jakub Kicinski <>,
        Florian Westphal <>,
        Eric Dumazet <>,,
Subject: Re: [PATCH RFC 0/9] sk_buff: optimize layout for GRO


On Wed, 2021-07-21 at 11:15 -0700, Casey Schaufler wrote:
> On 7/21/2021 9:44 AM, Paolo Abeni wrote:
> > This is a very early draft - in a different world would be
> > replaced by hallway discussion at in-person conference - aimed at
> > outlining some ideas and collect feedback on the overall outlook.
> > There are still bugs to be fixed, more test and benchmark need, etc.
> > 
> > There are 3 main goals:
> > - [try to] avoid the overhead for uncommon conditions at GRO time
> >   (patches 1-4)
> > - enable backpressure for the veth GRO path (patches 5-6)
> > - reduce the number of cacheline used by the sk_buff lifecycle
> >   from 4 to 3, at least in some common scenarios (patches 1,7-9).
> >   The idea here is avoid the initialization of some fields and
> >   control their validity with a bitmask, as presented by at least
> >   Florian and Jesper in the past.
> If I understand correctly, you're creating an optimized case
> which excludes ct, secmark, vlan and UDP tunnel. Is this correct,
> and if so, why those particular fields? What impact will this have
> in the non-optimal (with any of the excluded fields) case?

Thank you for the feedback.

There are 2 different relevant points:

- the GRO stage.
  packets carrying any of CT, dst, sk or skb_ext will do 2 additional
conditionals per gro_receive WRT the current code. My understanding is
that having any of such fields set at GRO receive time is quite
exceptional for real NICs. All other packets will do 4 or 5 fewer
conditionals, and will traverse a little less code.

- sk_buff lifecycle
  * packets carrying vlan and UDP will not see any differences: sk_buff
lifecycle will still use 4 cachelines, as it currently does, and no
additional conditional is introduced.
  * packets carrying nfct or secmark will see an additional conditional
every time such field is accessed. The number of cachelines used will
still be 4, as in the current code. My understanding is that when such
access happens, there is already a relevant amount of "additional" code
to be executed, so the conditional overhead should not be measurable.
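
To make the "additional conditional" concrete, here is a minimal
userspace sketch of the validity-bitmask idea. It is not the code from
the series: struct pkt, active_fields, the PKT_F_* bits and the accessor
names are all hypothetical stand-ins for illustration. The point is only
that init clears a single word, and readers of a rarely used field test
its bit before trusting the stored value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the real sk_buff layout. */
enum {
	PKT_F_NFCT    = 1u << 0,	/* nfct field holds a valid value */
	PKT_F_SECMARK = 1u << 1,	/* secmark field holds a valid value */
};

struct pkt {
	uint32_t active_fields;	/* always initialized at allocation */
	unsigned long nfct;	/* valid only if PKT_F_NFCT is set */
	uint32_t secmark;	/* valid only if PKT_F_SECMARK is set */
};

/* Init touches only the bitmask; the rare fields stay uninitialized. */
static inline void pkt_init(struct pkt *p)
{
	p->active_fields = 0;
}

/* Readers pay one extra conditional per access to a rare field. */
static inline uint32_t pkt_secmark(const struct pkt *p)
{
	return (p->active_fields & PKT_F_SECMARK) ? p->secmark : 0;
}

static inline unsigned long pkt_nfct(const struct pkt *p)
{
	return (p->active_fields & PKT_F_NFCT) ? p->nfct : 0;
}

/* Writers store the value and mark the field as valid. */
static inline void pkt_set_secmark(struct pkt *p, uint32_t val)
{
	p->secmark = val;
	p->active_fields |= PKT_F_SECMARK;
}

static inline void pkt_set_nfct(struct pkt *p, unsigned long val)
{
	p->nfct = val;
	p->active_fields |= PKT_F_NFCT;
}
```

In this sketch a packet that never sets nfct or secmark never writes
those fields, which is where the cacheline saving would come from if
they sat in a separate cacheline from the hot fields.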


