netdev - Re: [PATCH net-next] net: generalize skb freeing deferral to per-cpu lists

Message-ID: <CANn89iLK5i9y5=iAHS=8+SinGkmGgEXR=xk=ATpnXPakD1j-vQ@mail.gmail.com>
Date:   Fri, 22 Apr 2022 09:50:33 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Paolo Abeni <pabeni@...hat.com>,
        netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] net: generalize skb freeing deferral to per-cpu lists

On Fri, Apr 22, 2022 at 9:40 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Thu, 21 Apr 2022 08:39:20 -0700 Eric Dumazet wrote:
> > 10 runs of one TCP_STREAM flow
>
> Was the test within a NUMA node or cross-node?

This was a NUMA host, but nothing was done to force placement (no pinning,
for either sender or receiver).

>
> For my learning - could this change cause more cache line bouncing
> than individual per-socket lists for non-RFS setups? Multiple CPUs
> may try to queue skbs for freeing on one remote node.

I ran tests in non-RFS setups too, and got a nice improvement there as well.
I could post the numbers in v2 if you want.

The thing is that with a typical number of RX queues (16 or 32 queues
on a 100Gbit NIC), there is enough sharding for this spinlock to be a
non-issue.

Also, we could quite easily add some batching in a future patch, for
the cases where the number of RX queues is too small.

(Each cpu could hold up to 8 or 16 skbs in a per-cpu cache, before
giving them back to alloc_cpu(s))
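
A rough userspace sketch of that batching idea, not kernel code and not
part of the patch. Everything here is a stand-in chosen for illustration:
struct buf plays the role of an skb, defer_lists[] plays the role of the
per-cpu skb_defer_list, and caches[], defer_free(), flush_batch(), NR_CPUS
and DEFER_BATCH are hypothetical names. The lock is only counted, not
taken, to show that one acquisition can cover a whole batch. Keeping one
cache slot per target cpu is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS      4   /* hypothetical, for the sketch only */
#define DEFER_BATCH  8   /* the mail says "8 or 16"; 8 picked arbitrarily */

struct buf {                     /* stand-in for struct sk_buff */
	struct buf *next;
	int alloc_cpu;           /* cpu that allocated this buffer */
};

struct defer_list {              /* stand-in for a per-cpu skb_defer_list */
	struct buf *head;
	unsigned int len;
	unsigned int lock_acquisitions; /* counts would-be spin_lock() calls */
};

struct local_cache {             /* hypothetical per-cpu cache, one slot per target cpu */
	struct buf *head[NR_CPUS];
	struct buf *tail[NR_CPUS];
	unsigned int len[NR_CPUS];
};

static struct defer_list defer_lists[NR_CPUS];
static struct local_cache caches[NR_CPUS];

/* Hand the whole local batch to the target cpu with one lock round trip. */
static void flush_batch(int this_cpu, int target)
{
	struct local_cache *lc = &caches[this_cpu];
	struct defer_list *dl = &defer_lists[target];

	if (!lc->len[target])
		return;

	dl->lock_acquisitions++;            /* spin_lock() would go here */
	lc->tail[target]->next = dl->head;  /* splice batch in front of the list */
	dl->head = lc->head[target];
	dl->len += lc->len[target];
	                                    /* spin_unlock() would go here */
	lc->head[target] = NULL;
	lc->tail[target] = NULL;
	lc->len[target] = 0;
}

/* Queue a buffer locally; touch the remote list only when the batch is full. */
static void defer_free(int this_cpu, struct buf *b)
{
	struct local_cache *lc = &caches[this_cpu];
	int target = b->alloc_cpu;

	assert(target >= 0 && target < NR_CPUS);

	b->next = lc->head[target];
	if (!lc->head[target])
		lc->tail[target] = b;       /* first buffer is the batch tail */
	lc->head[target] = b;

	if (++lc->len[target] >= DEFER_BATCH)
		flush_batch(this_cpu, target);
}
```

With DEFER_BATCH set to 8, freeing 20 buffers that all belong to the same
remote cpu touches the remote list twice instead of 20 times, and leaves 4
buffers waiting in the local cache until the next flush. A real version
would also need a way to flush stale partial batches.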


>
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 7dccbfd1bf5635c27514c70b4a06d3e6f74395dd..0162a9bdc9291e7aae967a044788d09bd2ef2423 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -3081,6 +3081,9 @@ struct softnet_data {
> >       struct sk_buff_head     input_pkt_queue;
> >       struct napi_struct      backlog;
> >
> > +     /* Another possibly contended cache line */
> > +     struct sk_buff_head     skb_defer_list ____cacheline_aligned_in_smp;
>
> If so maybe we can avoid some dirtying and use a single-linked list?
> No point modifying the cache line of the skb already on the list.

Good idea, I can think about it.
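
To make the cache-line point concrete, here is a small userspace sketch
(again not kernel code, and not what the patch does). It compares a push
onto a NULL-terminated singly-linked list with a push onto a circular
doubly-linked list in the style of sk_buff_head. All type and function
names (snode, dnode, slist_push, dlist_push, ...) are made up for the
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct snode { struct snode *next; };
struct dnode { struct dnode *next, *prev; };

struct slist { struct snode *head; };
struct dlist { struct dnode head; };   /* sentinel node, circular list */

/* Singly-linked push: writes only the new node and the list head. */
static void slist_push(struct slist *l, struct snode *n)
{
	assert(l && n);
	n->next = l->head;
	l->head = n;
}

static void dlist_init(struct dlist *l)
{
	l->head.next = &l->head;
	l->head.prev = &l->head;
}

/* Doubly-linked push: must also fix up the node already on the list. */
static void dlist_push(struct dlist *l, struct dnode *n)
{
	struct dnode *first = l->head.next;

	assert(l && n);
	n->next = first;
	n->prev = &l->head;
	first->prev = n;       /* this store dirties the existing node */
	l->head.next = n;
}
```

In the singly-linked case the node that was already queued is left
untouched by the next push, so its cache line does not have to move to the
cpu doing the push. In the doubly-linked case the first->prev store writes
into it every time.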

>
> > +     call_single_data_t  csd_defer;
> >  };