netdev - Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache


Message-ID: <CAF=yD-LkM3hzcY3B1P_5fW1t+QNtPz6=2YRr4P79t4hZW=0wTA@mail.gmail.com>
Date:   Sat, 21 Apr 2018 11:54:53 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Paolo Abeni <pabeni@...hat.com>,
        Network Development <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache

On Fri, Apr 20, 2018 at 9:48 AM, Jesper Dangaard Brouer
<brouer@...hat.com> wrote:
>
> On Thu, 19 Apr 2018 06:47:10 -0700 Eric Dumazet <eric.dumazet@...il.com> wrote:
>> On 04/19/2018 12:40 AM, Paolo Abeni wrote:
>> > On Wed, 2018-04-18 at 12:21 -0700, Eric Dumazet wrote:
>> >> On 04/18/2018 10:15 AM, Paolo Abeni wrote:
> [...]
>> >
>> > Any suggestions for better results are more than welcome!
>>
>> Yes, remote skb freeing. I mentioned this idea to Jesper and Tariq in
>> Seoul (netdev conference). Not tied to UDP, but a generic solution.
>
> Yes, I remember.  I think it was the idea where you basically
> wanted to queue SKBs back to the CPU that allocated them, right?
>
> Freeing an SKB on the same CPU that allocated it has multiple
> advantages: (1) the SLUB allocator can use a non-atomic
> "cpu-local" (double)cmpxchg, (2) the 4 cache lines of the SKB that
> were cleared by memset stay local, and (3) the atomic SKB
> refcnt/users stay local.
>
> We just have to make sure that the mechanism for queueing SKBs back
> doesn't cost more than the operations we expect to save.  Bulk
> transfer is an obvious approach.  For storing SKBs until they are
> returned, we already have a fast mechanism: see napi_consume_skb
> calling _kfree_skb_defer, which uses SLUB/SLAB bulk free to amortize
> cost (1).
>
> I guess the missing information is that we don't know which CPU the
> SKB was created on...

For connected sockets, sk->sk_incoming_cpu has this data. It
records the BH cpu on enqueue to the udp socket, so one caveat is
that it may be wrong with rps/rfs.
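
To illustrate the caveat, here is a minimal user-space sketch. It is
not kernel code: the hint_* structs and functions are hypothetical
stand-ins, and only the incoming_cpu field is modelled on
sk->sk_incoming_cpu. It assumes that with rps the skb is allocated
on one cpu and the BH that enqueues it to the socket runs on another.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; not the kernel's structures. */
struct hint_skb {
    int alloc_cpu;      /* cpu on which the skb was allocated */
};

struct hint_sock {
    int incoming_cpu;   /* models sk->sk_incoming_cpu */
};

/* Models enqueue to the udp socket: record the cpu running the BH. */
void hint_enqueue(struct hint_sock *sk, const struct hint_skb *skb,
                  int bh_cpu)
{
    (void)skb;          /* the skb itself carries no cpu information */
    sk->incoming_cpu = bh_cpu;
}

/* Returns 1 if the socket's recorded cpu is the allocating cpu. */
int hint_is_right(const struct hint_sock *sk, const struct hint_skb *skb)
{
    return sk->incoming_cpu == skb->alloc_cpu;
}
```

When the BH runs on the allocating cpu the hint is right; when the
packet was steered to another cpu before enqueue, the hint points at
the steering target instead.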

Another option is to associate the skb not with the source cpu but
with the napi struct, and have the device driver free it in the
context of its napi processing.
This has the additional benefit that skb->napi_id is already stored
per skb, so this also works for unconnected sockets.
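
A rough user-space sketch of that idea, under stated assumptions: the
napiq_* names are hypothetical, the napi id is used directly as a
small array index (real napi ids are not), and no locking is shown
even though the consumer and the napi poll would normally run on
different cpus.

```c
#include <assert.h>
#include <stddef.h>

#define NAPIQ_MAX 4     /* arbitrary table size for this sketch */

struct napiq_skb {
    unsigned int napi_id;       /* models skb->napi_id */
    struct napiq_skb *next;
};

/* One return list per napi instance (hypothetical, unlocked). */
struct napiq_ctx {
    struct napiq_skb *to_free;
    unsigned int pending;
};

struct napiq_ctx napiq_table[NAPIQ_MAX];

/* Consumer side: hand the skb back to the napi that received it. */
int napiq_return_skb(struct napiq_skb *skb)
{
    struct napiq_ctx *n;

    if (skb->napi_id >= NAPIQ_MAX)
        return -1;
    n = &napiq_table[skb->napi_id];
    skb->next = n->to_free;
    n->to_free = skb;
    n->pending++;
    return 0;
}

/* Driver side, called from that napi's poll: release the whole list
 * in one batch and return how many skbs were pending. */
unsigned int napiq_drain(unsigned int napi_id)
{
    struct napiq_ctx *n;
    unsigned int freed;

    if (napi_id >= NAPIQ_MAX)
        return 0;
    n = &napiq_table[napi_id];
    freed = n->pending;
    n->to_free = NULL;  /* a real version would bulk free the list here */
    n->pending = 0;
    return freed;
}
```

The drain step is where the bulk free would happen, so the per-skb
cost of the return path is one list insertion.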

Third, the skb->napi_id field is unused after setting sk->sk_napi_id
on sk enqueue, so the BH cpu could be stored here after that,
essentially extending sk_incoming_cpu to unconnected sockets.
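
As a sketch of this third option, again in user space with
hypothetical reuse_* names (bh_cpu in particular is an invented
field name, not an existing one): the napi id is copied to the socket
on enqueue, after which the same storage in the skb holds the BH cpu.

```c
#include <assert.h>

struct reuse_sock {
    unsigned int napi_id;       /* models sk->sk_napi_id */
};

struct reuse_skb {
    /* Same storage, two phases: napi id before socket enqueue,
     * BH cpu afterwards. */
    union {
        unsigned int napi_id;   /* models skb->napi_id */
        unsigned int bh_cpu;    /* hypothetical name */
    };
};

/* Models enqueue to the socket on the cpu running the BH. */
void reuse_sk_enqueue(struct reuse_sock *sk, struct reuse_skb *skb,
                      unsigned int bh_cpu)
{
    sk->napi_id = skb->napi_id; /* last reader of the napi id */
    skb->bh_cpu = bh_cpu;       /* the field is free to reuse now */
}
```

Because the cpu is stored per skb rather than per socket, the lookup
does not depend on the socket being connected.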