Date:   Fri, 20 Apr 2018 15:48:36 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     brouer@...hat.com, Paolo Abeni <pabeni@...hat.com>,
        netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs
 cache


On Thu, 19 Apr 2018 06:47:10 -0700 Eric Dumazet <eric.dumazet@...il.com> wrote:
> On 04/19/2018 12:40 AM, Paolo Abeni wrote:
> > On Wed, 2018-04-18 at 12:21 -0700, Eric Dumazet wrote:  
> >> On 04/18/2018 10:15 AM, Paolo Abeni wrote:
[...]
> > 
> > Any suggestions for better results are more than welcome!  
> 
> Yes, remote skb freeing. I mentioned this idea to Jesper and Tariq in
> Seoul (netdev conference). Not tied to UDP, but a generic solution.

Yes, I remember.  I think it was the idea where you basically wanted
to queue SKBs back to the CPU that allocated them, right?

Freeing an SKB on the same CPU that allocated it has multiple
advantages: (1) the SLUB allocator can use a non-atomic
"cpu-local" (double)cmpxchg, (2) the 4 cache-lines of the SKB that
were memset-cleared stay local, and (3) the atomic SKB refcnt/users
stays local.

We just have to make sure that the mechanism for queuing SKBs back
doesn't cost more than the operations we expect to save.  Bulk transfer
is an obvious approach.  For storing SKBs until they are returned, we
already have a fast mechanism: see napi_consume_skb calling
_kfree_skb_defer, which uses SLUB/SLAB bulk free to amortize cost (1).
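
To make the "store, then bulk free" pattern concrete, here is a minimal
user-space sketch.  It is not the kernel code: struct defer_cache,
defer_free(), defer_cache_flush() and the batch size of 64 are all
names and values assumed for illustration, and free() stands in for
the slab bulk-free call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFER_CACHE_SIZE 64	/* assumed batch size for this sketch */

/* Hypothetical stand-in for a per-CPU deferred-free cache. */
struct defer_cache {
	void  *objs[DEFER_CACHE_SIZE];
	size_t count;		/* objects currently parked */
	size_t bulk_calls;	/* number of batch flushes so far */
};

/* Release everything parked in the cache in one pass. */
static void defer_cache_flush(struct defer_cache *c)
{
	if (c->count == 0)
		return;
	for (size_t i = 0; i < c->count; i++)
		free(c->objs[i]);	/* stand-in for one bulk free */
	c->count = 0;
	c->bulk_calls++;
}

/* Park one object; flush as a batch once the cache is full. */
static void defer_free(struct defer_cache *c, void *obj)
{
	assert(c->count < DEFER_CACHE_SIZE);
	c->objs[c->count++] = obj;
	if (c->count == DEFER_CACHE_SIZE)
		defer_cache_flush(c);
}
```

The point of the sketch is only that the per-object cost is an array
store, and the expensive part runs once per batch.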

I guess the missing information is that we don't know which CPU the SKB
was created on...

Where to store this CPU info?

(a) In struct sk_buff, in a cache-line that is already read on the
remote CPU in the UDP code?

(b) In struct page: as the SLUB allocator hands out objects/SKBs on a
per-page basis, we could have SLUB store a hint about the CPU it was
allocated on, and bet on returning to that CPU?  (It might be bad to
read the struct-page cache-line.)
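
As a rough illustration of option (a) combined with bulk transfer, here
is a user-space sketch, under the assumption that each buffer carries
the id of the CPU that allocated it.  Everything in it is hypothetical
(struct buf, struct return_queues, buf_mark_alloc(), buf_free_on(), the
CPU count and the batch size); it only models the bookkeeping, not the
actual cross-CPU hand-off.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* assumed CPU count */
#define RETURN_BATCH   16	/* assumed batch size */

/* Hypothetical buffer that remembers which CPU allocated it. */
struct buf {
	int alloc_cpu;
};

/* Buffers freed on one CPU, sorted by the CPU they should go back to. */
struct return_queues {
	struct buf *batch[NR_CPUS_SKETCH][RETURN_BATCH];
	size_t      count[NR_CPUS_SKETCH];
	size_t      local_frees;		   /* freed on the allocating CPU */
	size_t      bulk_returns[NR_CPUS_SKETCH]; /* full batches handed back */
};

/* Record the allocating CPU at allocation time. */
static void buf_mark_alloc(struct buf *b, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	b->alloc_cpu = cpu;
}

/* Free b while running on this_cpu: local frees go straight through,
 * remote ones are parked until a full batch can be returned at once. */
static void buf_free_on(struct return_queues *q, int this_cpu, struct buf *b)
{
	int home = b->alloc_cpu;

	if (home == this_cpu) {
		q->local_frees++;
		return;
	}
	q->batch[home][q->count[home]++] = b;
	if (q->count[home] == RETURN_BATCH) {
		/* A real implementation would hand the batch to 'home' here. */
		q->count[home] = 0;
		q->bulk_returns[home]++;
	}
}
```

With this layout the freeing CPU only reads one extra field per buffer,
and the cross-CPU traffic happens once per RETURN_BATCH buffers.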

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
