Date: Fri, 18 Aug 2023 14:32:47 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Matthew Wilcox <willy@...radead.org>,
 Jesper Dangaard Brouer <hawk@...nel.org>, Vlastimil Babka <vbabka@...e.cz>
Cc: brouer@...hat.com, netdev@...r.kernel.org, vbabka@...e.cz,
 Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, linux-mm@...ck.org,
 Andrew Morton <akpm@...ux-foundation.org>,
 Mel Gorman <mgorman@...hsingularity.net>, Christoph Lameter <cl@...ux.com>,
 roman.gushchin@...ux.dev, dsterba@...e.com
Subject: Re: [PATCH net] net: use SLAB_NO_MERGE for kmem_cache
 skbuff_head_cache



On 15/08/2023 17.53, Matthew Wilcox wrote:
> On Tue, Aug 15, 2023 at 05:17:36PM +0200, Jesper Dangaard Brouer wrote:
>> For the bulk API to perform efficiently, the slab fragmentation needs
>> to be low. Especially for the SLUB allocator, the efficiency of the
>> bulk free API depends on objects belonging to the same slab (page).
> 
> Hey Jesper,
> 
> You probably haven't seen this patch series from Vlastimil:
> 
> https://lore.kernel.org/linux-mm/20230810163627.6206-9-vbabka@suse.cz/
> 
> I wonder if you'd like to give it a try?  It should provide some immunity
> to this problem, and might even be faster than the current approach.
> If it isn't, it'd be good to understand why, and if it could be improved.
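
For context, the bulk API referenced above is the kmem_cache_alloc_bulk()
/ kmem_cache_free_bulk() pair. A minimal sketch of the usage pattern;
the caller, cache, and batch size are hypothetical, not the actual
skbuff code:

#include <linux/slab.h>

#define BATCH 16

/* Hypothetical caller: one kmem_cache_free_bulk() call hands SLUB the
 * whole batch, and detaching the batch is cheap only when the pointers
 * mostly share a slab page, which is the fragmentation concern above.
 */
static int demo_bulk(struct kmem_cache *cache)
{
	void *objs[BATCH];
	int n;

	/* Returns the number of objects actually allocated (0 on failure). */
	n = kmem_cache_alloc_bulk(cache, GFP_KERNEL, BATCH, objs);
	if (!n)
		return -ENOMEM;

	/* ... use objs[0..n-1] ... */

	kmem_cache_free_bulk(cache, n, objs);
	return 0;
}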

I took a quick look at:
 - https://lore.kernel.org/linux-mm/20230810163627.6206-11-vbabka@suse.cz/#Z31mm:slub.c

To Vlastimil: sorry, but I don't think this spin_lock approach will be
faster than SLUB's normal fast path using this_cpu_cmpxchg.

My experience is that SLUB's this_cpu_cmpxchg trick is faster than spin_lock.

On my testlab CPU (E5-1650 v4 @ 3.60GHz):
  - spin_lock+unlock : 34 cycles(tsc) 9.485 ns
  - this_cpu_cmpxchg :  5 cycles(tsc) 1.585 ns
  - locked cmpxchg   : 18 cycles(tsc) 5.006 ns
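
These numbers come from a small kernel-module microbenchmark. A minimal
sketch of the technique, assuming get_cycles() is a cheap TSC read; this
is not my actual time_bench code, and the loop count and reporting are
illustrative:

#include <linux/init.h>
#include <linux/math64.h>
#include <linux/module.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <linux/timex.h>

static DEFINE_SPINLOCK(bench_lock);
static DEFINE_PER_CPU(unsigned long, bench_var);

#define LOOPS 1000000U

static int __init bench_init(void)
{
	unsigned long old = 0;
	cycles_t start, stop;
	unsigned int i;

	preempt_disable();

	/* Cost of an uncontended spin_lock+unlock pair. */
	start = get_cycles();
	for (i = 0; i < LOOPS; i++) {
		spin_lock(&bench_lock);
		spin_unlock(&bench_lock);
	}
	stop = get_cycles();
	pr_info("spin_lock+unlock: %llu cycles/op\n",
		div_u64(stop - start, LOOPS));

	/* Cost of a per-CPU cmpxchg (no lock prefix on x86). */
	start = get_cycles();
	for (i = 0; i < LOOPS; i++)
		old = this_cpu_cmpxchg(bench_var, old, old + 1);
	stop = get_cycles();
	pr_info("this_cpu_cmpxchg: %llu cycles/op\n",
		div_u64(stop - start, LOOPS));

	preempt_enable();

	return -ENODEV; /* one-shot benchmark: report and refuse to load */
}
module_init(bench_init);
MODULE_LICENSE("GPL");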

SLUB does use a cmpxchg_double, which I don't have a microbench for.
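
The double-width variant matters because a single-word freelist swap is
ABA-prone when the task migrates CPUs between the read and the cmpxchg;
SLUB pairs the freelist pointer with a transaction id and swaps both at
once. A simplified single-word sketch of the lockless pattern (names are
illustrative, not the real struct kmem_cache_cpu):

#include <linux/percpu.h>

static DEFINE_PER_CPU(void *, cpu_freelist);

static void *fastpath_alloc(void)
{
	void *object, *next;

	do {
		object = this_cpu_read(cpu_freelist);
		if (!object)
			return NULL;		/* fall back to slow path */
		next = *(void **)object;	/* freelist link lives in object */
	} while (this_cpu_cmpxchg(cpu_freelist, object, next) != object);

	return object;
}

The real fast path also rechecks the transaction id to detect migration,
which this sketch skips.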

> No objection to this patch going in for now, of course.
> 

