Message-ID: <56FEF2F8.7010508@redhat.com>
Date:	Fri, 1 Apr 2016 15:15:20 -0700
From:	Laura Abbott <labbott@...hat.com>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:	Christoph Lameter <cl@...ux.com>,
	Pekka Enberg <penberg@...nel.org>,
	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Laura Abbott <labbott@...oraproject.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Kees Cook <keescook@...omium.org>
Subject: Re: [RFC][PATCH] mm/slub: Skip CPU slab activation when debugging

On 03/31/2016 07:35 PM, Joonsoo Kim wrote:
> On Mon, Mar 28, 2016 at 03:53:01PM -0700, Laura Abbott wrote:
>> The per-cpu slab is designed to be the primary path for allocation in SLUB
>> since it is assumed allocations will go through the fast path if possible.
>> When debugging is enabled, the fast path is disabled and per-cpu
>> allocations are not used. The current debugging code path still activates
>> the cpu slab for allocations and then immediately deactivates it. This
>> is useless work. When a slab is enabled for debugging, skip cpu
>> activation.
>>
>> Signed-off-by: Laura Abbott <labbott@...oraproject.org>
>> ---
>> This is a follow-on to the optimization of the debug paths for poisoning.
>> With this I get a ~2 second drop on hackbench -g 20 -l 1000 with slub_debug=P
>> and no noticeable change with slub_debug=- .
>
> I'd like to know the performance difference between slub_debug=P and
> slub_debug=- with this change.
>

With the hackbench benchmark (times in seconds):

slub_debug=- 6.834
slub_debug=P 8.059

So a ~1.2 second difference.

> Although this patch increases hackbench performance, I'm not sure it's
> sufficient for a production system. Concurrent slab allocation requests
> will contend for the node lock on every allocation attempt. So, there would
> be other use-cases where the performance drop due to slub_debug=P cannot be
> accepted even if it is a security feature.
>

Hmmm, I hadn't considered that :-/

> How about allowing the cpu partial list for debug cases?
> It will not hurt the fast path and will cause less contention on the node
> lock.
>

That helps more than this patch! It brings slub_debug=P down to 7.535
with the same relaxation of the CMPXCHG restrictions (allow the partials
with poison or redzoning, restrict otherwise).
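
As a rough check on the numbers quoted in this thread, the overhead of
slub_debug=P relative to slub_debug=- can be worked out as below. This is
only a sketch of the arithmetic: the variable and function names are made up
for illustration, and it assumes the 8.059 figure is the run with this patch
applied and that all times are hackbench results in seconds.

```python
# Hackbench times in seconds as quoted in this thread; lower is better.
baseline = 6.834        # slub_debug=-
poison_patch = 8.059    # slub_debug=P, assumed to be with this patch applied
poison_partial = 7.535  # slub_debug=P with the cpu partial list allowed


def overhead(t, base=baseline):
    """Return (absolute seconds, percent) slowdown of t relative to base."""
    delta = t - base
    return delta, 100.0 * delta / base


gap_patch, pct_patch = overhead(poison_patch)
gap_partial, pct_partial = overhead(poison_partial)

# Share of the patch-only gap that the cpu partial list approach removes.
recovered = 100.0 * (gap_patch - gap_partial) / gap_patch

print(f"patch only:       +{gap_patch:.3f}s ({pct_patch:.1f}%)")
print(f"cpu partial list: +{gap_partial:.3f}s ({pct_partial:.1f}%)")
print(f"gap recovered:    {recovered:.1f}%")
```

These are single quoted runs, so the percentages are indicative at best.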

It still seems unfortunate that deactivate_slab takes up so much of the
time spent in __slab_alloc. I'll give some more thought to skipping
the CPU slab activation with the cpu partial list.

> Thanks.
>

Thanks,
Laura
