Message-ID: <d039e8e0-1d16-1fe5-c06b-1911b9dad323@linux.alibaba.com>
Date: Mon, 10 Aug 2020 19:56:30 +0800
From: xunlei <xlpang@...ux.alibaba.com>
To: Pekka Enberg <penberg@...il.com>,
Christopher Lameter <cl@...ux.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Wen Yang <wenyang@...ux.alibaba.com>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Roman Gushchin <guro@...com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial
objects
On 2020/8/8 1:28 AM, Pekka Enberg wrote:
> Hi Christopher,
>
> On Fri, 7 Aug 2020, Pekka Enberg wrote:
>>> I think we can just default to the counters. After all, if I
>>> understood correctly, we're talking about up to 100 ms time period
>>> with IRQs disabled when count_partial() is called. As this is
>>> triggerable from user space, that's a performance bug whatever way you
>>> look at it.
>
> On Fri, Aug 7, 2020 at 4:02 PM Christopher Lameter <cl@...ux.com> wrote:
>> Well, yes, under extreme conditions, and this only happens for sysfs
>> counter retrieval.
>
> You will likely get some stall even in less extreme conditions, and in
> any case, the kernel should not allow user space to trigger such a
> stall.
>
Yes, agreed. This problem has been causing a lot of trouble for us and
for other people, and it should be fixed.
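For context, the stall comes from count_partial() walking the entire
per-node partial list with n->list_lock held and interrupts disabled.
Simplified from mm/slub.c:

/*
 * The whole partial list is traversed with IRQs off, so the stall
 * grows linearly with the list length.
 */
static unsigned long count_partial(struct kmem_cache_node *n,
				   int (*get_count)(struct page *))
{
	unsigned long flags;
	unsigned long x = 0;
	struct page *page;

	spin_lock_irqsave(&n->list_lock, flags);
	list_for_each_entry(page, &n->partial, slab_list)
		x += get_count(page);	/* e.g. count_free() */
	spin_unlock_irqrestore(&n->list_lock, flags);
	return x;
}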
Either my approach or the one proposed by Vlastimil Babka [1] would be
better than doing nothing.
[1]:
https://lore.kernel.org/linux-mm/158860845968.33385.4165926113074799048.stgit@buzz/
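As a minimal sketch of the counter idea (the field and helper names
below are illustrative, not necessarily what either patch uses), a
per-node count is maintained everywhere a page enters or leaves the
partial list, so the sysfs read becomes O(1):

/* Sketch only: partial_free_objs is a hypothetical field in
 * struct kmem_cache_node, updated under n->list_lock. */
static inline void __add_partial_objects(struct kmem_cache_node *n,
					 struct page *page)
{
	lockdep_assert_held(&n->list_lock);
	n->partial_free_objs += page->objects - page->inuse;
}

static inline void __remove_partial_objects(struct kmem_cache_node *n,
					    struct page *page)
{
	lockdep_assert_held(&n->list_lock);
	n->partial_free_objs -= page->objects - page->inuse;
}

The sysfs/slabinfo side then reads the counter instead of calling
count_partial(). Note that page->inuse also changes while a page sits
on the partial list, in __slab_free(), which is the hot path discussed
below.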
> On Fri, Aug 7, 2020 at 4:02 PM Christopher Lameter <cl@...ux.com> wrote:
>> There could be other solutions to this. This solution here is penalizing
>> every hot-path slab allocation for the sake of relatively infrequently
>> used counter monitoring. There is the possibility of not traversing the
>> list and simply estimating the value based on the number of slab pages
>> allocated on that node.
>
> Why do you consider this to be a fast path? This is all partial list
> accounting when we allocate/deallocate a slab, no? Just like
> ___slab_alloc() says, I assumed this to be the slow path... What am I
> missing?
The only hot path involved is __slab_free(). I've made an extra patch
that uses a percpu counter to avoid the potential performance
degradation, and will send v2 out for review.
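Roughly, the percpu idea looks like this (names are again illustrative,
not the literal v2 patch): __slab_free() only touches a local-CPU
counter, and the infrequent reader pays for the summation:

/* Hypothetical percpu counter hung off struct kmem_cache_node,
 * allocated with alloc_percpu(unsigned long):
 *	unsigned long __percpu *partial_free_objs;
 */

/* Hot path (__slab_free()): a local-CPU add, no shared atomics and
 * no cacheline bouncing between CPUs. */
static inline void partial_free_add(struct kmem_cache_node *n, long delta)
{
	this_cpu_add(*n->partial_free_objs, delta);
}

/* Cold path (sysfs read): sum over all CPUs. Individual per-CPU
 * values may wrap because incs and decs can land on different CPUs,
 * but the modular sum of the unsigned values is still the true
 * total. */
static unsigned long partial_free_read(struct kmem_cache_node *n)
{
	unsigned long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		sum += *per_cpu_ptr(n->partial_free_objs, cpu);
	return sum;
}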
>
> No objections to alternative fixes, of course, but wrapping the
> counters under CONFIG_DEBUG seems like just hiding the actual issue...
>
> - Pekka
>
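Regarding the estimation Christopher suggested above: something along
these lines would avoid the traversal entirely, trading accuracy for
speed (illustrative only; the half-full assumption is arbitrary):

/* Estimate free objects in partial slabs from the already-maintained
 * n->nr_partial, assuming partial slabs are on average half used.
 * No list walk and no list_lock, but only an approximation. */
static unsigned long estimate_partial_free(struct kmem_cache *s,
					   struct kmem_cache_node *n)
{
	return READ_ONCE(n->nr_partial) * oo_objects(s->oo) / 2;
}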