linux-kernel - Re: [RFC 09/26] mm, slub: move disabling/enabling irqs to ___slab

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f2e9187a-dea8-ef55-b815-9ac295b46919@suse.cz>
Date:   Tue, 25 May 2021 14:47:10 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Christoph Lameter <cl@...ux.com>,
        David Rientjes <rientjes@...gle.com>,
        Pekka Enberg <penberg@...nel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>
Subject: Re: [RFC 09/26] mm, slub: move disabling/enabling irqs to
 ___slab_alloc()

On 5/25/21 2:35 PM, Mel Gorman wrote:
> On Tue, May 25, 2021 at 01:39:29AM +0200, Vlastimil Babka wrote:
>> Currently __slab_alloc() disables irqs around the whole ___slab_alloc().  This
>> includes cases where this is not needed, such as when the allocation ends up in
>> the page allocator and has to awkwardly enable irqs back based on gfp flags.
>> Also the whole kmem_cache_alloc_bulk() is executed with irqs disabled even when
>> it hits the __slab_alloc() slow path, and long periods with disabled interrupts
>> are undesirable.
>> 
>> As a first step towards reducing irq disabled periods, move irq handling into
>> ___slab_alloc(). Callers will instead prevent the s->cpu_slab percpu pointer
>> from becoming invalid via migrate_disable(). This does not protect against
>> access preemption, which is still done by disabled irq for most of
>> ___slab_alloc(). As the small immediate benefit, slab_out_of_memory() call from
>> ___slab_alloc() is now done with irqs enabled.
>> 
>> kmem_cache_alloc_bulk() disables irqs for its fastpath and then re-enables them
>> before calling ___slab_alloc(), which then disables them at its discretion. The
>> whole kmem_cache_alloc_bulk() operation also disables cpu migration.
>> 
>> When  ___slab_alloc() calls new_slab() to allocate a new page, re-enable
>> preemption, because new_slab() will re-enable interrupts in contexts that allow
>> blocking.
>> 
>> The patch itself will thus increase overhead a bit due to disabled migration
>> and increased disabling/enabling irqs in kmem_cache_alloc_bulk(), but that will
>> be gradually improved in the following patches.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
> 
> Why did you use migrate_disable instead of preempt_disable? There is a
> fairly large comment in include/linux/preempt.h on why migrate_disable
> is undesirable so new users are likely to be put under the microscope
> once Thomas or Peter notice it.

I understood it as while undesirable, there's nothing better for now.

> I think you are using it so that an allocation request can be preempted by
> a higher priority task but given that the code was disabling interrupts,
> there was already some preemption latency.

Yes, and the disabled interrupts will get progressively "smaller" in the series.

> However, migrate_disable
> is more expensive than preempt_disable (function call versus a simple
> increment).

That's true, I think perhaps it could be reimplemented so that on !PREEMPT_RT
and with no lockdep/preempt/whatnot debugging it could just translate to an
inline migrate_disable?

> On that basis, I'd recommend starting with preempt_disable
> and only using migrate_disable if necessary.

That's certainly possible and you're right it would be a less disruptive step.
My thinking was that on !PREEMPT_RT it's actually just preempt_disable (however
with the call overhead currently), but PREEMPT_RT would welcome the lack of
preempt disable. I'd be interested to hear RT guys opinion here.

> Bonus points for adding a comment where ___slab_alloc disables IRQs to
> clarify what is protected -- I assume it's protecting kmem_cache_cpu
> from being modified from interrupt context. If so, it's potentially a
> local_lock candidate.

Yeah that gets cleared up later :)