lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <390fc59e-17ed-47eb-48ff-8dae93c9a718@suse.cz>
Date:   Mon, 14 Jun 2021 13:33:43 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Christoph Lameter <cl@...ux.com>,
        David Rientjes <rientjes@...gle.com>,
        Pekka Enberg <penberg@...nel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>
Subject: Re: [RFC v2 33/34] mm, slub: use migrate_disable() on PREEMPT_RT

On 6/14/21 1:16 PM, Sebastian Andrzej Siewior wrote:
> On 2021-06-14 13:07:14 [+0200], Vlastimil Babka wrote:
>> > +#ifdef CONFIG_PREEMPT_RT
>> > +#define slub_get_cpu_ptr(var)	get_cpu_ptr(var)
>> > +#define slub_put_cpu_ptr(var)	put_cpu_ptr(var)
>> 
>> After Mel's report and bisect pointing to this patch, I realized I got the
>> #ifdef wrong and it should be #ifnded
> 
> So if you got the ifdef wrong (and kept everything as-is) then you
> tested the RT version on !RT. migrate_disable() behaves on !RT as on RT.
> As per changelog you don't use migrate_disable() unconditionally because
> it increases the overhead on !RT.

Correct.

> I haven't looked at the series and I have just this tiny question: why
> did migrate_disable() crash for Mel on !RT and why do you expect that it
> does not happen on PREEMPT_RT?

Right, so it's because __slab_alloc() has this optimization to avoid re-reading
'c' in case there is no preemption enabled at all (or it's just voluntary).

#ifdef CONFIG_PREEMPTION
        /*
         * We may have been preempted and rescheduled on a different
         * cpu before disabling preemption. Need to reload cpu area
         * pointer.
         */
        c = slub_get_cpu_ptr(s->cpu_slab);
#endif

Mel's config has CONFIG_PREEMPT_VOLUNTARY, which means CONFIG_PREEMPTION is not
enabled.

But then later in ___slab_alloc() we have

        slub_put_cpu_ptr(s->cpu_slab);
        page = new_slab(s, gfpflags, node);
        c = slub_get_cpu_ptr(s->cpu_slab);

And this is not hidden under CONFIG_PREEMPTION, so with the #ifdef bug the
slub_put_cpu_ptr did a migrate_enable() with Mel's config, without prior
migrate_disable().

If there wasn't the #ifdef PREEMPT_RT bug:
- this slub_put_cpu_ptr() would translate to put_cpu_ptr() thus
preempt_enable(), which on this config is just a barrier(), so it doesn't matter
that there was no matching preempt_disable() before.
- with PREEMPT_RT the CONFIG_PREEMPTION would be enabled, so the
slub_get_cpu_ptr() would do a migrate_disable() and there's no imbalance.

But now that I dig into this in detail, I can see there might be another
instance of this imbalance bug, if CONFIG_PREEMPTION is disabled, but
CONFIG_PREEMPT_COUNT is enabled, which seems to be possible in some debug
scenarios. Because then preempt_disable()/preempt_enable() still manipulate the
preempt counter and compiling them out in __slab_alloc() will cause imbalance.

So I think the guards in __slab_alloc() should be using CONFIG_PREEMPT_COUNT
instead of CONFIG_PREEMPT to be correct on all configs. I dare not remove them
completely :)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ