Message-ID: <874kp6llzb.fsf@nanos.tec.linutronix.de>
Date:   Thu, 13 Aug 2020 15:22:00 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Uladzislau Rezki <urezki@...il.com>, Michal Hocko <mhocko@...e.com>
Cc:     paulmck@...nel.org, Uladzislau Rezki <urezki@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Matthew Wilcox <willy@...radead.org>,
        "Theodore Y . Ts'o" <tytso@....edu>,
        Joel Fernandes <joel@...lfernandes.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

Uladzislau Rezki <urezki@...il.com> writes:
> On Thu, Aug 13, 2020 at 09:50:27AM +0200, Michal Hocko wrote:
>> On Wed 12-08-20 02:13:25, Thomas Gleixner wrote:
>> [...]
>> > I can understand your rationale and what you are trying to solve. So, if
>> > we can actually have a distinct GFP variant:
>> > 
>> >   GFP_I_ABSOLUTELY_HAVE_TO_DO_THAT_AND_I_KNOW_IT_CAN_FAIL_EARLY
>> 
>> Even if we cannot make the zone->lock raw I would prefer to not
>> introduce a new gfp flag. Well we can do an alias for easier grepping
>> #define GFP_RT_SAFE	0

Just using 0 is sneaky but yes, that's fine :)

Bikeshedding: GFP_RT_NOWAIT or such might be more obvious.

>> that would imply nowait semantic and would exclude waking up kswapd as
>> well. If we can make wake up safe under RT then the alias would reflect
>> that without any code changes.

That basically requires converting the wait queue to something else. Is
the waitqueue strictly single waiter?
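
A minimal userspace sketch of what an alias with the value 0 implies, not kernel code. The `SKETCH_*` names and bit values are hypothetical stand-ins for the reclaim bits in include/linux/gfp.h, and the model assumes that GFP_NOWAIT consists of only the kswapd-wakeup bit. With no bits set, the mask asks for neither direct reclaim nor a kswapd wakeup:

```c
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_DIRECT_RECLAIM 0x400u
#define SKETCH_KSWAPD_RECLAIM 0x800u

/* Assumed model of GFP_NOWAIT: may wake kswapd, never blocks. */
#define SKETCH_GFP_NOWAIT     (SKETCH_KSWAPD_RECLAIM)
/* The alias from the thread: plain 0, named so it is easy to grep for. */
#define SKETCH_GFP_RT_NOWAIT  0u

/* True if the mask allows the caller to enter direct reclaim. */
static bool sketch_may_block(unsigned int gfp)
{
	return (gfp & SKETCH_DIRECT_RECLAIM) != 0;
}

/* True if the mask allows waking kswapd. */
static bool sketch_may_wake_kswapd(unsigned int gfp)
{
	return (gfp & SKETCH_KSWAPD_RECLAIM) != 0;
}
```

If the kswapd wakeup were made safe under RT, only the alias definition would need to change in this model; the callers would stay as they are.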

>> The second, and the more important part, would be to bail out anytime
>> the page allocator is to take a lock which is not allowed in the current
>> RT context. Something like

>> +	/*
>> +	 * Hard atomic contexts are not supported by the allocator for
>> +	 * anything but pcp requests
>> +	 */
>> +	if (!preemtable())

If you make that preemptible() it might even compile, but that still won't
work because if CONFIG_PREEMPT_COUNT=n then preemptible() is always
false.

So that should be:

	if (!preemptible() && gfp == GFP_RT_NOWAIT)

which is limiting the damage to those callers which hand in
GFP_RT_NOWAIT.

lockdep will yell at invocations with gfp != GFP_RT_NOWAIT when it hits
zone->lock in the wrong context. And we want to know about that so we
can look at the caller and figure out how to solve it.
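
A minimal userspace sketch of the check discussed above, not kernel code. All `sketch_*` names and the flag values are hypothetical. The model assumes that preemptible() is a constant 0 without CONFIG_PREEMPT_COUNT, and that it otherwise requires a zero preempt count with interrupts enabled:

```c
#include <stdbool.h>

/* Hypothetical values: the alias is 0, the other mask is any non-zero one. */
#define SKETCH_GFP_RT_NOWAIT 0u
#define SKETCH_GFP_OTHER     0x800u

/*
 * Model of preemptible(). count_enabled stands in for
 * CONFIG_PREEMPT_COUNT; without it the result is always false.
 */
static bool sketch_preemptible(bool count_enabled, int preempt_count,
			       bool irqs_disabled)
{
	if (!count_enabled)
		return false;
	return preempt_count == 0 && !irqs_disabled;
}

/*
 * Bail out before taking zone->lock only for callers that explicitly
 * passed the RT-safe mask. Every other caller proceeds, so lockdep can
 * report a lock taken in the wrong context.
 */
static bool sketch_must_bail_out(bool preemptible, unsigned int gfp)
{
	return !preemptible && gfp == SKETCH_GFP_RT_NOWAIT;
}
```

In this model a kernel without preempt counting always bails out for the RT-safe mask, which limits such callers to what the pcp lists can provide, while callers with any other mask are unaffected.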

>> > The page allocator allocations should also have a limit on the number of
>> > pages and eventually also page order (need to stare at the code or let
>> > Michal educate me that the order does not matter).
>> 
>> In practice anything but order 0 is out of question because we need
>> zone->lock for that currently. Maybe we can introduce pcp lists for
>> higher orders in the future - I have a vague recollection Mel was
>> playing with that some time ago.

Ok.
 
>> > To make it consistent the same GFP_ variant should allow the slab
>> > allocator go to the point where the slab cache is exhausted.
>> > 
>> > Having a distinct and clearly defined GFP_ variant is really key to
>> > chase down offenders and to make reviewers double check upfront why this
>> > is absolutely required.
>> 
>> Having a high level and recognizable gfp mask is OK but I would really
>> like not to introduce a dedicated flag. The page allocator should be
>> able to recognize the context which cannot be handled.

The GFP_xxx == 0 is perfectly fine.

> Sorry for jumping in. We can rely on preemptable() for sure, if CONFIG_PREEMPT_RT
> is enabled, something like below:
>
> if (IS_ENABLED_RT && preemptebale())

Ha, you morphed preemtable() into preemptebale() which will not compile
either :)

Thanks,

        tglx