Message-ID: <20201002080624.GB3227@techsingularity.net>
Date: Fri, 2 Oct 2020 09:06:24 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: Uladzislau Rezki <urezki@...il.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Michal Hocko <mhocko@...e.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Theodore Y . Ts'o" <tytso@....edu>,
Joel Fernandes <joel@...lfernandes.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>
Subject: Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.
On Thu, Oct 01, 2020 at 09:26:26PM +0200, Uladzislau Rezki wrote:
> >
> > No, I meant going back to idea of new gfp flag, but adjust the implementation in
> > the allocator (different from what you posted in previous version) so that it
> > only looks at the flag after it tries to allocate from pcplist and finds out
> > it's empty. So, no inventing of new page allocator entry points or checks such
> > as the one you wrote above, but adding the new gfp flag in a way that it doesn't
> > affect existing fast paths.
> >
> OK. Now i see. Please have a look below at the patch, so we fully understand
> each other. If that is something that is close to your view or not:
>
> <snip>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index c603237e006c..7e613560a502 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -39,8 +39,9 @@ struct vm_area_struct;
> #define ___GFP_HARDWALL 0x100000u
> #define ___GFP_THISNODE 0x200000u
> #define ___GFP_ACCOUNT 0x400000u
> +#define ___GFP_NO_LOCKS 0x800000u
> #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP 0x800000u
> +#define ___GFP_NOLOCKDEP 0x1000000u
> #else
> #define ___GFP_NOLOCKDEP 0
> #endif
> @@ -215,16 +216,22 @@ struct vm_area_struct;
> * %__GFP_COMP address compound page metadata.
> *
> * %__GFP_ZERO returns a zeroed page on success.
> + *
> + * %__GFP_NO_LOCKS order-0 allocation without sleepable-locks.
> + * It obtains a page from the per-cpu-list and considered as
> + * lock-less. No other actions are performed, thus it returns
> + * NULL if per-cpu-list is empty.
> */
> #define __GFP_NOWARN ((__force gfp_t)___GFP_NOWARN)
> #define __GFP_COMP ((__force gfp_t)___GFP_COMP)
> #define __GFP_ZERO ((__force gfp_t)___GFP_ZERO)
> +#define __GFP_NO_LOCKS ((__force gfp_t)___GFP_NO_LOCKS)
>
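To make sure we are reading the idea the same way: the check would sit
roughly where the sketch below puts it, i.e. at the point the per-cpu
list is found empty and a refill under zone->lock would otherwise
happen. This is only an illustration, not the posted patch; it assumes
gfp_flags gets plumbed down into __rmqueue_pcplist() and mirrors the
current shape of that helper in mm/page_alloc.c.

/*
 * Illustrative sketch only: roughly the existing __rmqueue_pcplist()
 * with the new check added at the point the per-cpu list turns out
 * to be empty.
 */
static struct page *__rmqueue_pcplist(struct zone *zone, gfp_t gfp_flags,
			int migratetype, unsigned int alloc_flags,
			struct per_cpu_pages *pcp, struct list_head *list)
{
	struct page *page;

	do {
		if (list_empty(list)) {
			/*
			 * Refilling the list means calling rmqueue_bulk(),
			 * which takes zone->lock, a sleeping lock on
			 * PREEMPT_RT. A __GFP_NO_LOCKS request gives up
			 * here and returns NULL instead.
			 */
			if (gfp_flags & __GFP_NO_LOCKS)
				return NULL;

			pcp->count += rmqueue_bulk(zone, 0, pcp->batch, list,
						   migratetype, alloc_flags);
			if (unlikely(list_empty(list)))
				return NULL;
		}

		page = list_first_entry(list, struct page, lru);
		list_del(&page->lru);
		pcp->count--;
	} while (check_new_pcp(page));

	return page;
}

With that shape, the flag never affects the common case where the
per-cpu list still has pages in it.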
I'm not a fan of the GFP flag approach, simply because we've had cases
before where GFP flags were used in inappropriate contexts, __GFP_MEMALLOC
being one example, and that led to a surprising number of bugs,
particularly in out-of-tree drivers but also in-tree ones. Of course,
GFP flag bits are in limited supply too, but at the very least the
comment should be as robust as possible. Maybe something like
 * %__GFP_NO_LOCKS attempts an order-0 allocation without taking any
 * sleepable locks, i.e. it tries to obtain a page without acquiring
 * any spinlocks. This should only be used in a context where the
 * caller holds a raw_spin_lock that cannot be released for the
 * allocation request. This may be necessary on PREEMPT_RT kernels,
 * where the holder of a raw_spin_lock (which does not sleep) must
 * not acquire a spin_lock, because spin_lock can sleep with
 * PREEMPT_RT. This should not be confused with GFP_ATOMIC contexts.
 * Like atomic allocation requests, there is no guarantee a page will
 * be returned and the caller must be able to deal with allocation
 * failures. The risk of allocation failure is higher than with
 * GFP_ATOMIC.
It's verbose, but it would be hard to misinterpret. I think we're going
to go through a period before people are familiar with PREEMPT_RT-related
hazards, and various comments that used to be true are going to be
misleading for a while.
For anyone reviewing, any use of __GFP_NO_LOCKS should meet a high
standard where there is no alternative except to use the flag, i.e. a
higher standard than "but I'm an important driver".
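To make the failure handling expectation concrete, a legitimate caller
would look something like the fragment below. The lock and the function
names are hypothetical, purely for illustration:

/*
 * Hypothetical caller, names made up for illustration: a raw_spin_lock
 * is held and cannot be dropped around the allocation, so no sleeping
 * lock may be taken and a NULL return must be handled.
 */
static DEFINE_RAW_SPINLOCK(example_lock);

static void example_refill(void)
{
	struct page *page;
	unsigned long flags;

	raw_spin_lock_irqsave(&example_lock, flags);

	page = alloc_page(__GFP_NO_LOCKS);
	if (!page) {
		/*
		 * The expected outcome whenever the per-cpu list is
		 * empty: fall back to a preallocated reserve or defer
		 * the work instead of retrying harder from here.
		 */
		raw_spin_unlock_irqrestore(&example_lock, flags);
		return;
	}

	/* ... use the page under the lock ... */

	raw_spin_unlock_irqrestore(&example_lock, flags);
}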
--
Mel Gorman
SUSE Labs