[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YWaYUsXgXS6GXM+M@dhcp22.suse.cz>
Date: Wed, 13 Oct 2021 10:26:58 +0200
From: Michal Hocko <mhocko@...e.com>
To: Dave Chinner <david@...morbit.com>
Cc: NeilBrown <neilb@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
"Darrick J. Wong" <djwong@...nel.org>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>, Jonathan Corbet <corbet@....net>,
linux-xfs@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org
Subject: Re: [PATCH 2/6] MM: improve documentation for __GFP_NOFAIL
On Wed 13-10-21 13:32:31, Dave Chinner wrote:
> On Mon, Oct 11, 2021 at 01:57:36PM +0200, Michal Hocko wrote:
> > On Sat 09-10-21 09:36:49, Dave Chinner wrote:
> > > On Fri, Oct 08, 2021 at 09:48:39AM +0200, Michal Hocko wrote:
> > > > > > > Even the API constaints of kvmalloc() w.r.t. only doing the vmalloc
> > > > > > > fallback if the gfp context is GFP_KERNEL - we already do GFP_NOFS
> > > > > > > kvmalloc via memalloc_nofs_save/restore(), so this behavioural
> > > > > > > restriction w.r.t. gfp flags just makes no sense at all.
> > > > > >
> > > > > > GFP_NOFS (without using the scope API) has the same problem as NOFAIL in
> > > > > > the vmalloc. Hence it is not supported. If you use the scope API then
> > > > > > you can GFP_KERNEL for kvmalloc. This is clumsy but I am not sure how to
> > > > > > define these conditions in a more sensible way. Special case NOFS if the
> > > > > > scope api is in use? Why do you want an explicit NOFS then?
> > >
> > > Exactly my point - this is clumsy and a total mess. I'm not asking
> > > for an explicit GFP_NOFS, just pointing out that the documented
> > > restrictions that "vmalloc can only do GFP_KERNEL allocations" is
> > > completely wrong.
> > >
> > > vmalloc()
> > > {
> > > if (!(gfp_flags & __GFP_FS))
> > > memalloc_nofs_save();
> > > p = __vmalloc(gfp_flags | GFP_KERNEL)
> > > if (!(gfp_flags & __GFP_FS))
> > > memalloc_nofs_restore();
> > > }
> > >
> > > Yup, that's how simple it is to support GFP_NOFS support in
> > > vmalloc().
> >
> > Yes, this would work from the functionality POV but it defeats the
> > philosophy behind the scope API. Why would you even need this if the
> > scope was defined by the caller of the allocator?
>
> Who actually cares that vmalloc might be using the scoped API
> internally to implement GFP_NOFS or GFP_NOIO? Nobody at all.
> It is far more useful (and self documenting!) for one-off allocations
> to pass a GFP_NOFS flag than it is to use a scope API...
I would agree with you if the explicit GFP_NOFS usage was consistent
and actually justified in the majority cases. My experience tells me
otherwise though. Many filesystems use the flag just because that is
easier. That leads to a huge overuse of the flag that leads to practical
problems.
I was hoping that if we offer an API that would define problematic
reclaim recursion scopes then it would reduce the abuse. I haven't
expected this to happen overnight but it is few years and it seems
it will not happen soon either.
[...]
> > > It also points out that the scope API is highly deficient.
> > > We can do GFP_NOFS via the scope API, but we can't
> > > do anything else because *there is no scope API for other GFP
> > > flags*.
> > >
> > > Why don't we have a GFP_NOFAIL/__GFP_RETRY_FOREVER scope API?
> >
> > NO{FS,IO} where first flags to start this approach. And I have to admit
> > the experiment was much less successful then I hoped for. There are
> > still thousands of direct NOFS users so for some reason defining scopes
> > is not an easy thing to do.
> >
> > I am not against NOFAIL scopes in principle but seeing the nofs
> > "success" I am worried this will not go really well either and it is
> > much more tricky as NOFAIL has much stronger requirements than NOFS.
> > Just imagine how tricky this can be if you just call a library code
> > that is not under your control within a NOFAIL scope. What if that
> > library code decides to allocate (e.g. printk that would attempt to do
> > an optimistic NOWAIT allocation).
>
> I already asked you that _exact_ question earlier in the thread
> w.r.t. kvmalloc(GFP_NOFAIL) using optimistic NOWAIT kmalloc
> allocation. I asked you as a MM expert to define *and document* the
> behaviour that should result, not turn around and use the fact that
> it is undefined behaviour as a "this is too hard" excuse for not
> changing anything.
Dave, you have "thrown" a lot of complains in previous emails and it is
hard to tell rants from features requests apart. I am sorry but I
believe it would be much more productive to continue this discussion if
you could mild your tone.
Can I ask you to break down your feature requests into separate emails
so that we can discuss and track them separately rather in this quite a
long thread which has IMHO diverghed from the initial topic. Thanks!
> THe fact is that the scope APIs are only really useful for certain
> contexts where restrictions are set by higher level functionality.
> For one-off allocation constraints the API sucks and we end up with
Could you be more specific about these one-off allocation constrains?
What would be the reason to define one-off NO{FS,IO} allocation
constrain? Or did you have your NOFAIL example in mind?
> crap like this (found in btrfs):
>
> /*
> * We're holding a transaction handle, so use a NOFS memory
> * allocation context to avoid deadlock if reclaim happens.
> */
> nofs_flag = memalloc_nofs_save();
> value = kmalloc(size, GFP_KERNEL);
> memalloc_nofs_restore(nofs_flag);
Yes this looks wrong indeed! If I were to review such a code I would ask
why the scope cannot match the transaction handle context. IIRC jbd does
that.
I am aware of these patterns. I was pulled in some discussions in the
past and in some it turned out that the constrain is not needed at all
and in some cases that has led to a proper scope definition. As you
point out in your other examples it just happens that it is easier to
go an easy path and define scopes ad-hoc to work around allocation
API limitations.
[...]
> IOWs, a large number of the users of the scope API simply make
> [k]vmalloc() provide GFP_NOFS behaviour. ceph_kvmalloc() is pretty
> much a wrapper that indicates how all vmalloc functions should
> behave. Honour GFP_NOFS and GFP_NOIO by using the scope API
> internally.
I was discouraging from this behavior at vmalloc level to push people
to use scopes properly - aka at the level where the reclaim recursion is
really a problem. If that is infeasible in practice then we can
re-evaluate of course. I was really hoping we can get rid of cargo cult
GFP_NOFS usage this way but the reality often disagrees with hopes.
All that being said, let's discuss [k]vmalloc constrains and usecases
that need changes in a separate email thread.
Thanks!
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists