linux-kernel - Re: [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160621042249.GA18870@cmpxchg.org>
Date:	Tue, 21 Jun 2016 00:22:49 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	Vlastimil Babka <vbabka@...e.cz>,
	Dave Chinner <david@...morbit.com>, linux-mm@...ck.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD

On Mon, Jun 20, 2016 at 10:08:56AM +0200, Michal Hocko wrote:
> On Fri 17-06-16 17:39:31, Johannes Weiner wrote:
> > On Fri, Jun 17, 2016 at 10:30:06PM +0200, Vlastimil Babka wrote:
> > > On 17.6.2016 20:22, Johannes Weiner wrote:
> [...]
> > > > - it allows !costly orders to fail
> > > > 
> > > > While 1. is obvious from the name, 2. is not. Even if we don't want
> > > > full-on fine-grained naming for every reclaim methodology and retry
> > > > behavior, those two things just shouldn't be tied together.
> > > 
> > > Well, if allocation is not allowed to fail, it's like trying "indefinitely hard"
> > > already. Telling it it should "try hard" then doesn't make any sense without
> > > also being able to fail.
> > 
> > I can see that argument, but it's really anything but obvious at the
> > callsite. Dave's response to Michal's patch was a good demonstration.
> > And I don't think adding comments fixes an unintuitive interface.
> 
> Yeah, I am aware of that. And it is unfortunate but a side effect of our
> !costly vs. costly difference in the default behavior. What I wanted
> to achieve was to have overrides for the default behavior (whatever it
> is). We already have two such flags and having something semantically in
> the middle sounds like a consistent way to me.

The "whatever it is" is the problem I'm having. It's one flag that
does two entirely orthogonal things, and it's quite reasonable for
somebody to want to change one behavior without the other.

"I can handle allocation failures, even when they are !costly"

has really nothing to do with

"I am ready to pay high a allocation latency to make costly
allocations succeed."

So while I understand that you want an effort-flag leveled somewhere
between NORETRY and NOFAIL, this looks more like a theoretical thing
than what existing callsites actually would want to use.

> > I think whether the best-effort behavior should be opt-in or opt-out,
> > or how fine-grained the latency/success control over the allocator
> > should be is a different topic. I'd prefer defaulting to reliability
> > and annotating low-latency requirements, but I can see TRY_HARD work
> > too. It just shouldn't imply MAY_FAIL.
> 
> It is always hard to change the default behavior without breaking
> anything. Up to now we had opt-in and as you can see there are not that
> many users who really wanted to have higher reliability. I guess this is
> because they just do not care and didn't see too many failures. The
> opt-out has also a disadvantage that we would need to provide a flag
> to tell to try less hard and all we have is NORETRY and that is way too
> easy. So to me it sounds like the opt-in fits better with the current
> usage.

For costly allocations, the presence of __GFP_NORETRY is exactly the
same as the absence of __GFP_REPEAT. So if we made __GFP_REPEAT the
default (and deleted the flag), the opt-outs would use __GFP_NORETRY
to restore their original behavior.

As for changing the default - remember that we currently warn about
allocation failures as if they were bugs, unless they are explicitely
allocated with the __GFP_NOWARN flag. We can assume that the current
__GFP_NOWARN sites are 1) commonly failing but 2) prefer to fall back
rather than incurring latency (otherwise they would have added the
__GFP_REPEAT flag). These sites would be a good list of candidates to
annotate with __GFP_NORETRY. If we made __GFP_REPEAT then the default,
the sites that would then try harder are the same sites that would now
emit page allocation failure warnings. These are rare, and the only
times I have seen them is under enough load that latency is shot to
hell anyway. So I'm not really convinced by the regression argument.

But that would *actually* clean up the flags, not make them even more
confusing:

Allocations that can't ever handle failure would use __GFP_NOFAIL.

Callers like XFS would use __GFP_MAYFAIL specifically to disable the
implicit __GFP_NOFAIL of !costly allocations.

Callers that would prefer falling back over killing and looping would
use __GFP_NORETRY.

Wouldn't that cover all usecases and be much more intuitive, both in
the default behavior as well as in the names of the flags?