linux-kernel - Re: [RFC PATCH 3/4] xfs: map KM_MAYFAIL to __GFP_RETRY

Message-ID: <20170308125431.GI11028@dhcp22.suse.cz>
Date:   Wed, 8 Mar 2017 13:54:31 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:     linux-mm@...ck.org, Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "Darrick J. Wong" <darrick.wong@...cle.com>
Subject: Re: [RFC PATCH 3/4] xfs: map KM_MAYFAIL to __GFP_RETRY_MAYFAIL

On Wed 08-03-17 20:23:37, Tetsuo Handa wrote:
> On 2017/03/08 0:48, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@...e.com>
> > 
> > KM_MAYFAIL didn't have any suitable GFP_FOO counterpart until recently
> > so it relied on the default page allocator behavior for the given set
> > of flags. This means that small allocations actually never failed.
> > 
> > Now that we have the __GFP_RETRY_MAYFAIL flag, which works independently
> > of the allocation request size, we can map KM_MAYFAIL to it. The allocator
> > will try as hard as it can to fulfill the request but eventually fails if
> > no progress can be made.
> > 
> > Cc: Darrick J. Wong <darrick.wong@...cle.com>
> > Signed-off-by: Michal Hocko <mhocko@...e.com>
> > ---
> >  fs/xfs/kmem.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> > index ae08cfd9552a..ac80a4855c83 100644
> > --- a/fs/xfs/kmem.h
> > +++ b/fs/xfs/kmem.h
> > @@ -54,6 +54,16 @@ kmem_flags_convert(xfs_km_flags_t flags)
> >  			lflags &= ~__GFP_FS;
> >  	}
> >  
> > +	/*
> > +	 * Default page/slab allocator behavior is to retry for ever
> > +	 * for small allocations. We can override this behavior by using
> > +	 * __GFP_RETRY_MAYFAIL which will tell the allocator to retry as long
> > +	 * as it is feasible but rather fail than retry for ever for all
> > +	 * request sizes.
> > +	 */
> > +	if (flags & KM_MAYFAIL)
> > +		lflags |= __GFP_RETRY_MAYFAIL;
> 
> I don't see advantages of supporting both __GFP_NORETRY and __GFP_RETRY_MAYFAIL.
> kmem_flags_convert() can always set __GFP_NORETRY because the callers use
> opencoded __GFP_NOFAIL loop (with possible allocation lockup warning) unless
> KM_MAYFAIL is set.

The behavior would be different (e.g. the OOM killer handling).
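
For readers following along, the mapping in the quoted hunk can be modelled in
userspace roughly as below. This is only a sketch: every DEMO_* name and bit
value is hypothetical and chosen for illustration, and the function mirrors
only the part of kmem_flags_convert() visible in the hunk, not the real
kernel definitions.

```c
/* Hypothetical types and bit values, for illustration only. The real
 * definitions live in the kernel headers and differ from these. */
typedef unsigned int demo_gfp_t;
typedef unsigned int demo_km_flags_t;

#define DEMO_KM_SLEEP           0x1u
#define DEMO_KM_NOSLEEP         0x2u
#define DEMO_KM_NOFS            0x4u
#define DEMO_KM_MAYFAIL         0x8u

#define DEMO_GFP_FS             0x01u
#define DEMO_GFP_RECLAIM        0x02u
#define DEMO_GFP_NOWARN         0x04u
#define DEMO_GFP_RETRY_MAYFAIL  0x08u

/* Translate filesystem-level KM_* style flags into allocator flags. */
static demo_gfp_t demo_flags_convert(demo_km_flags_t flags)
{
	demo_gfp_t lflags;

	if (flags & DEMO_KM_NOSLEEP) {
		/* Non-sleeping request: no reclaim, no filesystem recursion. */
		lflags = DEMO_GFP_NOWARN;
	} else {
		lflags = DEMO_GFP_RECLAIM | DEMO_GFP_FS | DEMO_GFP_NOWARN;
		if (flags & DEMO_KM_NOFS)
			lflags &= ~DEMO_GFP_FS;
	}

	/* The step the patch adds: a caller that tolerates failure asks the
	 * allocator to retry hard but give up eventually. */
	if (flags & DEMO_KM_MAYFAIL)
		lflags |= DEMO_GFP_RETRY_MAYFAIL;

	return lflags;
}
```

With this model, a DEMO_KM_SLEEP | DEMO_KM_MAYFAIL request carries the
retry-but-may-fail bit, while a plain DEMO_KM_SLEEP request does not.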

[...]
> line, which is likely always true); but this is off-topic for this thread.

Yes.

[...]

> where both __GFP_NORETRY and __GFP_RETRY_MAYFAIL are checked after
> direct reclaim and compaction failed. __GFP_RETRY_MAYFAIL optimistically
> retries if any of should_reclaim_retry(), should_compact_retry(), or
> read_mems_allowed_retry() returns true, or if mutex_trylock(&oom_lock) in
> __alloc_pages_may_oom() returns 0. If !__GFP_FS allocation requests are
> holding oom_lock each other, __GFP_RETRY_MAYFAIL allocation requests (which
> are likely !__GFP_FS allocation requests due to __GFP_FS allocation requests
> being blocked on direct reclaim) can be blocked for uncontrollable duration
> without making progress. It seems to me that the difference between
> __GFP_NORETRY and __GFP_RETRY_MAYFAIL is not useful. Rather, the caller can
> set __GFP_NORETRY and retry with any control (e.g. set __GFP_HIGH upon first
> timeout, give up upon second timeout).

You are drowning in implementation details here. Try to step back and think
about the high-level semantics I would like to achieve, which is
essentially a middle ground between __GFP_NORETRY, which doesn't retry,
and __GFP_NOFAIL, which retries for ever. I believe there are users who
could benefit from such semantics (the most prominent example is kvmalloc,
which has different modes of how hard to try kmalloc before giving up
and falling back to vmalloc).
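
The three semantics can be illustrated with a small userspace model. This is a
sketch under stated assumptions, not the page allocator's logic: the demo_*
names, the fake failure counter, and the "retry budget" standing in for
"progress can still be made" are all hypothetical, and malloc() merely stands
in for both the kmalloc attempt and the vmalloc fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical policies mirroring the three behaviours discussed above. */
enum demo_retry_policy {
	DEMO_NORETRY,       /* one attempt, then give up */
	DEMO_RETRY_MAYFAIL, /* retry while progress looks possible, then fail */
	DEMO_NOFAIL         /* keep retrying until an attempt succeeds */
};

/* Fake allocator state so the behaviour is deterministic. */
struct demo_ctx {
	int failures_left; /* attempts that fail before one succeeds */
	int retry_budget;  /* how many more times progress is reported */
	int attempts;      /* attempts made so far */
};

/* A single allocation attempt against the fake allocator. */
static void *demo_try_once(size_t size, struct demo_ctx *c)
{
	c->attempts++;
	if (c->failures_left > 0) {
		c->failures_left--;
		return NULL;
	}
	return malloc(size);
}

/* Stand-in for "reclaim/compaction can still make progress". */
static bool demo_can_progress(struct demo_ctx *c)
{
	if (c->retry_budget <= 0)
		return false;
	c->retry_budget--;
	return true;
}

static void *demo_alloc(size_t size, enum demo_retry_policy policy,
			struct demo_ctx *c)
{
	for (;;) {
		void *p = demo_try_once(size, c);

		if (p)
			return p;
		if (policy == DEMO_NORETRY)
			return NULL;
		if (policy == DEMO_RETRY_MAYFAIL && !demo_can_progress(c))
			return NULL;
		/* DEMO_NOFAIL, or progress still possible: try again. */
	}
}

/* kvmalloc-style caller: try hard for the preferred allocation, then fall
 * back to an alternative (plain malloc() here, standing in for vmalloc). */
static void *demo_kvalloc(size_t size, struct demo_ctx *c, bool *used_fallback)
{
	void *p = demo_alloc(size, DEMO_RETRY_MAYFAIL, c);

	*used_fallback = (p == NULL);
	return p ? p : malloc(size);
}
```

In this model the middle policy succeeds whenever the fake allocator recovers
within the retry budget, and otherwise returns NULL so that a caller such as
demo_kvalloc() can take its fallback path instead of looping.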

-- 
Michal Hocko
SUSE Labs