linux-ext4 - Re: [PATCH 3/6] EXT4: Remove ENOMEM/congestion

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-id: <163174534006.3992.15394603624652359629@noble.neil.brown.name>
Date:   Thu, 16 Sep 2021 08:35:40 +1000
From:   "NeilBrown" <neilb@...e.de>
To:     "Michal Hocko" <mhocko@...e.com>
Cc:     "Mel Gorman" <mgorman@...e.de>,
        "Andrew Morton" <akpm@...ux-foundation.org>,
        "Theodore Ts'o" <tytso@....edu>,
        "Andreas Dilger" <adilger.kernel@...ger.ca>,
        "Darrick J. Wong" <djwong@...nel.org>, "Jan Kara" <jack@...e.cz>,
        "Matthew Wilcox" <willy@...radead.org>, linux-xfs@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-nfs@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/6] EXT4: Remove ENOMEM/congestion_wait() loops.

On Wed, 15 Sep 2021, Michal Hocko wrote:
> On Wed 15-09-21 07:48:11, Neil Brown wrote:
> > 
> > Why does __GFP_NOFAIL access the reserves? Why not require that the
> > relevant "Try harder" flag (__GFP_ATOMIC or __GFP_MEMALLOC) be included
> > with __GFP_NOFAIL if that is justified?
> 
> Does 5020e285856c ("mm, oom: give __GFP_NOFAIL allocations access to
> memory reserves") help?

Yes, that helps.  A bit.

I'm not fond of the clause "the allocation request might have come with some
locks held".  What if it doesn't?  Does it still have to pay the price.

Should we not require that the caller indicate if any locks are held?
That way callers which don't hold locks can use __GFP_NOFAIL without
worrying about imposing on other code.

Or is it so rare that __GFP_NOFAIL would be used without holding a lock
that it doesn't matter?

The other commit of interest is

Commit: 6c18ba7a1899 ("mm: help __GFP_NOFAIL allocations which do not trigger OOM killer")

I don't find the reasoning convincing.  It is a bit like "Robbing Peter
to pay Paul".  It takes from the reserves to allow a __GFP_NOFAIL to
proceed, with out any reason to think this particular allocation has any
more 'right' to the reserves than anything else.

While I don't like the reasoning in either of these, they do make it
clear (to me) that the use of reserves is entirely an internal policy
decision.  They should *not* be seen as part of the API and callers
should not have to be concerned about it when deciding whether to use
__GFP_NOFAIL or not.

The use of these reserves is, at most, a hypothetical problem.  If it
ever looks like becoming a real practical problem, it needs to be fixed
internally to the page allocator.  Maybe an extra water-mark which isn't
quite as permissive as ALLOC_HIGH...

I'm inclined to drop all references to reserves from the documentation
for __GFP_NOFAIL.  I think there are enough users already that adding a
couple more isn't going to make problems substantially more likely.  And
more will be added anyway that the mm/ team won't have the opportunity
or bandwidth to review.

Meanwhile I'll see if I can understand the intricacies of alloc_page so
that I can contibute to making it more predictable.

Question: In those cases where an open-coded loop is appropriate, such
as when you want to handle signals or can drop locks, how bad would it
be to have a tight loop without any sleep?
should_reclaim_retry() will sleep 100ms (sometimes...).  Is that enough?
__GFP_NOFAIL doesn't add any sleep when looping.

Thanks,
NeilBrown