Message-ID: <5641185F.9020104@suse.cz>
Date: Mon, 9 Nov 2015 23:04:15 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: mhocko@...nel.org, linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH 1/3] tree wide: get rid of __GFP_REPEAT for order-0
allocations part I
On 5.11.2015 17:15, mhocko@...nel.org wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> __GFP_REPEAT has rather weak semantics, but since it was introduced
> around 2.6.12 it has been ignored for low-order allocations. Yet the
> kernel tree is full of its usage for apparently order-0 allocations.
> This is really confusing, because __GFP_REPEAT is explicitly documented
> to allow allocation failures, which is a weaker semantic than what
> order-0 currently has (basically nofail).
>
> Let's simply rip out __GFP_REPEAT from those places. This would allow
> us to identify the places which really need the allocator to retry
> harder and to formulate a more specific semantic for what the flag is
> actually supposed to do.
So at first I thought "yeah, that's obvious", but after some more thinking I'm
not so sure anymore.
I think we should formulate the semantics first and only then make any changes.
Also, let's look at the flag description (which comes from pre-git):
* __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
* _might_ fail. This depends upon the particular VM implementation.
So we say it's an implementation detail, and IIRC the same is said about which
orders are considered costly and which are not, and about the associated rules.
So can we blame callers that happen to use __GFP_REPEAT essentially as a no-op
in the current implementation? And is it a problem that they do?
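For concreteness, a typical call site of this kind looks roughly like the
following (a hypothetical example I made up, not one of the sites the patch
actually touches):

#include <linux/gfp.h>

/*
 * Hypothetical order-0 call site (not from the actual patch): since
 * order 0 is well below PAGE_ALLOC_COSTLY_ORDER, the allocator already
 * retries until it succeeds for GFP_KERNEL, so __GFP_REPEAT is a no-op
 * here in the current implementation.
 */
static unsigned long example_alloc_pte_page(void)
{
	return __get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
}

/* After patch 1, the same call site would simply read: */
static unsigned long example_alloc_pte_page_after(void)
{
	return __get_free_page(GFP_KERNEL | __GFP_ZERO);
}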
So I think we should answer the following questions:
* What is the semantic of __GFP_REPEAT?
- My suggestion would be something like "I would really like this allocation
to succeed. I do have a fallback, but it's so suboptimal that I'd rather wait
for this allocation to finish." We could then change some heuristics to take
that into account - e.g. direct compaction could ignore the deferred state and
the pageblock skip bits, to make sure it's as thorough as possible. Right now
that sort of happens, but not quite: given enough reclaim/compact attempts,
the compaction attempts might break out of the deferred state, but
pages_reclaimed might reach 1 << order before compaction "undefers", and then
we break out of the retry loop (see the sketch after these questions). Is any
such heuristic change possible for reclaim as well?
As part of this question we should also keep in mind (and perhaps rethink)
__GFP_NORETRY, as that's supposed to be the opposite of __GFP_REPEAT.
* Can it ever happen that __GFP_REPEAT could make a difference for order-0?
- Certainly not wrt compaction; how about reclaim?
- If it couldn't possibly affect order-0, then yeah, proceed with Patch 1.
* Is PAGE_ALLOC_COSTLY_ORDER considered an implementation detail?
- I would think so, and if yes, then we probably shouldn't remove
__GFP_REPEAT for order-1+ allocations that happen to be non-costly in the
current implementation?
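To make the current behaviour I'm referring to above more concrete, here is a
simplified sketch of the retry decision in the allocation slowpath. It is
paraphrased from memory rather than copied from mm/page_alloc.c, so take the
details with a grain of salt; it just shows why __GFP_REPEAT only matters
above PAGE_ALLOC_COSTLY_ORDER, and how pages_reclaimed reaching 1 << order can
end the retry loop before compaction "undefers":

/*
 * Simplified sketch of the retry decision in the allocation slowpath,
 * paraphrased (not verbatim) from mm/page_alloc.c around v4.3.
 */
static bool sketch_should_retry(gfp_t gfp_mask, unsigned int order,
				unsigned long did_some_progress,
				unsigned long pages_reclaimed)
{
	if (gfp_mask & __GFP_NORETRY)	/* caller asked us not to loop */
		return false;

	if (gfp_mask & __GFP_NOFAIL)	/* caller must never see a failure */
		return true;

	/*
	 * Up to PAGE_ALLOC_COSTLY_ORDER we keep retrying as long as reclaim
	 * makes any progress, regardless of __GFP_REPEAT - this is the
	 * "basically nofail" behaviour for order-0 mentioned in the patch.
	 */
	if (order <= PAGE_ALLOC_COSTLY_ORDER)
		return did_some_progress != 0;

	/*
	 * For costly orders, __GFP_REPEAT keeps us looping only until
	 * roughly 1 << order pages have been reclaimed.  If compaction is
	 * still deferred at that point, we give up before it ever gets to
	 * do a thorough attempt.
	 */
	if ((gfp_mask & __GFP_REPEAT) && pages_reclaimed < (1UL << order))
		return did_some_progress != 0;

	return false;
}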