linux-kernel - Re: [RFC 2/2] mm, oom: do not enfore OOM killer for __GFP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <75ad0d8e-3dff-8893-eb2d-5f3817d91d83@suse.cz>
Date:   Thu, 24 Nov 2016 08:41:30 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-mm@...ck.org,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        David Rientjes <rientjes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL
 automatically

On 11/23/2016 01:35 PM, Michal Hocko wrote:
> On Wed 23-11-16 13:19:20, Vlastimil Babka wrote:
>> This makes some sense to me, but there might be unpleasant consequences,
>> e.g. due to allowing costly allocations without reserves.
>
> I am not sure I understand. Did you mean with reserves? Anyway, my code

Yeah, with reserves/without watermarks checks. Sorry.

> inspection shown that we are not really doing GFP_NOFAIL for costly
> orders. This might change in the future but even if we do that then this
> shouldn't add a risk of the reserves depletion, right?

Well it's true that it will be unlikely that high-order pages will exist 
at min watermark, but if they do, high-order page depletes more than 
order-0. Anyway we have the WARN_ON_ONCE on cosly nofail allocations, so 
at least this won't happen silently...

>> I guess only testing will show...
>>
>> Also some comments below.
> [...]
>>>  static inline struct page *
>>> +__alloc_pages_nowmark(gfp_t gfp_mask, unsigned int order,
>>> +						const struct alloc_context *ac)
>>> +{
>>> +	struct page *page;
>>> +
>>> +	page = get_page_from_freelist(gfp_mask, order,
>>> +			ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
>>> +	/*
>>> +	 * fallback to ignore cpuset restriction if our nodes
>>> +	 * are depleted
>>> +	 */
>>> +	if (!page)
>>> +		page = get_page_from_freelist(gfp_mask, order,
>>> +				ALLOC_NO_WATERMARKS, ac);
>>
>> Is this enough? Look at what __alloc_pages_slowpath() does since
>> e46e7b77c909 ("mm, page_alloc: recalculate the preferred zoneref if the
>> context can ignore memory policies").
>
> this is a one time attempt to do the nowmark allocation. If we need to
> do the recalculation then this should happen in the next round. Or am I
> missing your question?

The next round no-watermarks allocation attempt in 
__alloc_pages_slowpath() uses different criteria than the new 
__alloc_pages_nowmark() callers. And it would be nicer to unify this as 
well, if possible.

>
>>
>> ...
>>
>>> -	}
>>>  	/* Exhausted what can be done so it's blamo time */
>>> -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
>>> +	if (out_of_memory(&oc)) {
>>
>> This removes the warning, but also the check for __GFP_NOFAIL itself. Was it
>> what you wanted?
>
> The point of the check was to keep looping for __GFP_NOFAIL requests
> even when the OOM killer is disabled (out_of_memory returns false). We
> are accomplishing that by
>>
>>>  		*did_some_progress = 1;
> 		^^^^ this

But oom disabled means that this line is not reached?

> it is true we will not have the warning but I am not really sure we care
> all that much. In any case it wouldn't be all that hard to check for oom
> killer disabled and warn on in the allocator slow path.
>
> thanks for having a look!
>