linux-kernel - Re: [RFC 1/3] mm, oom: refactor oom detection


Message-ID: <20151030145539.GF23627@dhcp22.suse.cz>
Date:	Fri, 30 Oct 2015 15:55:39 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	hillf.zj@...baba-inc.com, linux-mm@...ck.org,
	akpm@...ux-foundation.org, torvalds@...ux-foundation.org,
	mgorman@...e.de, hannes@...xchg.org, riel@...hat.com,
	rientjes@...gle.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC 1/3] mm, oom: refactor oom detection

On Fri 30-10-15 22:32:27, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > +		target -= (stall_backoff * target + MAX_STALL_BACKOFF - 1) / MAX_STALL_BACKOFF;
> target -= DIV_ROUND_UP(stall_backoff * target, MAX_STALL_BACKOFF);

Ohh, we have a macro for that. Good to know. Thanks. It sure looks much
easier to follow.
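
As a quick sanity check of the suggestion above, here is a minimal userspace sketch showing that the open-coded rounding and the DIV_ROUND_UP() form compute the same reduction. The macro body mirrors the kernel helper; the value of MAX_STALL_BACKOFF and the function names are assumptions made purely for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Mirrors the kernel's DIV_ROUND_UP() helper: integer division rounding up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed value, for illustration only; the patch defines its own constant. */
#define MAX_STALL_BACKOFF 16

/* Open-coded form, as in the original hunk. */
static unsigned long reduce_open_coded(unsigned long target,
				       unsigned long stall_backoff)
{
	assert(stall_backoff <= MAX_STALL_BACKOFF);
	return target - (stall_backoff * target + MAX_STALL_BACKOFF - 1) /
			MAX_STALL_BACKOFF;
}

/* Same computation written with the macro. */
static unsigned long reduce_with_macro(unsigned long target,
				       unsigned long stall_backoff)
{
	assert(stall_backoff <= MAX_STALL_BACKOFF);
	return target - DIV_ROUND_UP(stall_backoff * target, MAX_STALL_BACKOFF);
}

/* Returns how many (target, stall_backoff) pairs make the two forms differ. */
static unsigned long count_mismatches(unsigned long max_target)
{
	unsigned long mismatches = 0;

	for (unsigned long t = 0; t <= max_target; t++)
		for (unsigned long b = 0; b <= MAX_STALL_BACKOFF; b++)
			if (reduce_open_coded(t, b) != reduce_with_macro(t, b))
				mismatches++;
	return mismatches;
}
```

Rounding the subtracted amount up rather than down means the target reaches exactly zero once stall_backoff hits MAX_STALL_BACKOFF, instead of leaving a remainder behind.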
 
> Michal Hocko wrote:
> > This alone wouldn't be sufficient, though, because the writeback might
> > get stuck and reclaimable pages might be pinned for a really long time
> > or even depend on the current allocation context.
> 
> Is this a dependency which I worried at
> http://lkml.kernel.org/r/201510262044.BAI43236.FOMSFFOtOVLJQH@I-love.SAKURA.ne.jp ?

Yes, I had restricted allocation contexts in mind here.

> >                                                   Therefore there is a
> > feedback mechanism implemented which reduces the reclaim target after
> > each reclaim round without any progress.
> 
> If yes, this feedback mechanism will help avoiding infinite wait loop.
> 
> >                                          This means that we should
> > eventually converge to only NR_FREE_PAGES as the target and fail on the
> > wmark check and proceed to OOM.
> 
> What if all in-flight allocation requests are !__GFP_NOFAIL && !__GFP_FS ?

Then we will loop like crazy hoping that _something_ will reclaim memory
for us. Same as we do now.
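
To make the convergence argument quoted above concrete, here is a rough userspace model of the feedback mechanism. It is only a sketch: struct zone_model, its fields, the function names, the plain "target > watermark" comparison, and the value of MAX_STALL_BACKOFF are all hypothetical simplifications, and the real watermark check in the allocator is more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed value, for illustration only. */
#define MAX_STALL_BACKOFF 16

/* Hypothetical stand-in for the per-zone counters used by the check. */
struct zone_model {
	unsigned long reclaimable;	/* pages reclaim might still free */
	unsigned long free;		/* models NR_FREE_PAGES */
	unsigned long wmark;		/* simplified watermark */
};

/* Reclaimable pages are discounted more after each round without progress. */
static unsigned long retry_target(const struct zone_model *z,
				  unsigned long stall_backoff)
{
	unsigned long target = z->reclaimable;

	assert(stall_backoff <= MAX_STALL_BACKOFF);
	target -= DIV_ROUND_UP(stall_backoff * target, MAX_STALL_BACKOFF);
	return target + z->free;
}

/* Simplified watermark check: keep retrying while the target exceeds it. */
static bool should_retry(const struct zone_model *z, unsigned long stall_backoff)
{
	return retry_target(z, stall_backoff) > z->wmark;
}

/*
 * Returns the first stall_backoff value at which the check fails, or -1 if
 * the free pages alone already satisfy the watermark.
 */
static int first_failing_backoff(const struct zone_model *z)
{
	for (unsigned long b = 0; b <= MAX_STALL_BACKOFF; b++)
		if (!should_retry(z, b))
			return (int)b;
	return -1;
}
```

In this model, with stall_backoff at its maximum the target is just the free page count, so the retry decision no longer depends on pages that reclaim has repeatedly failed to free.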

> (In other words, either "no __GFP_FS allocations are in-flight" or "all
> __GFP_FS allocations are in-flight but are either waiting for completion
> of operations which depend on !__GFP_FS allocations with a lock held or
> waiting for that lock to be released".)
> 
> Don't we need to call out_of_memory() even though !__GFP_FS allocations?

I do not think this is in scope of this patch series. I am trying to
normalize the OOM detection and GFP_FS is a separate beast and we do not
have enough counters to decide whether the OOM killer would be
premature or not (e.g. we do not know how much memory is unreclaimable
just because of NOFS context). I am convinced that GFP_FS simply has to
fail the allocation as I've suggested quite some time ago and plan to
revisit it soon(ish). I consider the two orthogonal.

Thanks!
-- 
Michal Hocko
SUSE Labs