linux-kernel - Re: [PATCH 0/3 -v3] GFP

Message-ID: <20170103084211.GB30111@dhcp22.suse.cz>
Date:   Tue, 3 Jan 2017 09:42:12 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org, rientjes@...gle.com,
        mgorman@...e.de, hillf.zj@...baba-inc.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3 -v3] GFP_NOFAIL cleanups

On Tue 03-01-17 10:36:31, Tetsuo Handa wrote:
[...]
> I'm OK with "[PATCH 1/3] mm: consolidate GFP_NOFAIL checks in the allocator
> slowpath" given that we describe that we make __GFP_NOFAIL stronger than
> __GFP_NORETRY with this patch in the changelog.

Again. __GFP_NORETRY | __GFP_NOFAIL is nonsense! I do not really see any
reason to describe all the nonsense combinations of gfp flags.

> But I don't think "[PATCH 2/3] mm, oom: do not enfore OOM killer for __GFP_NOFAIL
> automatically" is correct. Firstly, we need to confirm
> 
>   "The pre-mature OOM killer is a real issue as reported by Nils Holland"
> 
> in the changelog is still true because we haven't tested with "[PATCH] mm, memcg:
> fix the active list aging for lowmem requests when memcg is enabled" applied and
> without "[PATCH 2/3] mm, oom: do not enfore OOM killer for __GFP_NOFAIL
> automatically" and "[PATCH 3/3] mm: help __GFP_NOFAIL allocations which do not
> trigger OOM killer" applied.

Yes, I have already dropped the reference to this report in my local
patch because in this particular case the issue was indeed somewhere
else!

> Secondly, as you are using __GFP_NORETRY in "[PATCH] mm: introduce kv[mz]alloc
> helpers" as a means of ensuring the OOM killer is not invoked
> 
> 	/*
> 	 * Make sure that larger requests are not too disruptive - no OOM
> 	 * killer and no allocation failure warnings as we have a fallback
> 	 */
> 	if (size > PAGE_SIZE)
> 		kmalloc_flags |= __GFP_NORETRY | __GFP_NOWARN;
> 
> , we can use __GFP_NORETRY as a means of ensuring the OOM killer is not invoked
> rather than applying "[PATCH 2/3] mm, oom: do not enfore OOM killer for
> __GFP_NOFAIL automatically".
> 
> Additionally, although currently there seems to be no
> kv[mz]alloc(GFP_KERNEL | __GFP_NOFAIL) users, kvmalloc_node() in
> "[PATCH] mm: introduce kv[mz]alloc helpers" will be confused when a
> kv[mz]alloc(GFP_KERNEL | __GFP_NOFAIL) user comes in in the future because
> "[PATCH 1/3] mm: consolidate GFP_NOFAIL checks in the allocator slowpath" makes
> __GFP_NOFAIL stronger than __GFP_NORETRY.

Using NOFAIL in kv[mz]alloc simply makes no sense at all. The vmalloc
fallback would be simply unreachable!
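
To illustrate why the fallback becomes unreachable, here is a toy userspace
model of the control flow quoted above. It is only a sketch: the TOY_* flag
values, toy_kmalloc_succeeds() and toy_kvmalloc() are hypothetical names made
up for this example and are not kernel code. The one assumption it encodes is
the behavior discussed for patch 1/3, namely that a NOFAIL request never
returns failure even when NORETRY is also set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this toy model; NOT the kernel's real values. */
#define TOY_NORETRY   0x1u
#define TOY_NOWARN    0x2u
#define TOY_NOFAIL    0x4u
#define TOY_PAGE_SIZE 4096u

enum toy_source { TOY_FROM_KMALLOC, TOY_FROM_VMALLOC };

/*
 * Toy stand-in for the physically contiguous allocator.
 * With plenty of memory every request succeeds. Under memory pressure only
 * a NOFAIL request "succeeds" (it would keep looping until it does);
 * NOFAIL takes precedence over NORETRY here, which is an assumption of
 * this model.
 */
bool toy_kmalloc_succeeds(unsigned int flags, bool memory_tight)
{
	if (!memory_tight)
		return true;
	return (flags & TOY_NOFAIL) != 0;
}

/* Mirrors the shape of the quoted kvmalloc snippet: try kmalloc, then vmalloc. */
enum toy_source toy_kvmalloc(size_t size, unsigned int flags, bool memory_tight)
{
	unsigned int kmalloc_flags = flags;

	assert(size > 0);

	/* Larger requests should give up early because a fallback exists. */
	if (size > TOY_PAGE_SIZE)
		kmalloc_flags |= TOY_NORETRY | TOY_NOWARN;

	if (toy_kmalloc_succeeds(kmalloc_flags, memory_tight))
		return TOY_FROM_KMALLOC;

	/* Only reached when the first attempt is allowed to fail. */
	return TOY_FROM_VMALLOC;
}
```

In this model, a large request under memory pressure reaches the vmalloc
branch only when TOY_NOFAIL is absent. Once it is set, the first attempt can
never report failure, so the second branch is dead code for that caller.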

> My concern with "[PATCH 3/3] mm: help __GFP_NOFAIL allocations which
> do not trigger OOM killer" is
> 
>   "AFAIU, this is an allocation path which doesn't block a forward progress
>    on a regular IO. It is merely a check whether there is a new medium in
>    the CDROM (aka regular polling of the device). I really fail to see any
>    reason why this one should get any access to memory reserves at all."
> 
> in http://lkml.kernel.org/r/20161218163727.GC8440@dhcp22.suse.cz .
> Indeed, that trace is a __GFP_DIRECT_RECLAIM allocation and it might not be
> blocking other workqueue items which regular I/O depends on, but I think there are
> !__GFP_DIRECT_RECLAIM memory allocation requests for issuing SCSI commands
> which could potentially start failing due to helping GFP_NOFS | __GFP_NOFAIL
> allocations with memory reserves. If a SCSI disk I/O request fails due to
> GFP_ATOMIC memory allocation failures because we allow a FS I/O request to
> use memory reserves, it adds a new problem.

Do you have any example of such a request? Anything that requires
forward progress during IO should be using mempools; otherwise it
is broken pretty much by design already. Also, IO depending on NOFS
allocations sounds pretty much broken already. So I suspect the above
reasoning is just bogus.
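
For readers unfamiliar with the mempool idea, the following is a minimal
userspace sketch of the concept, not the kernel's mempool implementation.
All toy_* names, the fixed reserve size and the allocator_failing knob are
hypothetical and exist only for this illustration. The point it shows is
that a reserve filled up front lets a bounded number of requests succeed
even while the regular allocator is failing, and that freed elements refill
the reserve first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_POOL_MIN 2	/* hypothetical guaranteed number of elements */

struct toy_mempool {
	void *reserve[TOY_POOL_MIN];	/* elements set aside at init time */
	int curr_nr;			/* elements currently in the reserve */
	size_t elem_size;
	bool allocator_failing;		/* knob to simulate memory pressure */
};

/* Fill the reserve up front, while memory is still available. */
bool toy_mempool_init(struct toy_mempool *p, size_t elem_size)
{
	assert(p != NULL);
	p->curr_nr = 0;
	p->elem_size = elem_size;
	p->allocator_failing = false;

	while (p->curr_nr < TOY_POOL_MIN) {
		void *e = malloc(elem_size);

		if (!e) {
			while (p->curr_nr > 0)
				free(p->reserve[--p->curr_nr]);
			return false;
		}
		p->reserve[p->curr_nr++] = e;
	}
	return true;
}

/* Try the regular allocator first, then dip into the reserve. */
void *toy_mempool_alloc(struct toy_mempool *p)
{
	void *e = p->allocator_failing ? NULL : malloc(p->elem_size);

	if (e)
		return e;
	if (p->curr_nr > 0)
		return p->reserve[--p->curr_nr];
	/* This toy simply fails here; a real pool could instead wait for a free. */
	return NULL;
}

/* Refill the reserve before handing memory back to the allocator. */
void toy_mempool_free(struct toy_mempool *p, void *e)
{
	if (p->curr_nr < TOY_POOL_MIN)
		p->reserve[p->curr_nr++] = e;
	else
		free(e);
}

void toy_mempool_destroy(struct toy_mempool *p)
{
	while (p->curr_nr > 0)
		free(p->reserve[--p->curr_nr]);
}
```

A caller that completes its IO and frees the element makes room for the next
request, which is what gives the forward progress guarantee without relying
on the page allocator at that moment.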

That being said, to summarize your arguments again: 1) you do not like
that a combination of __GFP_NORETRY | __GFP_NOFAIL is not documented
to never fail, 2) based on that you argue that kv[mz]alloc with
__GFP_NOFAIL will never reach vmalloc and 3) that there might be some IO
paths depending on NOFS|NOFAIL allocations which would have a harder time
making forward progress.

I would call 1 and 2 just bogus and 3 highly dubious at best. Do not
get me wrong, but this is not what I call useful review feedback, let
alone a reason to block these patches. If there are any reasons not to
merge them, these are not those.

-- 
Michal Hocko
SUSE Labs