Message-ID: <YzFplwSxwwsLpzzX@dhcp22.suse.cz>
Date:   Mon, 26 Sep 2022 10:57:59 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Florian Westphal <fw@...len.de>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org, vbabka@...e.cz,
        akpm@...ux-foundation.org, urezki@...il.com,
        netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
        Martin Zaharinov <micron10@...il.com>
Subject: Re: [PATCH mm] mm: fix BUG with kvzalloc+GFP_ATOMIC

On Mon 26-09-22 09:56:39, Florian Westphal wrote:
> Michal Hocko <mhocko@...e.com> wrote:
> > > kvzalloc(GFP_ATOMIC) was perfectly fine, is this illegal again?
> > 
> > kvmalloc has never really supported GFP_ATOMIC semantic.
> 
> It did, you added it:
> ce91f6ee5b3b ("mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags")

Yes, I am very well aware of this commit and I have to say I wasn't
really super happy about it TBH. Linus has argued this will result in
saner code, and in some cases this was true.

Later on we really had to add support for some extensions beyond
GFP_KERNEL. Your change would break those GFP_NOFAIL and NOFS
use cases. GFP_NOWAIT and GFP_ATOMIC are explicitly documented as
unsupported. One thing we can do to continue in the spirit of
ce91f6ee5b3b is this instead:

diff --git a/mm/util.c b/mm/util.c
index 0837570c9225..a27b3fce1f0e 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -618,6 +618,10 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
 	 */
 	if (ret || size <= PAGE_SIZE)
 		return ret;
+
+	/* non-sleeping allocations are not supported by vmalloc */
+	if (!gfpflags_allow_blocking(flags))
+		return NULL;
 
 	/* Don't even allow crazy sizes */
 	if (unlikely(size > INT_MAX)) {
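
To make the effect of the added check concrete, here is a minimal
userspace sketch of the decision order in the hunk above. It is not the
kernel code: every model_* / MODEL_* name and the flag values are
hypothetical stand-ins, and the only kernel behaviour it assumes is that
gfpflags_allow_blocking() tests the direct-reclaim bit.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define MODEL_DIRECT_RECLAIM	0x1u	/* caller may sleep */
#define MODEL_KSWAPD_RECLAIM	0x2u	/* may wake background reclaim */
#define MODEL_GFP_KERNEL	(MODEL_DIRECT_RECLAIM | MODEL_KSWAPD_RECLAIM)
#define MODEL_GFP_ATOMIC	(MODEL_KSWAPD_RECLAIM)
#define MODEL_PAGE_SIZE		4096u

enum model_result {
	MODEL_KMALLOC_OK,	/* kmalloc attempt succeeded */
	MODEL_RETURN_NULL,	/* give up without trying vmalloc */
	MODEL_TRY_VMALLOC,	/* fall back to vmalloc */
};

/* Models gfpflags_allow_blocking(): true only if direct reclaim is allowed. */
static bool model_allow_blocking(unsigned int flags)
{
	return (flags & MODEL_DIRECT_RECLAIM) != 0;
}

/* kmalloc_ok models whether the initial kmalloc attempt succeeded. */
static enum model_result model_kvmalloc_path(size_t size, unsigned int flags,
					     bool kmalloc_ok)
{
	if (kmalloc_ok)
		return MODEL_KMALLOC_OK;

	/* vmalloc cannot do better than kmalloc for sub-page requests */
	if (size <= MODEL_PAGE_SIZE)
		return MODEL_RETURN_NULL;

	/* the proposed check: non-sleeping callers never reach vmalloc */
	if (!model_allow_blocking(flags))
		return MODEL_RETURN_NULL;

	/* don't even allow crazy sizes */
	if (size > (size_t)INT_MAX)
		return MODEL_RETURN_NULL;

	return MODEL_TRY_VMALLOC;
}
```

In this model a failed 8192-byte request with MODEL_GFP_ATOMIC returns
NULL, while the same request with MODEL_GFP_KERNEL proceeds to the
vmalloc fallback.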

A better option to me seems to be reworking rhashtable_insert_rehash
so that it does not rely on an atomic allocation. I am not familiar
with that code but it seems to me that the only reason this allocation
mode is used is the rcu locking around rhashtable_try_insert. Is there
any reason we cannot drop the rcu lock, allocate with the full
GFP_KERNEL allocation power and retry with the preallocated object?
rhashtable_insert_slow is already doing that to implement its
never-fail semantics.
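
The drop-lock, allocate, retry pattern can be sketched roughly as
follows. This is a single-threaded userspace model, not rhashtable
code: all model_* names are hypothetical, the in_atomic flag merely
stands in for the rcu read-side section, and calloc() stands in for a
sleeping GFP_KERNEL allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_table {
	bool in_atomic;		/* models "rcu read lock held" */
	size_t capacity;
	size_t used;
	int *slots;
};

/* Models a sleeping allocation; only legal outside the atomic section. */
static int *model_alloc_blocking(size_t n)
{
	return calloc(n, sizeof(int));
}

/*
 * Runs inside the atomic section and never allocates. If the table is
 * full it consumes *spare when one was preallocated, otherwise it
 * returns -EAGAIN so the caller can allocate outside the section.
 */
static int model_try_insert(struct model_table *t, int value,
			    int **spare, size_t spare_cap)
{
	assert(t->in_atomic);

	if (t->used == t->capacity) {
		if (!*spare)
			return -EAGAIN;
		for (size_t i = 0; i < t->used; i++)
			(*spare)[i] = t->slots[i];
		free(t->slots);
		t->slots = *spare;
		t->capacity = spare_cap;
		*spare = NULL;	/* ownership moved into the table */
	}
	t->slots[t->used++] = value;
	return 0;
}

static int model_insert(struct model_table *t, int value)
{
	int *spare = NULL;
	size_t spare_cap = 0;
	int err;

	for (;;) {
		t->in_atomic = true;	/* enter atomic section */
		err = model_try_insert(t, value, &spare, spare_cap);
		t->in_atomic = false;	/* leave atomic section */
		if (err != -EAGAIN)
			break;

		/* Outside the section: a blocking allocation is fine here. */
		spare_cap = t->capacity ? t->capacity * 2 : 4;
		spare = model_alloc_blocking(spare_cap);
		if (!spare)
			return -ENOMEM;
	}
	free(spare);	/* non-NULL only if the retry did not need it */
	return err;
}
```

A real implementation would also have to cope with the table changing
between dropping and retaking the lock, which this model does not cover.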
-- 
Michal Hocko
SUSE Labs