Message-ID: <YzFxHlYoncuDl2fM@dhcp22.suse.cz>
Date: Mon, 26 Sep 2022 11:30:06 +0200
From: Michal Hocko <mhocko@...e.com>
To: Florian Westphal <fw@...len.de>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, vbabka@...e.cz,
akpm@...ux-foundation.org, urezki@...il.com,
netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
Martin Zaharinov <micron10@...il.com>
Subject: Re: [PATCH mm] mm: fix BUG with kvzalloc+GFP_ATOMIC

On Mon 26-09-22 10:58:00, Michal Hocko wrote:
[...]
> A better option to me seems to be reworking rhashtable_insert_rehash
> so that it does not rely on an atomic allocation. I am not familiar
> with that code, but it seems to me that the only reason this allocation
> mode is used is the rcu locking around rhashtable_try_insert. Is there
> any reason we cannot drop the rcu lock, allocate with the full
> GFP_KERNEL allocation power and retry with the pre-allocated object?
> rhashtable_insert_slow is already doing that to implement its
> never-fail semantics.

So here is a very blunt and likely not 100% correct take on this side of
things, but it should at least give an idea of the direction.
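
The gist, as a minimal sketch (hypothetical helper name, not the actual
patch below): drop the RCU read lock only around the sleeping allocation,
retake it, and let the caller re-validate before attaching the
preallocated table.

/*
 * Minimal sketch of the pattern, with a hypothetical helper name.
 * The caller enters and leaves with rcu_read_lock() held; the sleeping
 * GFP_KERNEL allocation happens outside the read-side critical section,
 * and the caller has to re-check any state derived before the unlock
 * (e.g. whether somebody else already attached a future_tbl) before
 * using the new table.
 */
static struct bucket_table *alloc_table_sleeping(struct rhashtable *ht,
						 unsigned int size)
{
	struct bucket_table *new_tbl;

	rcu_read_unlock();
	new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
	rcu_read_lock();

	return new_tbl;
}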
---
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index e12bbfb240b8..fc547c43b05d 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -437,31 +437,11 @@ static void rht_deferred_worker(struct work_struct *work)
}
static int rhashtable_insert_rehash(struct rhashtable *ht,
- struct bucket_table *tbl)
+ struct bucket_table *tbl,
+ struct bucket_table *new_tbl)
{
- struct bucket_table *old_tbl;
- struct bucket_table *new_tbl;
- unsigned int size;
int err;
- old_tbl = rht_dereference_rcu(ht->tbl, ht);
-
- size = tbl->size;
-
- err = -EBUSY;
-
- if (rht_grow_above_75(ht, tbl))
- size *= 2;
- /* Do not schedule more than one rehash */
- else if (old_tbl != tbl)
- goto fail;
-
- err = -ENOMEM;
-
- new_tbl = bucket_table_alloc(ht, size, GFP_ATOMIC | __GFP_NOWARN);
- if (new_tbl == NULL)
- goto fail;
-
err = rhashtable_rehash_attach(ht, tbl, new_tbl);
if (err) {
bucket_table_free(new_tbl);
@@ -471,17 +451,6 @@ static int rhashtable_insert_rehash(struct rhashtable *ht,
schedule_work(&ht->run_work);
return err;
-
-fail:
- /* Do not fail the insert if someone else did a rehash. */
- if (likely(rcu_access_pointer(tbl->future_tbl)))
- return 0;
-
- /* Schedule async rehash to retry allocation in process context. */
- if (err == -ENOMEM)
- schedule_work(&ht->run_work);
-
- return err;
}
static void *rhashtable_lookup_one(struct rhashtable *ht,
@@ -619,9 +588,33 @@ static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
}
} while (!IS_ERR_OR_NULL(new_tbl));
- if (PTR_ERR(data) == -EAGAIN)
- data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?:
+ if (PTR_ERR(data) == -EAGAIN) {
+ struct bucket_table *old_tbl;
+ unsigned int size;
+
+ old_tbl = rht_dereference_rcu(ht->tbl, ht);
+ size = tbl->size;
+
+ data = ERR_PTR(-EBUSY);
+
+ if (rht_grow_above_75(ht, tbl))
+ size *= 2;
+ /* Do not schedule more than one rehash */
+ else if (old_tbl != tbl)
+ return data;
+
+ data = ERR_PTR(-ENOMEM);
+
+ rcu_read_unlock();
+ new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
+ rcu_read_lock();
+
+ if (!new_tbl)
+ return data;
+
+ data = ERR_PTR(rhashtable_insert_rehash(ht, tbl, new_tbl) ?:
-EAGAIN);
+ }
return data;
}
--
Michal Hocko
SUSE Labs