Message-Id: <20190123211758.104275-1-jelsasser@appneta.com>
Date: Wed, 23 Jan 2019 13:17:58 -0800
From: Josh Elsasser <jelsasser@...neta.com>
To: "David S . Miller" <davem@...emloft.net>
Cc: josh@...asser.ca, Josh Elsasser <jelsasser@...neta.com>,
Thomas Graf <tgraf@...g.ch>,
Herbert Xu <herbert@...dor.apana.org.au>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH net] rhashtable: avoid reschedule loop after rapid growth and shrink

When running workloads with large bursts of fragmented packets, we've seen
a few machines stuck returning -EEXIST from rhashtable_shrink() and endlessly
rescheduling their hash table's deferred work, pegging a CPU core.

The root cause is commit da20420f83ea ("rhashtable: Add nested tables"),
which stops ignoring the return code of rhashtable_shrink() and of the
reallocs used to grow the hashtable. This uncovers a bug in the shrink
logic, where the "needs to shrink" check runs against the last table but
the actual shrink operation runs on the first bucket_table in the
hashtable (see below):
+-------+     +--------------+          +---------------+
| ht    |     | "first" tbl  |          | "last" tbl    |
| - tbl ----> | - future_tbl ---------> | - future_tbl ----> NULL
+-------+     +--------------+          +---------------+
                    ^^^                        ^^^
      used by rhashtable_shrink()  used by rht_shrink_below_30()
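
For reference, the pre-fix code paths look roughly like this (paraphrased
from lib/rhashtable.c and trimmed with "..." for brevity, not a verbatim
quote):

/* rht_deferred_worker(): the "needs to shrink" check uses the last table */
tbl = rht_dereference(ht->tbl, ht);
tbl = rhashtable_last_table(ht, tbl);
...
else if (ht->p.automatic_shrinking && rht_shrink_below_30(ht, tbl))
	err = rhashtable_shrink(ht);

/* ...but the shrink itself re-reads ht->tbl, the first table */
static int rhashtable_shrink(struct rhashtable *ht)
{
	struct bucket_table *old_tbl = rht_dereference(ht->tbl, ht);
	...
	if (rht_dereference(old_tbl->future_tbl, ht))
		return -EEXIST;	/* non-NULL whenever a rehash chain exists */
	...
}
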
A rehash then stalls out when the last table needs to shrink and the first
table is still larger than the target size, but rhashtable_shrink() hits the
first table's non-NULL future_tbl and returns -EEXIST. This skips the item
rehashing and kicks off a reschedule loop, as no forward progress can be
made while the rhashtable still needs to shrink.
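
The tail of rht_deferred_worker() is what turns that error into a busy loop
(again paraphrased, not verbatim):

	if (!err)
		err = rhashtable_rehash_table(ht);	/* skipped on -EEXIST */

	mutex_unlock(&ht->mutex);

	if (err)
		schedule_work(&ht->run_work);	/* immediately re-queues the work */
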
Extend rhashtable_shrink() with a "tbl" param to avoid these endless
exit-and-reschedules after hitting the -EEXIST: the future_tbl check now
runs against the same last table used by the "needs to shrink" check, so
the shrink can actually proceed and make forward progress when the
hashtable needs to shrink.

Fixes: da20420f83ea ("rhashtable: Add nested tables")
Signed-off-by: Josh Elsasser <jelsasser@...neta.com>
---
 lib/rhashtable.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 852ffa5160f1..98e91f9544fa 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -377,9 +377,9 @@ static int rhashtable_rehash_alloc(struct rhashtable *ht,
  * It is valid to have concurrent insertions and deletions protected by per
  * bucket locks or concurrent RCU protected lookups and traversals.
  */
-static int rhashtable_shrink(struct rhashtable *ht)
+static int rhashtable_shrink(struct rhashtable *ht,
+			     struct bucket_table *old_tbl)
 {
-	struct bucket_table *old_tbl = rht_dereference(ht->tbl, ht);
 	unsigned int nelems = atomic_read(&ht->nelems);
 	unsigned int size = 0;
 
@@ -412,7 +412,7 @@ static void rht_deferred_worker(struct work_struct *work)
 	if (rht_grow_above_75(ht, tbl))
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size * 2);
 	else if (ht->p.automatic_shrinking && rht_shrink_below_30(ht, tbl))
-		err = rhashtable_shrink(ht);
+		err = rhashtable_shrink(ht, tbl);
 	else if (tbl->nest)
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size);
 
--
2.19.1