Message-ID: <20240216194329.840555-1-mcgrof@kernel.org>
Date: Fri, 16 Feb 2024 11:43:29 -0800
From: Luis Chamberlain <mcgrof@...nel.org>
To: akpm@...ux-foundation.org,
willy@...radead.org
Cc: linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
gost.dev@...sung.com,
p.raghav@...sung.com,
da.gomez@...sung.com,
mcgrof@...nel.org,
kernel test robot <oliver.sang@...el.com>
Subject: [PATCH] test_xarray: fix soft lockup for advanced-api tests
The new advanced API tests want to vet that the xarray API is doing what
it promises by manually iterating over a set of possible indexes on
their own, using a query operation which takes the RCU lock and then
releases it. So they deliberately do not use the helper loop options
which xarray provides. Any loop which iterates over 1 million entries
(which is possible with order 20, so emulating say a 4 GiB block size)
just to RCU lock and unlock will eventually end up triggering a soft
lockup on systems which don't preempt and have lock proving and RCU
proving enabled.
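For reference, the "1 million entries" and "4 GiB" figures above can be
checked with a little arithmetic. The sketch below is plain userspace C,
not kernel code; the helper names are hypothetical, and the 4 KiB base
page size passed in the example is an assumption, not something the
patch states.

```c
#include <stdint.h>

/* Number of index slots covered by a single entry of the given order: 2^order. */
static uint64_t entries_for_order(unsigned int order)
{
	return UINT64_C(1) << order;
}

/*
 * Bytes spanned by one entry of the given order. page_size is supplied by
 * the caller; 4096 is only an assumed value for this illustration.
 */
static uint64_t span_bytes(unsigned int order, uint64_t page_size)
{
	return entries_for_order(order) * page_size;
}
```

With these helpers, entries_for_order(20) is 1048576 and
span_bytes(20, 4096) is 4294967296, i.e. 4 GiB.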
xarray users already use XA_CHECK_SCHED for loops which may take a long
time. In our case we don't want to RCU unlock and lock, as the caller
does that already, but rather just force a schedule every XA_CHECK_SCHED
iterations, since the test is trying not to trust xarray but rather to
test that it is doing the right thing.
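The "yield every N calls" pattern described above can be sketched in
userspace as follows. This is only an analogue under stated assumptions:
it uses POSIX sched_yield() where the patch uses the kernel's schedule(),
CHECK_SCHED is a hypothetical stand-in for XA_CHECK_SCHED, and the table
lookup merely takes the place of the RCU-protected xarray lookup.

```c
#include <sched.h>

/* Hypothetical stand-in for XA_CHECK_SCHED; the value is chosen for this sketch. */
#define CHECK_SCHED 4096

static unsigned int lookups;	/* calls made so far, kept across calls */
static unsigned long yields;	/* how many times we gave up the CPU */

/* Perform one lookup, then yield the CPU on every CHECK_SCHED-th call. */
static int lookup_and_maybe_yield(const int *table, unsigned long index)
{
	int value = table[index];	/* stands in for the locked lookup */

	if (++lookups % CHECK_SCHED == 0) {
		sched_yield();
		yields++;
	}

	return value;
}

/* Run count lookups back to back and report how many yields happened. */
static unsigned long sweep(unsigned long count)
{
	static const int table[1] = { 42 };
	unsigned long before = yields;
	unsigned long i;

	for (i = 0; i < count; i++)
		(void)lookup_and_maybe_yield(table, 0);

	return yields - before;
}
```

Because the counter lives outside the function, the yield cadence is
preserved no matter how the caller structures its own loop, which is the
point of putting the check inside the lookup helper rather than in each
test loop.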
[0] https://lkml.kernel.org/r/202402071613.70f28243-lkp@intel.com
Reported-by: kernel test robot <oliver.sang@...el.com>
Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
---
lib/test_xarray.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index d4e55b4867dc..ac162025cc59 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -781,6 +781,7 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
{
XA_STATE(xas, xa, index);
void *p;
+ static unsigned int i = 0;
rcu_read_lock();
repeat:
@@ -790,6 +791,17 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
goto repeat;
rcu_read_unlock();
+ /*
+ * This is not part of the page cache, this selftest is pretty
+ * aggressive and does not want to trust the xarray API but rather
+ * test it, and for order 20 (4 GiB block size) we can loop over
+ * over a million entries which can cause a soft lockup. Page cache
+ * APIs won't be stupid, proper page cache APIs loop over the proper
+ * order so when using a larger order we skip shared entries.
+ */
+ if (++i % XA_CHECK_SCHED == 0)
+ schedule();
+
return p;
}
--
2.42.0