Message-ID: <d4bbae19-065c-47bd-9493-366aa98d4e6f@paulmck-laptop>
Date: Tue, 15 Apr 2025 20:55:34 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
rostedt@...dmis.org
Subject: Re: [PATCH v2 05/12] rcutorture: Add tests for SRCU up/down reader primitives

On Tue, Apr 15, 2025 at 09:14:36PM -0400, Joel Fernandes wrote:
>
>
> On 4/15/2025 5:15 PM, Paul E. McKenney wrote:
> > On Tue, Apr 15, 2025 at 10:59:36AM -0700, Paul E. McKenney wrote:
> >> On Tue, Apr 15, 2025 at 01:16:15PM -0400, Joel Fernandes wrote:
> >>>
> >>>
> >>> On 3/31/2025 5:03 PM, Paul E. McKenney wrote:
> >>>> This commit adds a new rcutorture.n_up_down kernel boot parameter
> >>>> that specifies the number of outstanding SRCU up/down readers, which
> >>>> begin in kthread context and end in an hrtimer handler. There is a new
> >>>> kthread ("rcu_torture_updown") that scans a per-reader array looking
> >>>> for elements whose readers have ended. This kthread sleeps between one
> >>>> and two milliseconds between consecutive scans.
> >>>>
> >>>> [ paulmck: Apply kernel test robot feedback. ]
> >>>> [ paulmck: Apply Z qiang feedback. ]
> >>>>
> >>>> Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> >>>
> >>> For completeness, posting our discussion for the archives, an issue exists in
> >>> this patch causing the following errors on an ARM64 machine with 288 CPUs:
> >>>
> >>> When running SRCU-P test, we intermittently see:
> >>>
> >>> [ 9500.806108] ??? Writer stall state RTWS_SYNC(21) g18446744073709551218 f0x0
> >>> ->state 0x2 cpu 4
> >>> [ 9515.833356] ??? Writer stall state RTWS_SYNC(21) g18446744073709551218 f0x0
> >>> ->state 0x2 cpu 4
> >>>
> >>> It bisected to just this patch.
> >>
> >> Looks like your getting rcutorture running on ARM was well timed!
>
> Yes! Glad I could help.
>
> > And could you please send along your dmesg and .config files?
>
> Sure, attached both for one of the failed runs.

Thank you!  That did answer at least one of my questions.  It also showed
the need for the diff below.  :-/

As in kvm.sh and friends might well be missing failures in your runs.

							Thanx, Paul

------------------------------------------------------------------------
diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh
index aad51e7c0183d..991fb11306eb6 100755
--- a/tools/testing/selftests/rcutorture/bin/console-badness.sh
+++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh
@@ -10,7 +10,7 @@
 #
 # Authors: Paul E. McKenney <paulmck@...nel.org>
 
-grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
+grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Call trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
 grep -v 'ODEBUG: ' |
 grep -v 'This means that this is a DEBUG kernel and it is' |
 grep -v 'Warning: unable to open an initial console' |
diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh
index b07c11cf6929d..21e6ba3615f6a 100755
--- a/tools/testing/selftests/rcutorture/bin/parse-console.sh
+++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh
@@ -148,7 +148,7 @@ then
 			summary="$summary KCSAN: $n_kcsan"
 		fi
 	fi
-	n_calltrace=`grep -c 'Call Trace:' $file`
+	n_calltrace=`grep -Ec 'Call Trace:|Call trace:' $file`
 	if test "$n_calltrace" -ne 0
 	then
 		summary="$summary Call Traces: $n_calltrace"
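
For the archives, here is a minimal sketch of the effect of the parse-console.sh change. It assumes a made-up three-line console excerpt, not taken from the attached dmesg. x86 kernels print "Call Trace:" while arm64 kernels print "Call trace:" with a lower-case "t", so the old pattern undercounts on arm64. The temporary file and the variable names n_old and n_new are hypothetical and exist only for this illustration.

```shell
# Hypothetical console excerpt: one x86-style header, one arm64-style
# header, and one unrelated line that must not be counted.
file=$(mktemp)
printf '%s\n' \
    '[    1.000000] Call Trace:' \
    '[    2.000000] Call trace:' \
    '[    3.000000] rcu-torture: no trace here' > "$file"

# Old pattern: counts only the capitalized spelling.
n_old=$(grep -c 'Call Trace:' "$file")

# New pattern from the diff: counts either spelling.
n_new=$(grep -Ec 'Call Trace:|Call trace:' "$file")

echo "old=$n_old new=$n_new"
rm -f "$file"
```

On this sample the old pattern counts one line and the new pattern counts two. A case-insensitive "grep -ci" would also match both spellings, but listing the two spellings explicitly, as the diff does, avoids matching unrelated text that differs only in case.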