[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210113000953.GN2743@paulmck-ThinkPad-P72>
Date: Tue, 12 Jan 2021 16:09:53 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: linux-kernel@...r.kernel.org, maz@...nel.org, peterz@...radead.org,
tglx@...utronix.de
Subject: Re: [PATCH 0/2] irq: detect slow IRQ handlers
On Tue, Jan 12, 2021 at 01:59:48PM +0000, Mark Rutland wrote:
> Hi,
>
> While fuzzing arm64 with Syzkaller (under QEMU+KVM) over a number of releases,
> I've occasionally seen some ridiculously long stalls (20+ seconds), where it
> appears that a CPU is stuck in a hard IRQ context. As this gets detected after
> the CPU returns to the interrupted context, it's difficult to identify where
> exactly the stall is coming from.
>
> These patches are intended to help tracking this down, with a WARN() if an IRQ
> handler takes longer than a given timout (1 second by default), logging the
> specific IRQ and handler function. While it's possible to achieve similar with
> tracing, it's harder to integrate that into an automated fuzzing setup.
>
> I've been running this for a short while, and haven't yet seen any of the
> stalls with this applied, but I've tested with smaller timeout periods in the 1
> millisecond range by overloading the host, so I'm confident that the check
> works.
>
> Thanks,
> Mark.
Nice!
Acked-by: Paul E. McKenney <paulmck@...nel.org>
I added the patch below to add a three-second delay to the scheduling
clock interrupt handler. This executed, but did not cause your warning
to be emitted, probably because rcutorture runs under qemu/KVM. So no
Tested-by, not yet, anyway.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e04e336..dac8c7a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2606,6 +2606,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
*/
void rcu_sched_clock_irq(int user)
{
+ static atomic_t invctr;
+
trace_rcu_utilization(TPS("Start scheduler-tick"));
lockdep_assert_irqs_disabled();
raw_cpu_inc(rcu_data.ticks_this_gp);
@@ -2623,6 +2625,14 @@ void rcu_sched_clock_irq(int user)
invoke_rcu_core();
lockdep_assert_irqs_disabled();
+ if (atomic_inc_return(&invctr) % 0x3ffff == 0) {
+ int i;
+
+ pr_alert("%s: 3-second delay.\n", __func__);
+ for (i = 0; i < 3000; i++)
+ udelay(1000);
+ }
+
trace_rcu_utilization(TPS("End scheduler-tick"));
}
Powered by blists - more mailing lists