Message-ID: <86bkqx6wrd.wl-maz@kernel.org>
Date: Fri, 30 Sep 2022 10:23:50 +0100
From: Marc Zyngier <maz@...nel.org>
To: Zhang Xincheng <zhangxincheng@...ontech.com>
Cc: tglx@...utronix.de, linux-kernel@...r.kernel.org,
oleksandr@...alenko.name, hdegoede@...hat.com,
bigeasy@...utronix.de, mark.rutland@....com, michael@...le.cc
Subject: Re: [PATCH] interrupt: discover and disable very frequent interrupts
On Fri, 30 Sep 2022 07:40:42 +0100,
Zhang Xincheng <zhangxincheng@...ontech.com> wrote:
>
> From: zhangxincheng <zhangxincheng@...ontech.com>
>
> In some cases, a peripheral's interrupt will be triggered frequently,
> which will keep the CPU processing the interrupt and eventually cause
> the RCU to report rcu_sched self-detected stall on the CPU.
>
> [ 838.131628] rcu: INFO: rcu_sched self-detected stall on CPU
> [ 838.137189] rcu: 0-....: (194839 ticks this GP) idle=f02/1/0x4000000000000004
> softirq=9993/9993 fqs=97428
> [ 838.146912] rcu: (t=195015 jiffies g=6773 q=0)
> [ 838.151516] Task dump for CPU 0:
> [ 838.154730] systemd-sleep R running task 0 3445 1 0x0000000a
>
> Signed-off-by: zhangxincheng <zhangxincheng@...ontech.com>
> Change-Id: I9c92146f2772eae383c16c8c10de028b91e07150
> Signed-off-by: zhangxincheng <zhangxincheng@...ontech.com>
Irrespective of the patch itself, I would really like to understand
why you consider that it is a better course of action to kill a device
(and potentially the whole machine) than to let the storm eventually
calm down. A frequent interrupt is not necessarily the sign of
something going wrong. It is the sign of a busy system. I prefer my
systems busy rather than dead.

Furthermore, I see no rationale here about the number of interrupts
that *you* consider as being "too many" over what period of time (it
seems to me that both parameters are firmly hardcoded).

Something like this should be limited to a debug feature. It would
also be a lot more useful if it was built as an interrupt *limiting*
feature, rather than killing the interrupt forever (which is IMHO a
ludicrous thing to do).
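
To make the distinction concrete, here is a minimal sketch of what
"limiting" could mean, written as plain user-space C rather than kernel
code. Every name in it (struct irq_limiter, irq_limiter_allow() and so
on) is hypothetical and not an existing kernel interface. The threshold
and the window length are parameters rather than constants, and a line
that exceeds its budget is only held off until the current window ends,
never disabled for good.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter; none of this is a kernel API. */
struct irq_limiter {
	uint64_t window_ns;      /* length of one accounting window */
	uint32_t max_per_window; /* tunable budget, not hardcoded */
	uint64_t window_start;   /* time at which the current window began */
	uint32_t count;          /* interrupts accepted in this window */
};

static void irq_limiter_init(struct irq_limiter *l, uint64_t window_ns,
			     uint32_t max_per_window)
{
	l->window_ns = window_ns;
	l->max_per_window = max_per_window;
	l->window_start = 0;
	l->count = 0;
}

/*
 * Returns true if the interrupt fits in the budget of the current
 * window. Returns false if the budget is exhausted, in which case the
 * caller would mask the line until irq_limiter_unmask_at().
 */
static bool irq_limiter_allow(struct irq_limiter *l, uint64_t now_ns)
{
	/* Start a fresh window once the previous one has elapsed. */
	if (now_ns - l->window_start >= l->window_ns) {
		l->window_start = now_ns;
		l->count = 0;
	}

	if (l->count >= l->max_per_window)
		return false;

	l->count++;
	return true;
}

/* Time at which a held-off line may be unmasked again. */
static uint64_t irq_limiter_unmask_at(const struct irq_limiter *l)
{
	return l->window_start + l->window_ns;
}
```

With a budget of 3 interrupts per 1000ns window, the fourth interrupt
inside the window is refused, and the line becomes eligible again as
soon as the window rolls over. The device stays usable throughout.
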
Thanks,
M.
--
Without deviation from the norm, progress is not possible.