lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=Xgr4--NJ3dAh2ggxbFUV9-QR6rW+YXyMHZYXPVSkmaAw@mail.gmail.com>
Date: Wed, 28 Feb 2024 14:44:58 -0800
From: Doug Anderson <dianders@...omium.org>
To: Bitao Hu <yaoma@...ux.alibaba.com>
Cc: tglx@...utronix.de, liusong@...ux.alibaba.com, akpm@...ux-foundation.org, 
	pmladek@...e.com, kernelfans@...il.com, deller@....de, npiggin@...il.com, 
	tsbogend@...ha.franken.de, James.Bottomley@...senpartnership.com, 
	jan.kiszka@...mens.com, linux-kernel@...r.kernel.org, 
	linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org, 
	linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCHv11 4/4] watchdog/softlockup: report the most frequent interrupts

Hi,

On Tue, Feb 27, 2024 at 11:22 PM Bitao Hu <yaoma@...ux.alibaba.com> wrote:
>
> When the watchdog determines that the current soft lockup is due
> to an interrupt storm based on CPU utilization, reporting the
> most frequent interrupts could be good enough for further
> troubleshooting.
>
> Below is an example of interrupt storm. The call tree does not
> provide useful information, but we can analyze which interrupt
> caused the soft lockup by comparing the counts of interrupts.
>
> [  638.870231] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [swapper/9:0]
> [  638.870825] CPU#9 Utilization every 4s during lockup:
> [  638.871194]  #1:   0% system,          0% softirq,   100% hardirq,     0% idle
> [  638.871652]  #2:   0% system,          0% softirq,   100% hardirq,     0% idle
> [  638.872107]  #3:   0% system,          0% softirq,   100% hardirq,     0% idle
> [  638.872563]  #4:   0% system,          0% softirq,   100% hardirq,     0% idle
> [  638.873018]  #5:   0% system,          0% softirq,   100% hardirq,     0% idle
> [  638.873494] CPU#9 Detect HardIRQ Time exceeds 50%. Most frequent HardIRQs:
> [  638.873994]  #1: 330945      irq#7
> [  638.874236]  #2: 31          irq#82
> [  638.874493]  #3: 10          irq#10
> [  638.874744]  #4: 2           irq#89
> [  638.874992]  #5: 1           irq#102
> ...
> [  638.875313] Call trace:
> [  638.875315]  __do_softirq+0xa8/0x364
>
> Signed-off-by: Bitao Hu <yaoma@...ux.alibaba.com>
> Reviewed-by: Liu Song <liusong@...ux.alibaba.com>
> ---
>  kernel/watchdog.c | 115 ++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 111 insertions(+), 4 deletions(-)

Reviewed-by: Douglas Anderson <dianders@...omium.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ