lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=Xr8-rzANK8oKyEZpk1FZ2dy5HqBXKpk3O29PhG1fRowg@mail.gmail.com>
Date: Mon, 1 Apr 2024 09:41:38 -0700
From: Doug Anderson <dianders@...omium.org>
To: Bitao Hu <yaoma@...ux.alibaba.com>, Thomas Gleixner <tglx@...utronix.de>, pmladek@...e.com
Cc: liusong@...ux.alibaba.com, akpm@...ux-foundation.org, kernelfans@...il.com, 
	deller@....de, npiggin@...il.com, tsbogend@...ha.franken.de, 
	James.Bottomley@...senpartnership.com, jan.kiszka@...mens.com, 
	linux-kernel@...r.kernel.org, linux-mips@...r.kernel.org, 
	linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCHv12 4/4] watchdog/softlockup: report the most frequent interrupts

Hi,

On Mon, Mar 25, 2024 at 2:48 AM Bitao Hu <yaoma@...ux.alibaba.com> wrote:
>
> Hi, Thomas
>
> On 2024/3/24 04:43, Thomas Gleixner wrote:
> > On Wed, Mar 06 2024 at 20:52, Bitao Hu wrote:
> >> +    if (__this_cpu_read(snapshot_taken)) {
> >> +            for_each_active_irq(i) {
> >> +                    count = kstat_get_irq_since_snapshot(i);
> >> +                    tabulate_irq_count(irq_counts_sorted, i, count, NUM_HARDIRQ_REPORT);
> >> +            }
> >> +
> >> +            /*
> >> +             * We do not want the "watchdog: " prefix on every line,
> >> +             * hence we use "printk" instead of "pr_crit".
> >> +             */
> >
> > You are not providing any justification why the prefix is not
> > wanted. Just saying 'We do not want' does not cut it and who is 'We'. I
> > certainly not.
> >
> > I really disagree because the prefixes are very useful for searching log
> > files. So not having it makes it harder to filter out for no reason.
> >
>
>
> Regarding the use of printk() instead of pr_crit(), I have had a
> discussion with Liu Song and Douglas in PATCHv1:
> https://lore.kernel.org/all/CAD=FV=WEEQeKX=ec3Gr-8CKs2K0MaWN3V0-0yOsuret0qcB_AA@mail.gmail.com/
>
> Please allow me to elaborate on my reasoning. The purpose of the
> report_cpu_status() function I implemented is similar to that of
> print_modules(), show_regs(), and dump_stack(). These functions are
> designed to assist in analyzing the causes of a soft lockup, rather
> than to report that a soft lockup has occurred. Therefore, I think
> that adding the "watchdog: " prefix to every line is redundant and
> not concise. Besides, the existing pr_emerg() in the watchdog.c file
> is already sufficient for searching useful information in the logs.
> The information I added, along with the call tree and other data, is
> located near the line with the "watchdog: " prefix.
>
> Are the two reasons I've provided reasonable?

FWIW I don't feel super strongly about this, but I'm leaning towards
agreeing with Bitao. The sample output from the commit message looks
like this:

[  638.870231] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [swapper/9:0]
[  638.870825] CPU#9 Utilization every 4s during lockup:
[  638.871194]  #1:   0% system,          0% softirq,   100% hardirq,
   0% idle
[  638.871652]  #2:   0% system,          0% softirq,   100% hardirq,
   0% idle
[  638.872107]  #3:   0% system,          0% softirq,   100% hardirq,
   0% idle
[  638.872563]  #4:   0% system,          0% softirq,   100% hardirq,
   0% idle
[  638.873018]  #5:   0% system,          0% softirq,   100% hardirq,
   0% idle
[  638.873494] CPU#9 Detect HardIRQ Time exceeds 50%. Most frequent HardIRQs:
[  638.873994]  #1: 330945      irq#7
[  638.874236]  #2: 31          irq#82
[  638.874493]  #3: 10          irq#10
[  638.874744]  #4: 2           irq#89
[  638.874992]  #5: 1           irq#102

..and in my mind the "watchdog: BUG: soft lockup - CPU#9 stuck for
26s! [swapper/9:0]" line is enough to grep through the dmesg. Having
all the following lines start with "watchdog:" feels like overkill to
me, but if you feel strongly that they should then it wouldn't bother
me too much for them all to have the "watchdog:" prefix.

Could you clarify how strongly you feel about this and whether Bitao
should spin a v13?

I believe that this is the only point of contention on the patch
series right now and otherwise it could be ready to land. I know in
the past Petr has wanted ample time to comment though perhaps the fact
that it's been ~1 month is enough. Petr: do you have anything that
needs saying before this patch series lands?

Thanks!

-Doug

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ