Date: Wed, 7 Feb 2024 14:18:38 +0800
From: Bitao Hu <yaoma@...ux.alibaba.com>
To: Doug Anderson <dianders@...omium.org>
Cc: akpm@...ux-foundation.org, pmladek@...e.com, kernelfans@...il.com,
 liusong@...ux.alibaba.com, linux-kernel@...r.kernel.org,
 yaoma@...ux.alibaba.com
Subject: Re: [PATCHv5 1/3] watchdog/softlockup: low-overhead detection of
 interrupt

Hi,

On 2024/2/7 05:41, Doug Anderson wrote:
> Hi,
> 
> On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@...ux.alibaba.com> wrote:
>>
>> The following softlockup is caused by an interrupt storm, but it cannot be
>> identified from the call tree, because the call tree is just a snapshot
>> and doesn't fully capture the behavior of the CPU during the soft lockup.
>>    watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
>>    ...
>>    Call trace:
>>      __do_softirq+0xa0/0x37c
>>      __irq_exit_rcu+0x108/0x140
>>      irq_exit+0x14/0x20
>>      __handle_domain_irq+0x84/0xe0
>>      gic_handle_irq+0x80/0x108
>>      el0_irq_naked+0x50/0x58
>>
>> Therefore, I think it is necessary to report CPU utilization during the
>> softlockup_thresh period (report once every sample_period, for a total
>> of 5 reportings), like this:
>>    watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
>>    CPU#28 Utilization every 4s during lockup:
>>      #1: 0% system, 0% softirq, 100% hardirq, 0% idle
>>      #2: 0% system, 0% softirq, 100% hardirq, 0% idle
>>      #3: 0% system, 0% softirq, 100% hardirq, 0% idle
>>      #4: 0% system, 0% softirq, 100% hardirq, 0% idle
>>      #5: 0% system, 0% softirq, 100% hardirq, 0% idle
>>    ...
>>
>> This would be helpful in determining whether an interrupt storm has
>> occurred or in identifying the cause of the softlockup. The criteria for
>> determination are as follows:
>>    a. If the hardirq utilization is high, then an interrupt storm should be
>>    considered and the root cause cannot be determined from the call tree.
>>    b. If the softirq utilization is high, then we could analyze the call
>>    tree but it may not reflect the root cause.
>>    c. If the system utilization is high, then we could analyze the root
>>    cause from the call tree.
>>
>> Signed-off-by: Bitao Hu <yaoma@...ux.alibaba.com>
>> ---
>>   kernel/watchdog.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 89 insertions(+)
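
A minimal userspace sketch of criteria a-c from the commit message above, for illustration only. The patch does not define what "high" means, so HIGH_UTIL_PCT, struct util_sample, enum lockup_hint and classify_sample() are all hypothetical names and values, not part of the patch.

```c
/* Hypothetical cut-off for "high" utilization; the patch defines none. */
#define HIGH_UTIL_PCT 50

/* Hypothetical result type mirroring criteria a, b and c. */
enum lockup_hint {
	HINT_IRQ_STORM,	/* a: suspect an interrupt storm, call tree unreliable */
	HINT_SOFTIRQ,	/* b: call tree may not reflect the root cause */
	HINT_SYSTEM,	/* c: analyze the root cause from the call tree */
	HINT_NONE,	/* no category is above the cut-off */
};

/* One reported line: percentages for a single sample_period. */
struct util_sample {
	unsigned int system;
	unsigned int softirq;
	unsigned int hardirq;
	unsigned int idle;
};

/* Check the criteria in the order a, b, c and return the first match. */
static enum lockup_hint classify_sample(const struct util_sample *s)
{
	if (s->hardirq >= HIGH_UTIL_PCT)
		return HINT_IRQ_STORM;
	if (s->softirq >= HIGH_UTIL_PCT)
		return HINT_SOFTIRQ;
	if (s->system >= HIGH_UTIL_PCT)
		return HINT_SYSTEM;
	return HINT_NONE;
}
```

With the sample from the report above (0% system, 0% softirq, 100% hardirq, 0% idle), classify_sample() returns HINT_IRQ_STORM.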
> 
> On v4 you got Liu Song's Reviewed-by and I don't think this is
> massively different than v4. I would have expected you to carry the
> tag forward. In any case, I guess Liu Song can give it again.
> 
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 81a8862295d6..71d5b6dfa358 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -16,6 +16,8 @@
>>   #include <linux/cpu.h>
>>   #include <linux/nmi.h>
>>   #include <linux/init.h>
>> +#include <linux/kernel_stat.h>
>> +#include <linux/math64.h>
>>   #include <linux/module.h>
>>   #include <linux/sysctl.h>
>>   #include <linux/tick.h>
>> @@ -333,6 +335,90 @@ __setup("watchdog_thresh=", watchdog_thresh_setup);
>>
>>   static void __lockup_detector_cleanup(void);
>>
>> +#ifdef CONFIG_IRQ_TIME_ACCOUNTING
>> +#define NUM_STATS_GROUPS       5
>> +#define NUM_STATS_PER_GROUP    4
>> +enum stats_per_group {
>> +       STATS_SYSTEM,
>> +       STATS_SOFTIRQ,
>> +       STATS_HARDIRQ,
>> +       STATS_IDLE,
> 
> nit: I still would have left "NUM_STATS_PER_GROUP" here instead of as
> a separate #define.
OK.
> 
> 
>> +static void print_cpustat(void)
>> +{
>> +       int i, group;
>> +       u8 tail = __this_cpu_read(cpustat_tail);
> 
> Sorry for not noticing before, but why are you using
> "__this_cpu_read()" instead of "this_cpu_read()"? In other words, why
> do you need the double-underscore version everywhere? I don't think
> you do, do you?
I also struggled with which version to use. The variants without the
double underscore provide preemption/interrupt protection, but
watchdog.c uses the double-underscore variants. My analysis is that my
code is also safe without that protection, so for consistency with
watchdog.c I used the double-underscore variants.

Is my approach reasonable? If not, I will switch to using the
non-underscored version.
> 
> 
>> +       u64 sample_period_second = sample_period;
>> +
>> +       do_div(sample_period_second, NSEC_PER_SEC);
>> +       /*
>> +        * We do not want the "watchdog: " prefix on every line,
>> +        * hence we use "printk" instead of "pr_crit".
>> +        */
>> +       printk(KERN_CRIT "CPU#%d Utilization every %llus during lockup:\n",
>> +               smp_processor_id(), sample_period_second);
>> +       for (i = 0; i < NUM_STATS_GROUPS; i++) {
>> +               group = (tail + i) % NUM_STATS_GROUPS;
>> +               printk(KERN_CRIT "\t#%d: %3u%% system,\t%3u%% softirq,\t"
>> +                       "%3u%% hardirq,\t%3u%% idle\n", i+1,
> 
> nit: though I don't care too much in this case, I think kernel folks
> slightly prefer "i + 1" instead of "i+1". Running
> "./scripts/checkpatch.pl --strict" will give a warning about this, for
> instance. Actually, "./scripts/checkpatch.pl --strict" has a few extra
> style nits that you could consider fixing.
Thanks for the reminder. I will run "./scripts/checkpatch.pl --strict"
on these patches and fix what it reports.
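
For readers following the quoted loop, here is a small userspace sketch of the indexing it uses, group = (tail + i) % NUM_STATS_GROUPS. It assumes that tail is the index of the oldest sample, so the walk goes from oldest to newest. ring_oldest_first() is a hypothetical helper written for this example, not a function from the patch.

```c
#define NUM_STATS_GROUPS	5

/*
 * Copy the samples in ring[] into out[] in oldest-first order.
 * Assumption: tail is the index of the oldest sample in the ring.
 */
static void ring_oldest_first(const unsigned int ring[NUM_STATS_GROUPS],
			      unsigned int tail,
			      unsigned int out[NUM_STATS_GROUPS])
{
	unsigned int i, group;

	for (i = 0; i < NUM_STATS_GROUPS; i++) {
		/* Same index arithmetic as the quoted print loop. */
		group = (tail + i) % NUM_STATS_GROUPS;
		out[i] = ring[group];
	}
}
```

For example, with ring = {30, 40, 50, 10, 20} and tail = 3, out becomes {10, 20, 30, 40, 50}, which corresponds to lines #1 to #5 of the report.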
> 
> 
>> +static void report_cpu_status(void)
>> +{
>> +       print_cpustat();
>> +}
> 
> I don't understand why you need the extra wrapper. You didn't have it
> on v3 and I don't see any reason why you introduced it. Ah, I see, in
> the next patch you add something to it. OK, I guess it's fine to
> introduce it here.
Yes, I added this wrapper to prepare for the next patch, so that
"print_irq_counts" does not need a forward declaration.
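
A rough userspace sketch of the layout I have in mind. The function bodies here are stand-ins that only record the call order in report_log (a buffer that exists only in this example); the real print_irq_counts() comes in the next patch. Because the wrapper is defined after both helpers, and callers only use the wrapper, no forward declaration is needed.

```c
#include <string.h>

/* Example-only buffer recording which helpers ran, in order. */
static char report_log[64];

/* Stand-in for the helper added by this patch. */
static void print_cpustat(void)
{
	strcat(report_log, "cpustat;");
}

/* Stand-in for the helper added by the next patch. */
static void print_irq_counts(void)
{
	strcat(report_log, "irq_counts;");
}

/*
 * Defined after both helpers, so neither needs a forward
 * declaration. Callers only ever use this wrapper.
 */
static void report_cpu_status(void)
{
	print_cpustat();
	print_irq_counts();
}
```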


> 
> -Doug
