[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3f4b7dbe-93c1-4cb0-8233-18e8432409f8@linux.alibaba.com>
Date: Fri, 23 Feb 2024 15:18:48 +0800
From: Bitao Hu <yaoma@...ux.alibaba.com>
To: Thomas Gleixner <tglx@...utronix.de>, dianders@...omium.org,
akpm@...ux-foundation.org, liusong@...ux.alibaba.com, pmladek@...e.com,
kernelfans@...il.com, deller@....de, npiggin@...il.com,
tsbogend@...ha.franken.de, James.Bottomley@...senPartnership.com,
jan.kiszka@...mens.com
Cc: linux-kernel@...r.kernel.org, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
yaoma@...ux.alibaba.com
Subject: Re: [PATCHv9 2/3] irq: use a struct for the kstat_irqs in the
interrupt descriptor
Hi,
On 2024/2/22 21:22, Thomas Gleixner wrote:
> On Thu, Feb 22 2024 at 17:34, Bitao Hu wrote:
>
> First of all the subsystem prefix is 'genirq:'. 'git log kernel/irq/'
> gives you a pretty good hint. It's documented....
>
> Secondly the subject line does not match what this patch is about. It's
> not about using a struct, it's about providing a snapshot mechanism, no?
>
>> The current implementation uses an int for the kstat_irqs in the
>> interrupt descriptor.
>>
>> However, we need to know the number of interrupts which happened
>> since softlockup detection took a snapshot in order to analyze
>> the problem caused by an interrupt storm.
>>
>> Replacing an int with a struct and providing sensible interfaces
>> for the watchdog code can keep it self contained to the interrupt
>> core code.
>
> So something like this makes a useful change log for this:
>
> Subject: genirq: Provide a snapshot mechanism for interrupt statistics
>
> The soft lockup detector lacks a mechanism to identify interrupt storms
> as root cause of a lockup. To enable this the detector needs a
> mechanism to snapshot the interrupt count statistics on a CPU when the
> detector observes a potential lockup scenario and compare that against
> the interrupt count when it warns about the lockup later on. The number
> of interrupts in that period give a hint whether the lockup might be
> caused by an interrupt storm.
>
> Instead of having extra storage in the lockup detector and accessing
> the internals of the interrupt descriptor directly, convert the per CPU
> irq_desc::kstat_irq member to a data structure which contains the
> counter plus a snapshot member and provide interfaces to take a
> snapshot of all interrupts on the current CPU and to retrieve the delta
> of a specific interrupt later on.
>
Thanks, the changelog you wrote very clearly articulates the purpose of
this patch.
> Hmm?
>
>> Signed-off-by: Bitao Hu <yaoma@...ux.alibaba.com>
>
> Interesting. You fully authored the patch?
>
> That's not how it works. You cannot take work from others and claim that
> it is yours. The minimal courtesy is to add a 'Originally-by:' tag.
>
I'm very sorry, the majority of this patch is your work, I will add an
'Originally-by:' tag.
>> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
>> index 623b8136e9af..3ad40cf30c66 100644
>> --- a/kernel/irq/proc.c
>> +++ b/kernel/irq/proc.c
>> @@ -488,18 +488,15 @@ int show_interrupts(struct seq_file *p, void *v)
>> if (!desc || irq_settings_is_hidden(desc))
>> goto outsparse;
>>
>> - if (desc->kstat_irqs) {
>> - for_each_online_cpu(j)
>> - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
>> - }
>> + if (desc->kstat_irqs)
>> + any_count = data_race(desc->tot_count);
>
> This is an unrelated change and needs to be split out into a separate
> patch with a proper changelog which explains why this is equivalent.
>
Alright, I will remove this change witch is not related to the purpose
of this patch.
I guess that the purpose of suggesting this change in your V1 response
was to speedup the 'show_interrupts'. However, after reviewing the
usage of 'desc->tot_count' in 'unsigned int kstat_irqs(unsigned int
irq)', I think the change might be as follows:
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af..53b8d6edd7ac 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -489,8 +489,13 @@ int show_interrupts(struct seq_file *p, void *v)
goto outsparse;
if (desc->kstat_irqs) {
- for_each_online_cpu(j)
- any_count |=
data_race(per_cpu(desc->kstat_irqs->cnt, j));
+ if (!irq_settings_is_per_cpu_devid(desc) &&
+ !irq_settings_is_per_cpu(desc) &&
+ !irq_is_nmi(desc))
+ any_count = data_race(desc->tot_count);
+ else
+ for_each_online_cpu(j)
+ any_count |=
data_race(per_cpu(desc->kstat_irqs->cnt, j));
}
if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)
Is my idea correct?
Best Regards,
Bitao
Powered by blists - more mailing lists