[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=V8VcmEDDpaWZi40j5dkP+HyBPFr=D_mKFG-YmXTpa_AA@mail.gmail.com>
Date: Tue, 6 Feb 2024 13:42:02 -0800
From: Doug Anderson <dianders@...omium.org>
To: Bitao Hu <yaoma@...ux.alibaba.com>
Cc: akpm@...ux-foundation.org, pmladek@...e.com, kernelfans@...il.com,
liusong@...ux.alibaba.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv5 2/3] watchdog/softlockup: report the most frequent interrupts
Hi,
On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@...ux.alibaba.com> wrote:
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 71d5b6dfa358..26dc1ad86276 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -18,6 +18,9 @@
> #include <linux/init.h>
> #include <linux/kernel_stat.h>
> #include <linux/math64.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> +#include <linux/bitops.h>
These are still not sorted alphabetically. "irq.h" and "irqdesc.h"
should go between "init.h" and "kernel_stat.h". "bitops.h" is trickier
because the existing headers are not quite sorted. Probably the best
would be to fully sort them. They should end up like this:
#include <linux/bitops.h>
#include <linux/cpu.h>
#include <linux/init.h>
#include <linux/irq.h>
#include <linux/irqdesc.h>
#include <linux/kernel_stat.h>
#include <linux/kvm_para.h>
#include <linux/math64.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/nmi.h>
#include <linux/stop_machine.h>
#include <linux/sysctl.h>
#include <linux/tick.h>
#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
#include <linux/sched/isolation.h>
#include <asm/irq_regs.h>
> +static void start_counting_irqs(void)
> +{
> + int i;
> + struct irq_desc *desc;
> + u32 *counts = __this_cpu_read(hardirq_counts);
> + int cpu = smp_processor_id();
> +
> + if (!test_bit(cpu, softlockup_hardirq_cpus)) {
I don't think you need "softlockup_hardirq_cpus", do you? Just read
"actual_nr_irqs" and see if it's non-zero? ...or read "hardirq_counts"
and see if it's non-NULL?
> + /*
> + * nr_irqs has the potential to grow at runtime. We should read
> + * it and store locally to avoid array out-of-bounds access.
> + */
> + __this_cpu_write(actual_nr_irqs, nr_irqs);
nit: IMO store nr_irqs in a local variable to avoid all of the
"__this_cpu_read" calls everywhere. Then just write it once from your
local variable.
> + counts = kmalloc_array(__this_cpu_read(actual_nr_irqs),
> + sizeof(u32),
> + GFP_ATOMIC);
should use "kcalloc()" so the array is zeroed. That way if the set of
non-NULL "desc"s changes between calls you don't end up reading
uninitialized memory.
> +static void stop_counting_irqs(void)
> +{
> + u32 *counts = __this_cpu_read(hardirq_counts);
> + int cpu = smp_processor_id();
> +
> + if (test_bit(cpu, softlockup_hardirq_cpus)) {
> + kfree(counts);
> + counts = NULL;
> + __this_cpu_write(hardirq_counts, counts);
nit: don't really need to set the local "counts" to NULL. Just:
__this_cpu_write(hardirq_counts, NULL);
..and actually if you take my advice above and get rid of
"softlockup_hardirq_cpus" then this function just becomes:
kfree(__this_cpu_read(hardirq_counts));
__this_cpu_write(hardirq_counts, NULL);
Since kfree() handles when you pass it NULL...
> +static void print_irq_counts(void)
> +{
> + int i;
> + struct irq_desc *desc;
> + u32 counts_diff;
> + u32 *counts = __this_cpu_read(hardirq_counts);
> + int cpu = smp_processor_id();
> + struct irq_counts irq_counts_sorted[NUM_HARDIRQ_REPORT] = {
> + {-1, 0}, {-1, 0}, {-1, 0}, {-1, 0},
> + };
> +
> + if (test_bit(cpu, softlockup_hardirq_cpus)) {
> + for_each_irq_desc(i, desc) {
> + if (!desc)
> + continue;
The "if" test above isn't needed. The "for_each_irq_desc()" macro
already checks for NULL.
> + /*
> + * We need to bounds-check in case someone on a different CPU
> + * expanded nr_irqs.
> + */
> + if (i < __this_cpu_read(actual_nr_irqs))
> + counts_diff = desc->kstat_irqs ?
> + *this_cpu_ptr(desc->kstat_irqs) - counts[i] : 0;
> + else
> + counts_diff = desc->kstat_irqs ?
> + *this_cpu_ptr(desc->kstat_irqs) : 0;
Why do you need to test "kstat_irqs" for 0? Also, ideally don't
duplicate the math. In other words, I'd expect this (untested):
if (i < __this_cpu_read(actual_nr_irqs))
count = counts[i];
else
count = 0;
counts_diff = *this_cpu_ptr(desc->kstat_irqs) - count;
I guess I'd also put "__this_cpu_read(actual_nr_irqs)" in a local
variable like you do with counts...
Powered by blists - more mailing lists