Message-ID: <CAEEQ3wn0UckgNTJCPCNDtHW8DwWAJR1PXURvH57yfHATYqs7Gw@mail.gmail.com>
Date: Sun, 28 Sep 2025 10:38:10 +0800
From: yunhui cui <cuiyunhui@...edance.com>
To: Doug Anderson <dianders@...omium.org>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, ziqianlu@...edance.com
Cc: akpm@...ux-foundation.org, alex@...ti.fr, anup@...infault.org,
aou@...s.berkeley.edu, atish.patra@...ux.dev, catalin.marinas@....com,
johannes@...solutions.net, lihuafei1@...wei.com, mark.rutland@....com,
masahiroy@...nel.org, maz@...nel.org, mingo@...nel.org,
nicolas.schier@...ux.dev, palmer@...belt.com, paul.walmsley@...ive.com,
suzuki.poulose@....com, thorsten.blum@...ux.dev, wangjinchao600@...il.com,
will@...nel.org, yangyicong@...ilicon.com, zhanjie9@...ilicon.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, linux-riscv@...ts.infradead.org
Subject: Re: [External] Re: [PATCH v2 1/2] watchdog: move arm64 watchdog_hld
into common code
Hi Doug,
On Sat, Sep 27, 2025 at 4:58 AM Doug Anderson <dianders@...omium.org> wrote:
>
> Hi,
>
> On Fri, Sep 26, 2025 at 2:37 AM yunhui cui <cuiyunhui@...edance.com> wrote:
> >
> > Hi Doug,
> >
> > On Fri, Sep 26, 2025 at 4:00 AM Doug Anderson <dianders@...omium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Sep 25, 2025 at 1:48 AM Yunhui Cui <cuiyunhui@...edance.com> wrote:
> > > >
> > > > @@ -17,6 +17,7 @@
> > > > #include <linux/cpu_pm.h>
> > > > #include <linux/export.h>
> > > > #include <linux/kernel.h>
> > > > +#include <linux/nmi.h>
> > > > #include <linux/perf/arm_pmu.h>
> > > > #include <linux/slab.h>
> > > > #include <linux/sched/clock.h>
> > > > @@ -696,10 +697,12 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
> > > > return per_cpu(hw_events->irq, cpu);
> > > > }
> > > >
> > > > -bool arm_pmu_irq_is_nmi(void)
> > > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
> > > > +bool arch_perf_nmi_is_available(void)
> > > > {
> > > > return has_nmi;
> > > > }
> > > > +#endif
> > >
> > > Should the previous comment move here, AKA:
> > >
> > > /*
> > > * hardlockup_detector_perf_init() will succeed even if Pseudo-NMI is turned off,
> >
> > Okay, we also need to change it to “watchdog_hardlockup_probe()”
> >
> > > * however, the PMU interrupts will act like normal interrupts instead of
> > > * NMIs and the hardlockup detector would be broken.
> > > */
> > >
> > >
> > > > +static int __init init_watchdog_freq_notifier(void)
> > > > +{
> > > > + return cpufreq_register_notifier(&watchdog_freq_notifier,
> > > > + CPUFREQ_POLICY_NOTIFIER);
> > >
> > > I think you need to do something to prevent this from happening on any
> > > platforms that override hw_nmi_get_sample_period(), right? These
> > > cpufreq notifiers will be useless in that case...
> >
> > My understanding is that this is not a problem. watchdog_perf uses
> > PERF_COUNT_HW_CPU_CYCLES, so it is inherently tied to the CPU's core
> > frequency. After this change, a larger value may be used as the
> > period, so the NMI interval becomes longer, but the value does not
> > change after the system boots.
>
> I'm not sure I follow. On x86, hw_nmi_get_sample_period() is:
>
> u64 hw_nmi_get_sample_period(int watchdog_thresh)
> {
> return (u64)(cpu_khz) * 1000 * watchdog_thresh;
> }
I have added the maintainers for arch/x86.
1. cpu_khz can be understood as the CPU's base operating frequency,
e.g. 2.3 GHz. In practice, the core frequency may scale down to
800 MHz under low load and boost up to 4.4 GHz under high load.
2. Because the event handed to the PMU is PERF_COUNT_HW_CPU_CYCLES and
the sample period is derived from the 2.3 GHz base frequency, the
interval between watchdog_overflow_callback() invocations is not
fixed; it varies with the actual core frequency. When the CPU runs
above 2.3 GHz the NMI interval shortens; below it, the interval
lengthens.
3. After our modification, if the architecture is not integrated with
cpufreq, the helper returns 0 and the period is not updated. If it is
integrated, it returns the maximum frequency the CPU supports, so the
NMI interval is only slightly lengthened, with no impact on the actual
hardlockup detection function.
4. I have also conducted tests:
stress-ng --cpu 1 --taskset 1 --cpu-load 80
echo 800000 > scaling_max_freq
turbostat shows that Bzy_MHz and TSC_MHz are 800 and 2300 respectively.
and the NMI interval grew to approximately 30 seconds:
[ 2309.442743] NMI watchdog: ------ watchdog overflow callback, cpu = 1
[ 2341.526032] NMI watchdog: ------ watchdog overflow callback, cpu = 1
>
> On PowerPC it's:
>
> u64 hw_nmi_get_sample_period(int watchdog_thresh)
> {
> return ppc_proc_freq * watchdog_thresh;
> }
>
> Neither of those are necessarily based on the results the cpufreq
> reports. ...so therefore I don't think you should be taking the
> cpufreq reported numbers and assuming they're OK on x86 and powerpc.
>
> -Doug
>
Thanks,
Yunhui