Message-ID: <CAFA6WYO01=4KmE=MJFL9btSvP8-RmQHMO49pLwUrgR9-LFm5+Q@mail.gmail.com>
Date:   Fri, 24 Sep 2021 10:48:51 +0530
From:   Sumit Garg <sumit.garg@...aro.org>
To:     Pingfan Liu <kernelfans@...il.com>
Cc:     Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Marc Zyngier <maz@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Masahiro Yamada <masahiroy@...nel.org>,
        Sami Tolvanen <samitolvanen@...gle.com>,
        Petr Mladek <pmladek@...e.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Wang Qing <wangqing@...o.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Santosh Sivaraj <santosh@...six.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector

Hi Pingfan,

On Thu, 23 Sept 2021 at 19:59, Pingfan Liu <kernelfans@...il.com> wrote:
>
> On Thu, Sep 23, 2021 at 10:10 PM Pingfan Liu <kernelfans@...il.com> wrote:
> >
> > From: Sumit Garg <sumit.garg@...aro.org>
> >
> To Sumit, I think credits should go to you and keep you as the author.
>

Thanks, I am fine with it. If you like, you can add your
"Co-developed-by" tag as well.

> Please let me know if you dislike it.
>
> Thanks,
>
> Pingfan
> > With the recent feature added to enable perf events to use pseudo NMIs
> > as interrupts on platforms which support GICv3 or later, it is now
> > possible to enable the hard lockup detector (or NMI watchdog) on arm64
> > platforms. So enable the corresponding support.
> >
> > One thing to note here is that normally the lockup detector is
> > initialized just after the early initcalls, but the PMU on arm64 comes
> > up much later, as a device_initcall(). So we need to re-initialize
> > lockup detection once the PMU has been initialized.

This needs to be updated to reflect delayed initialization instead.
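
For reference, the probe ordering under the delayed model can be sketched
in userspace as below. This is only an illustration, not the kernel code:
probe_sketch(), pmu_irq_is_nmi and probe_sketch_demo() are hypothetical
stand-ins, and the real watchdog_nmi_probe() ends by calling
hardlockup_detector_perf_init() rather than returning 0 directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state (assumptions for illustration). */
bool hld_detector_delay_initialized; /* set once PMU driver init has run */
bool pmu_irq_is_nmi;                 /* stand-in for arm_pmu_irq_is_nmi() */

/* Mirrors the decision order of watchdog_nmi_probe() in the quoted patch. */
int probe_sketch(void)
{
	if (!hld_detector_delay_initialized)
		return -EBUSY;  /* PMU not probed yet: try again later */
	if (!pmu_irq_is_nmi)
		return -ENODEV; /* PMU is up, but its IRQ is not an NMI */
	return 0;               /* kernel: hardlockup_detector_perf_init() */
}

/* Walks through the three states in the order they occur at boot. */
void probe_sketch_demo(void)
{
	hld_detector_delay_initialized = false;
	pmu_irq_is_nmi = false;
	assert(probe_sketch() == -EBUSY);

	/* armv8_pmu_driver_init() sets the flag and wakes the waiter. */
	hld_detector_delay_initialized = true;
	assert(probe_sketch() == -ENODEV);

	pmu_irq_is_nmi = true;
	assert(probe_sketch() == 0);
}
```

The point of the -EBUSY case is that an early probe is not a permanent
failure, whereas -ENODEV is.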

-Sumit

> >
> > [1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org
> >
> > Signed-off-by: Sumit Garg <sumit.garg@...aro.org>
> > (Pingfan: adapt it to watchdog_hld async model based on [1])
> > Signed-off-by: Pingfan Liu <kernelfans@...il.com>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Ingo Molnar <mingo@...hat.com>
> > Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
> > Cc: Mark Rutland <mark.rutland@....com>
> > Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
> > Cc: Jiri Olsa <jolsa@...hat.com>
> > Cc: Namhyung Kim <namhyung@...nel.org>
> > Cc: Marc Zyngier <maz@...nel.org>
> > Cc: Kees Cook <keescook@...omium.org>
> > Cc: Masahiro Yamada <masahiroy@...nel.org>
> > Cc: Sami Tolvanen <samitolvanen@...gle.com>
> > Cc: Petr Mladek <pmladek@...e.com>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Wang Qing <wangqing@...o.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@...radead.org>
> > Cc: Santosh Sivaraj <santosh@...six.org>
> > Cc: linux-kernel@...r.kernel.org
> > To: linux-arm-kernel@...ts.infradead.org
> > ---
> >  arch/arm64/Kconfig               |  2 ++
> >  arch/arm64/kernel/Makefile       |  1 +
> >  arch/arm64/kernel/perf_event.c   | 11 ++++++++--
> >  arch/arm64/kernel/watchdog_hld.c | 36 ++++++++++++++++++++++++++++++++
> >  drivers/perf/arm_pmu.c           |  5 +++++
> >  include/linux/perf/arm_pmu.h     |  2 ++
> >  6 files changed, 55 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/arm64/kernel/watchdog_hld.c
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 5c7ae4c3954b..8287e9e1d28d 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -189,6 +189,8 @@ config ARM64
> >         select HAVE_NMI
> >         select HAVE_PATA_PLATFORM
> >         select HAVE_PERF_EVENTS
> > +       select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
> > +       select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
> >         select HAVE_PERF_REGS
> >         select HAVE_PERF_USER_STACK_DUMP
> >         select HAVE_REGS_AND_STACK_ACCESS_API
> > diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> > index 3f1490bfb938..789c2fe5bb90 100644
> > --- a/arch/arm64/kernel/Makefile
> > +++ b/arch/arm64/kernel/Makefile
> > @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)                 += module.o
> >  obj-$(CONFIG_ARM64_MODULE_PLTS)                += module-plts.o
> >  obj-$(CONFIG_PERF_EVENTS)              += perf_regs.o perf_callchain.o
> >  obj-$(CONFIG_HW_PERF_EVENTS)           += perf_event.o
> > +obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o
> >  obj-$(CONFIG_HAVE_HW_BREAKPOINT)       += hw_breakpoint.o
> >  obj-$(CONFIG_CPU_PM)                   += sleep.o suspend.o
> >  obj-$(CONFIG_CPU_IDLE)                 += cpuidle.o
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index b4044469527e..a34343d0f418 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/platform_device.h>
> >  #include <linux/sched_clock.h>
> >  #include <linux/smp.h>
> > +#include <linux/nmi.h>
> >
> >  /* ARMv8 Cortex-A53 specific event types. */
> >  #define ARMV8_A53_PERFCTR_PREF_LINEFILL                                0xC2
> > @@ -1284,10 +1285,16 @@ static struct platform_driver armv8_pmu_driver = {
> >
> >  static int __init armv8_pmu_driver_init(void)
> >  {
> > +       int ret;
> > +
> >         if (acpi_disabled)
> > -               return platform_driver_register(&armv8_pmu_driver);
> > +               ret = platform_driver_register(&armv8_pmu_driver);
> >         else
> > -               return arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +               ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +
> > +       hld_detector_delay_initialized = true;
> > +       wake_up(&hld_detector_wait);
> > +       return ret;
> >  }
> >  device_initcall(armv8_pmu_driver_init)
> >
> > diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
> > new file mode 100644
> > index 000000000000..379743e0d001
> > --- /dev/null
> > +++ b/arch/arm64/kernel/watchdog_hld.c
> > @@ -0,0 +1,36 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/nmi.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/perf/arm_pmu.h>
> > +
> > +/*
> > + * Safe maximum CPU frequency in case a particular platform doesn't implement
> > + * a cpufreq driver. The architecture doesn't put any restriction on the
> > + * maximum frequency, but 5 GHz seems to be a safe maximum given that the
> > + * Arm CPUs available on the market are clocked much lower than 5 GHz. On the
> > + * other hand, we can't make it much higher as it would lead to a large
> > + * hard-lockup detection timeout on parts which run slower (e.g. 1 GHz on
> > + * Developerbox) and don't have a cpufreq driver.
> > + */
> > +#define SAFE_MAX_CPU_FREQ      5000000000UL // 5 GHz
> > +u64 hw_nmi_get_sample_period(int watchdog_thresh)
> > +{
> > +       unsigned int cpu = smp_processor_id();
> > +       unsigned long max_cpu_freq;
> > +
> > +       max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
> > +       if (!max_cpu_freq)
> > +               max_cpu_freq = SAFE_MAX_CPU_FREQ;
> > +
> > +       return (u64)max_cpu_freq * watchdog_thresh;
> > +}
> > +
> > +int __init watchdog_nmi_probe(void)
> > +{
> > +       if (!hld_detector_delay_initialized)
> > +               return -EBUSY;
> > +       else if (!arm_pmu_irq_is_nmi())
> > +               return -ENODEV;
> > +
> > +       return hardlockup_detector_perf_init();
> > +}
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 3cbc3baf087f..2aecb0c34290 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
> >         return per_cpu(hw_events->irq, cpu);
> >  }
> >
> > +bool arm_pmu_irq_is_nmi(void)
> > +{
> > +       return has_nmi;
> > +}
> > +
> >  /*
> >   * PMU hardware loses all context when a CPU goes offline.
> >   * When a CPU is hotplugged back in, since some hardware registers are
> > diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> > index 505480217cf1..bf7966776c55 100644
> > --- a/include/linux/perf/arm_pmu.h
> > +++ b/include/linux/perf/arm_pmu.h
> > @@ -163,6 +163,8 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
> >  static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
> >  #endif
> >
> > +bool arm_pmu_irq_is_nmi(void);
> > +
> >  /* Internal functions only for core arm_pmu code */
> >  struct arm_pmu *armpmu_alloc(void);
> >  struct arm_pmu *armpmu_alloc_atomic(void);
> > --
> > 2.31.1
> >
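
The sample period arithmetic in hw_nmi_get_sample_period() above can be
checked with a small standalone sketch. It is an illustration under stated
assumptions, not the kernel code: sample_period_sketch(),
timeout_seconds_sketch() and their parameters are hypothetical, the
max_freq_khz argument replaces the cpufreq_get_hw_max_freq() call, and a
threshold of 10 seconds is assumed for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SAFE_MAX_CPU_FREQ_HZ 5000000000ULL /* 5 GHz fallback, as in the patch */

/*
 * max_freq_khz: frequency a cpufreq driver would report, in kHz, or 0 when
 * the platform has no cpufreq driver (hypothetical parameter).
 * Returns the number of CPU cycles between watchdog samples.
 */
uint64_t sample_period_sketch(uint64_t max_freq_khz, int watchdog_thresh)
{
	uint64_t max_freq_hz = max_freq_khz * 1000ULL;

	if (!max_freq_hz)
		max_freq_hz = SAFE_MAX_CPU_FREQ_HZ;

	return max_freq_hz * (uint64_t)watchdog_thresh;
}

/* Wall-clock seconds the period takes on a CPU really running at real_hz. */
uint64_t timeout_seconds_sketch(uint64_t period_cycles, uint64_t real_hz)
{
	return period_cycles / real_hz;
}

void sample_period_demo(void)
{
	/* 1 GHz part with a cpufreq driver: period matches the threshold. */
	uint64_t with_cpufreq = sample_period_sketch(1000000, 10);
	assert(timeout_seconds_sketch(with_cpufreq, 1000000000ULL) == 10);

	/* Same part without cpufreq: the 5 GHz fallback stretches it 5x. */
	uint64_t fallback = sample_period_sketch(0, 10);
	assert(timeout_seconds_sketch(fallback, 1000000000ULL) == 50);
}
```

This is the trade-off the SAFE_MAX_CPU_FREQ comment describes: the higher
the fallback frequency, the longer the effective detection timeout on slow
parts that lack a cpufreq driver.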
