linux-kernel - Re: [PATCH] watchdog/hardlockup: Avoid large stack frames in watchdog_hardlockup_check()


Message-ID: <CAD=FV=V5hx7Zy-XMB=sPYcD_h-iP5VknmEoJwvw3Akd_1wDnRw@mail.gmail.com>
Date:   Tue, 1 Aug 2023 07:16:15 -0700
From:   Doug Anderson <dianders@...omium.org>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Petr Mladek <pmladek@...e.com>,
        kernel test robot <lkp@...el.com>,
        Lecopzer Chen <lecopzer.chen@...iatek.com>,
        Pingfan Liu <kernelfans@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] watchdog/hardlockup: Avoid large stack frames in watchdog_hardlockup_check()

Hi,

On Tue, Aug 1, 2023 at 5:58 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 31-07-23 09:17:59, Douglas Anderson wrote:
> > After commit 77c12fc95980 ("watchdog/hardlockup: add a "cpu" param to
> > watchdog_hardlockup_check()") we started storing a `struct cpumask` on
> > the stack in watchdog_hardlockup_check(). On systems with
> > CONFIG_NR_CPUS set to 8192 this takes up 1K on the stack. That
> > triggers warnings with `CONFIG_FRAME_WARN` set to 1024.
> >
> > Instead of putting this `struct cpumask` on the stack, let's declare
> > it as `static`. This has the downside of taking up 1K of memory all
> > the time on systems with `CONFIG_NR_CPUS` set to 8192, but on systems with
> > smaller `CONFIG_NR_CPUS` it's not much memory (with 128 CPUs it's only
> > 16 bytes of memory). Presumably anyone building a system with
> > `CONFIG_NR_CPUS=8192` can afford the extra 1K of memory.
> >
> > NOTE: as part of this change, we no longer check the return value of
> > trigger_single_cpu_backtrace(). While we could do this and only call
> > cpumask_clear_cpu() if trigger_single_cpu_backtrace() didn't fail,
> > that's probably not worth it. There's no reason to believe that
> > trigger_cpumask_backtrace() will succeed at backtracing the CPU when
> > trigger_single_cpu_backtrace() failed.
> >
> > Alternatives considered:
> > - Use kmalloc with GFP_ATOMIC to allocate. I decided against this
> >   since relying on kmalloc when the system is hard locked up seems
> >   like a bad idea.
> > - Change the arch_trigger_cpumask_backtrace() across all architectures
> >   to take an extra parameter to get the needed behavior. This seems
> >   like a lot of churn for a small savings.
> >
> > Fixes: 77c12fc95980 ("watchdog/hardlockup: add a "cpu" param to watchdog_hardlockup_check()")
> > Reported-by: kernel test robot <lkp@...el.com>
> > Closes: https://lore.kernel.org/r/202307310955.pLZDhpnl-lkp@intel.com
> > Signed-off-by: Douglas Anderson <dianders@...omium.org>
> > ---
> >
> >  kernel/watchdog.c | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index be38276a365f..19db2357969a 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -151,9 +151,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> >        */
> >       if (is_hardlockup(cpu)) {
> >               unsigned int this_cpu = smp_processor_id();
> > -             struct cpumask backtrace_mask;
> > -
> > -             cpumask_copy(&backtrace_mask, cpu_online_mask);
> >
> >               /* Only print hardlockups once. */
> >               if (per_cpu(watchdog_hardlockup_warned, cpu))
> > @@ -167,10 +164,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> >                               show_regs(regs);
> >                       else
> >                               dump_stack();
> > -                     cpumask_clear_cpu(cpu, &backtrace_mask);
> >               } else {
> > -                     if (trigger_single_cpu_backtrace(cpu))
> > -                             cpumask_clear_cpu(cpu, &backtrace_mask);
> > +                     trigger_single_cpu_backtrace(cpu);
> >               }
> >
> >               /*
> > @@ -178,8 +173,13 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> >                * hardlockups generating interleaving traces
> >                */
> >               if (sysctl_hardlockup_all_cpu_backtrace &&
> > -                 !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped))
> > +                 !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped)) {
> > +                     static struct cpumask backtrace_mask;
> > +
> > +                     cpumask_copy(&backtrace_mask, cpu_online_mask);
> > +                     cpumask_clear_cpu(cpu, &backtrace_mask);
> >                       trigger_cpumask_backtrace(&backtrace_mask);
>
> This looks rather wasteful: the cpumask is copied here only to be
> copied again into backtrace_mask in nmi_trigger_cpumask_backtrace()
> (which all arches but sparc do, AFAICS).
>
> Would it be possible to use arch_trigger_cpumask_backtrace(cpu_online_mask, false)
> and special case cpu != this_cpu && sysctl_hardlockup_all_cpu_backtrace?

So you're saying optimize the case where cpu == this_cpu and then have
a special case (where we still copy) for cpu != this_cpu? I can do
that if that's what people want, but (assuming I understand correctly)
that's making the wrong tradeoff. Specifically, this code runs one
time right as we're crashing and if it takes an extra millisecond to
run it's not a huge deal. It feels better to avoid the special case
and keep the code smaller.

Let me know if I misunderstood.

-Doug
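
The size figures quoted in the commit message (1K for 8192 CPUs, 16 bytes
for 128 CPUs) and the copy-then-clear pattern from the patch can be
sketched in userspace C. This is only an illustration under stated
assumptions: every demo_* name below is hypothetical and stands in for the
kernel's cpumask helpers; it is not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define DEMO_NR_CPUS        8192u
#define DEMO_BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit per possible CPU, stored in an array of longs. */
struct demo_cpumask {
	unsigned long bits[DEMO_LONGS(DEMO_NR_CPUS)];
};

/* Bytes needed for a mask of nr_cpus bits, rounded up to whole longs. */
static size_t demo_mask_bytes(size_t nr_cpus)
{
	return DEMO_LONGS(nr_cpus) * sizeof(unsigned long);
}

static void demo_set_cpu(unsigned int cpu, struct demo_cpumask *m)
{
	assert(cpu < DEMO_NR_CPUS);
	m->bits[cpu / DEMO_BITS_PER_LONG] |= 1UL << (cpu % DEMO_BITS_PER_LONG);
}

static void demo_clear_cpu(unsigned int cpu, struct demo_cpumask *m)
{
	assert(cpu < DEMO_NR_CPUS);
	m->bits[cpu / DEMO_BITS_PER_LONG] &= ~(1UL << (cpu % DEMO_BITS_PER_LONG));
}

static int demo_test_cpu(unsigned int cpu, const struct demo_cpumask *m)
{
	assert(cpu < DEMO_NR_CPUS);
	return (int)((m->bits[cpu / DEMO_BITS_PER_LONG] >>
		      (cpu % DEMO_BITS_PER_LONG)) & 1UL);
}

/*
 * Mirrors the shape of the patched code: the mask lives in static storage
 * instead of on the stack, is filled from the "online" mask, and then has
 * the already-reported CPU removed. Like any function-local static, the
 * returned mask is shared between calls, so this is not reentrant.
 */
static const struct demo_cpumask *
demo_all_but(unsigned int cpu, const struct demo_cpumask *online)
{
	static struct demo_cpumask backtrace_mask;

	memcpy(&backtrace_mask, online, sizeof(backtrace_mask));
	demo_clear_cpu(cpu, &backtrace_mask);
	return &backtrace_mask;
}
```

With these definitions, demo_mask_bytes(8192) is 1024 and
demo_mask_bytes(128) is 16, matching the numbers in the commit message, and
sizeof(struct demo_cpumask) is the 1024 bytes that used to sit in the stack
frame. The memcpy() in demo_all_but() is the extra copy discussed above: it
costs one pass over at most 1K of memory, once, on the lockup path.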