linux-kernel - Re: [PATCH] watchdog/hardlockup: Avoid large stack frames in watchdog_hardlockup

Message-ID: <ZMp+rgDT5jYhi/1p@dhcp22.suse.cz>
Date:   Wed, 2 Aug 2023 18:05:02 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Doug Anderson <dianders@...omium.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Petr Mladek <pmladek@...e.com>,
        kernel test robot <lkp@...el.com>,
        Lecopzer Chen <lecopzer.chen@...iatek.com>,
        Pingfan Liu <kernelfans@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] watchdog/hardlockup: Avoid large stack frames in
 watchdog_hardlockup_check()

On Wed 02-08-23 07:12:29, Doug Anderson wrote:
> Hi,
> 
> On Wed, Aug 2, 2023 at 12:27 AM Michal Hocko <mhocko@...e.com> wrote:
> >
> > On Tue 01-08-23 08:41:49, Doug Anderson wrote:
> > [...]
> > > Ah, I see what you mean. The one issue I have with your solution is
> > > that the ordering of the stack crawls is less ideal in the "dump all"
> > > case when cpu != this_cpu. We really want to see the stack crawl of
> > > the locked up CPU first and _then_ see the stack crawls of other CPUs.
> > > With your solution the locked up CPU will be interspersed with all the
> > > others and will be harder to find in the output (you've got to match
> > > it up with the "Watchdog detected hard LOCKUP on cpu N" message).
> > > While that's probably not a huge deal, it's nicer to make the output
> > > easy to understand for someone trying to parse it...
> >
> > Is it worth wasting memory for this arguably nicer output? Identifying
> > the stack of the locked up CPU is trivial.
> 
> I guess it's debatable, but as someone who has spent time trawling
> through reports generated like this, I'd say "yes", it's super helpful
> in understanding the problem to have the hung CPU first.

Well, I have to admit that most lockdep splats I have dealt with
recently do not come with sysctl_hardlockup_all_cpu_backtrace so I
cannot really judge.
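
To make the ordering under discussion concrete, here is a minimal
userspace sketch, not kernel code. The helper name sketch_dump_order()
is hypothetical, and listing the remaining CPUs in ascending order is
an assumption made only for illustration; the real backtraces arrive
in whatever order the CPUs respond.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: write the order in which CPUs
 * would be reported into out[] (capacity: nr_cpus entries), with the
 * hung CPU first and every other CPU after it.
 * Returns the number of entries written.
 */
static size_t sketch_dump_order(unsigned int hung_cpu, unsigned int nr_cpus,
				unsigned int *out)
{
	size_t n = 0;
	unsigned int cpu;

	assert(out != NULL && hung_cpu < nr_cpus);

	/* The locked up CPU goes first so it is easy to find in the log. */
	out[n++] = hung_cpu;

	/* Everyone else follows; the hung CPU is skipped to avoid a repeat. */
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu != hung_cpu)
			out[n++] = cpu;
	}
	return n;
}
```

Skipping the hung CPU in the second pass is the part that needs some
record of which CPU to exclude, which is where the memory question in
this thread comes from.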

> Putting the memory usage in perspective:
> 
> * On a kernel built with a more normal number of max CPUs, like 256,
> this is only a use of 32 bytes of memory. That's 8 CPU instructions
> worth of memory.

Think of distribution kernels that many people use. E.g. the SLES kernel
uses 8k for CONFIG_NR_CPUS.

> * Even on a system with the largest number of max CPUs we currently
> allow (8192), this is only a use of 1024 bytes of memory. Sure, that's
> a big chunk, but this is also something on our largest systems.

This is independent of the size of the machine if you are using
pre-built kernels.
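
The 32-byte and 1024-byte figures quoted above follow from one bit per
possible CPU. A minimal userspace sketch of that arithmetic, assuming
the mask is a plain bitmap rounded up to whole unsigned longs
(sketch_cpumask_bytes() is a hypothetical name, not a kernel function):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build platform. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper, not a kernel API: bytes needed for a bitmap with
 * one bit per possible CPU, rounded up to whole unsigned longs.
 */
static size_t sketch_cpumask_bytes(size_t nr_cpus)
{
	size_t longs;

	assert(nr_cpus > 0);
	longs = (nr_cpus + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
	return longs * sizeof(unsigned long);
}
```

The result depends only on the compile-time maximum passed in, not on
how many CPUs the machine actually has, which is the point about
pre-built kernels.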

> In any case, how about this. We only need the memory allocated if
> `sysctl_hardlockup_all_cpu_backtrace` is non-zero. I can hook in
> whenever that's changed (should be just at bootup) and then kmalloc
> memory then.

This is certainly better than the original proposal.

> This really limits the extra memory to just cases when
> it's useful. Presumably on systems that are designed to run massively
> SMP they wouldn't want to turn this knob on anyway since it would spew
> far too much data. If you took a kernel compiled for max SMP, ran it
> on a machine with only a few cores, and wanted this feature turned on
> then at most you'd be chewing up 1K. In the average case this would
> chew up some extra memory (extra CPU instructions to implement the
> function take code space, extra overhead around kmalloc) but it would
> avoid the 1K chunk in most cases.
> 
> Does that sound reasonable?

If having the locked up CPU first is a real requirement (and this seems
debatable) then sure, why not. I do not feel strongly enough to argue one
way or the other; maybe others have an opinion on that.
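
For readers following along, the allocate-on-demand idea quoted above
can be sketched in userspace roughly as follows. This is an
illustration under stated assumptions, not the proposed patch: struct
sketch_watchdog and sketch_set_all_cpu_backtrace() are hypothetical
names, calloc()/free() stand in for kmalloc()/kfree(), and a real
implementation would run from the sysctl handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state; the int field models sysctl_hardlockup_all_cpu_backtrace. */
struct sketch_watchdog {
	int all_cpu_backtrace;
	unsigned long *backtrace_mask;	/* NULL while the knob is 0 */
	size_t mask_bytes;
};

/*
 * Hypothetical handler for a change of the knob: allocate the mask when
 * the knob becomes non-zero, release it when the knob returns to zero.
 * Returns 0 on success, -1 if the allocation fails (knob left unchanged).
 */
static int sketch_set_all_cpu_backtrace(struct sketch_watchdog *wd, int value,
					size_t mask_bytes)
{
	assert(wd != NULL);

	if (value && !wd->backtrace_mask) {
		unsigned long *mask = calloc(1, mask_bytes);

		if (!mask)
			return -1;
		wd->backtrace_mask = mask;
		wd->mask_bytes = mask_bytes;
	} else if (!value && wd->backtrace_mask) {
		free(wd->backtrace_mask);
		wd->backtrace_mask = NULL;
		wd->mask_bytes = 0;
	}
	wd->all_cpu_backtrace = value;
	return 0;
}
```

With the knob left at 0 nothing is allocated, so a kernel built with a
large CONFIG_NR_CPUS pays for the mask only when the feature is
switched on.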

> I guess my last alternative would be to keep the special case of
> tracing the hung CPU first (this is the most important part IMO) and
> then accept the double trace, AKA:

That sounds wrong.
 
> /* Try to avoid re-dumping the stack on the hung CPU if possible */
> if (cpu == this_cpu)
>   trigger_allbutself_cpu_backtrace();
> else
>   trigger_all_cpu_backtrace();
> 
> -Doug

-- 
Michal Hocko
SUSE Labs