Message-ID: <CAP-5=fXTFHcCE8pf5qgEf1AVODs2+r+_nDUOiWgdQeEgUBHzfA@mail.gmail.com>
Date: Tue, 7 Oct 2025 15:45:07 -0700
From: Ian Rogers <irogers@...gle.com>
To: Doug Anderson <dianders@...omium.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Jinchao Wang <wangjinchao600@...il.com>,
Namhyung Kim <namhyung@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>, Yunhui Cui <cuiyunhui@...edance.com>, akpm@...ux-foundation.org,
catalin.marinas@....com, maddy@...ux.ibm.com, mpe@...erman.id.au,
npiggin@...il.com, christophe.leroy@...roup.eu, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
acme@...nel.org, mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...nel.org, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
kees@...nel.org, masahiroy@...nel.org, aliceryhl@...gle.com, ojeda@...nel.org,
thomas.weissschuh@...utronix.de, xur@...gle.com, ruanjinjie@...wei.com,
gshan@...hat.com, maz@...nel.org, suzuki.poulose@....com,
zhanjie9@...ilicon.com, yangyicong@...ilicon.com, gautam@...ux.ibm.com,
arnd@...db.de, zhao.xichao@...o.com, rppt@...nel.org, lihuafei1@...wei.com,
coxu@...hat.com, jpoimboe@...nel.org, yaozhenguo1@...il.com,
luogengkun@...weicloud.com, max.kellermann@...os.com, tj@...nel.org,
yury.norov@...il.com, thorsten.blum@...ux.dev, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, linux-perf-users@...r.kernel.org
Subject: Re: [RFC PATCH V1] watchdog: Add boot-time selection for hard lockup detector
On Tue, Oct 7, 2025 at 2:43 PM Doug Anderson <dianders@...omium.org> wrote:
...
> The buddy watchdog was pretty much following the conventions that were
> already in the code: that the hardlockup detector (whether backed by
> perf or not) was essentially called the "nmi watchdog". There were a
> number of people that were involved in reviews and I don't believe
> creating a whole different mechanism for enabling /
> disabling the buddy watchdog was ever suggested.
I suspect they lacked the context that a 1 in the nmi_watchdog file is
taken to mean there's a perf event in use by the kernel, with
implications for how group events behave. This behavior has been user
visible/advertised for 9 years. I don't doubt that there were good
intentions behind PowerPC's watchdog and the buddy watchdog patches in
using the file, but that use will lead to spurious warnings and
behaviors from perf.
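To make the assumption concrete, here is a minimal sketch of the kind of check a tool might do. It is not perf's actual code: the function names are hypothetical, and only the procfs path is real. It shows why a 1 written by a watchdog that is not perf-backed would mislead a reader of the file.

```python
from pathlib import Path

# Real procfs path on Linux; everything else here is a hypothetical illustration.
NMI_WATCHDOG = Path("/proc/sys/kernel/nmi_watchdog")


def watchdog_file_enabled(path: Path = NMI_WATCHDOG) -> bool:
    """Return True if the file holds a non-zero integer, False otherwise."""
    try:
        return int(path.read_text().strip()) != 0
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents: treat as off.
        return False


def tool_assumption(path: Path = NMI_WATCHDOG) -> str:
    """Spell out what a tool that trusts the file would conclude."""
    if watchdog_file_enabled(path):
        # This conclusion only holds if the watchdog is actually perf-backed.
        return "assume the kernel holds a perf event"
    return "assume no perf event is held by the watchdog"


if __name__ == "__main__":
    print(tool_assumption())
```

With a buddy (or other non-perf) watchdog reporting 1 in the same file, the first branch is taken even though no perf event is in use.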
My points remain:
1) using multiple files regresses perf's performance;
2) the file name by its meaning is wrong;
3) old perf tools on new kernels won't behave as expected wrt warnings
and metrics because the meaning of the file has changed.
Using a separate file for each watchdog resolves this. It seems that
there wasn't enough critical mass for getting this right to have
mattered before, but that doesn't mean we shouldn't get it right now.
Thanks,
Ian