Message-ID: <15015345-3068-2fb8-aa38-f32acf27e1d0@igalia.com>
Date: Mon, 4 Mar 2024 18:43:05 -0300
From: "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To: John Ogness <john.ogness@...utronix.de>,
Jocelyn Falempe <jfalempe@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Daniel Vetter <daniel@...ll.ch>, Andrew Morton <akpm@...ux-foundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Kefeng Wang <wangkefeng.wang@...wei.com>, Lukas Wunner <lukas@...ner.de>,
Uros Bizjak <ubizjak@...il.com>, Petr Mladek <pmladek@...e.com>,
Daniel Thompson <daniel.thompson@...aro.org>,
Douglas Anderson <dianders@...omium.org>,
"Michael Kelley (LINUX)" <mikelley@...rosoft.com>
Cc: "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
David Airlie <airlied@...hat.com>, Thomas Zimmermann <tzimmermann@...e.de>
Subject: Re: [RFC] How to test panic handlers, without crashing the kernel
On 04/03/2024 18:12, John Ogness wrote:
> [...]
>> The second question is how to simulate a panic context in a
>> non-destructive way, so we can test the panic notifiers in CI, without
>> crashing the machine.
>
> I'm wondering if a "fake panic" can be implemented that quiesces all the
> other CPUs via NMI (similar to kdb) and then calls the panic
> notifiers. And finally releases everything back to normal. That might
> produce a fairly realistic panic situation and should be fairly
> non-destructive (depending on what the notifiers do and how long they
> take).
>
Hi Jocelyn / John,
one concern here is that the panic notifiers are kind of a no man's
land, so we can have very simple / safe ones, while others are
destructive in nature.
An example of a well-behaved notifier that is destructive is the
Hyper-V one, that destroys an essential host-guest interface (called
"vmbus connection"). What happens if we trigger this one just for
testing purposes in a debugfs interface? Likely the guest would die...
[+CCing Michael Kelley here since he seems interested in panic and is
also an expert in Hyper-V, just in case my example is bogus.]
So, maybe the problem could be split in two: the non-notifier portion of
the panic path, and the notifiers themselves. Restricting the notifiers
you run might be a way to circumvent the risks: if you could pass a
list of the specific notifiers you aim to test, this could be
interesting. Let's see what the others think, and thanks for your work on
the DRM panic notifier =)
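
To make the "list of specific notifiers" idea a bit more concrete, here is
a minimal user-space sketch. It is only a model, not kernel code: the
struct, the function names, the notifier labels ("drm_panic", "hyperv")
and the stub callbacks are all hypothetical, and the real panic notifier
chain does not carry names like this. It just shows a walker that skips
every notifier that was not explicitly requested, so a destructive one
never runs during a test.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space model of a notifier chain entry.
 * This is NOT the kernel's struct notifier_block; the "name" field
 * is an assumption added so that tests can select notifiers.
 */
struct fake_notifier {
	const char *name;         /* label matched against the allow-list */
	int (*call)(void *data);  /* callback, returns 0 on success */
};

/* Side effects recorded by the stub callbacks below. */
static int screen_draws;     /* how many times the "safe" stub ran */
static int vmbus_torn_down;  /* set if the "destructive" stub ran */

/* Stub standing in for a harmless notifier (e.g. drawing a panic screen). */
static int fake_drm_panic(void *data)
{
	(void)data;
	screen_draws++;
	return 0;
}

/* Stub standing in for a destructive notifier (tears down a host link). */
static int fake_hyperv_panic(void *data)
{
	(void)data;
	vmbus_torn_down = 1;
	return 0;
}

/* Return 1 if name appears in the allow-list, 0 otherwise. */
static int is_allowed(const char *name,
		      const char *const *allow, size_t n_allow)
{
	for (size_t i = 0; i < n_allow; i++)
		if (strcmp(name, allow[i]) == 0)
			return 1;
	return 0;
}

/*
 * Walk the chain in order and call only the allow-listed notifiers.
 * Returns the number of notifiers that were actually called.
 */
static size_t run_selected_notifiers(const struct fake_notifier *chain,
				     size_t n_chain,
				     const char *const *allow,
				     size_t n_allow, void *data)
{
	size_t ran = 0;

	for (size_t i = 0; i < n_chain; i++) {
		if (!is_allowed(chain[i].name, allow, n_allow))
			continue; /* not requested: never touch it */
		chain[i].call(data);
		ran++;
	}
	return ran;
}
```

With a chain holding both stubs and an allow-list containing only
"drm_panic", the walker calls the first stub and leaves the second one
alone. An empty allow-list runs nothing, which seems like the safer
default for a test interface.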
Cheers,
Guilherme