linux-kernel - Re: [PATCH V4] notifier/panic: Introduce panic_notifier

Message-ID: <20220130085038.GC29425@MiWiFi-R3L-srv>
Date:   Sun, 30 Jan 2022 16:50:38 +0800
From:   Baoquan He <bhe@...hat.com>
To:     Petr Mladek <pmladek@...e.com>
Cc:     "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        HATAYAMA Daisuke <d.hatayama@...fujitsu.com>,
        kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
        dyoung@...hat.com, linux-doc@...r.kernel.org, vgoyal@...hat.com,
        stern@...land.harvard.edu, akpm@...ux-foundation.org,
        andriy.shevchenko@...ux.intel.com, corbet@....net,
        halves@...onical.com, kernel@...ccoli.net,
        Will Deacon <will@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        John Ogness <john.ogness@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Juergen Gross <jgross@...e.com>, mikelley@...rosoft.com
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/26/22 at 02:20pm, Petr Mladek wrote:
> On Wed 2022-01-26 11:10:39, Baoquan He wrote:
> > On 01/24/22 at 11:48am, Guilherme G. Piccoli wrote:
> > > On 24/01/2022 10:59, Baoquan He wrote:
> > > > [...]
> > > > About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> > > > will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> > > > is specified in kernel cmdline. 
> > > 
> > > Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
> > > responsible for really *non-intrusive/non-risky* tasks and should be
> > > always executed in the panic path (before kdump), regardless of
> > > "crash_kexec_post_notifiers".
> > > 
> > > The idea is that the majority of the notifiers would be executed in the
> > > post_dump portion, and for that, we have the
> > > "crash_kexec_post_notifiers" conditional. I also suggest we have
> > > blacklist options (based on function names) for both notifiers, in order
> > > to make kdump issues debug easier.
> > > 
> > > Do you agree with that? Feel free to comment with suggestions!
> > > Cheers,
> > 
> > I would say "please NO" cautiously.
> > 
> > As Petr said, kdump mostly works only if people configure it correctly.
> > That's because we try our best to switch to the kdump kernel from the fragile
> > panicked kernel immediately. When we try to add anything before the switch,
> > please consider carefully whether the addition is mandatory, i.e. whether
> > switching into the kdump kernel may fail without it. If the answer is yes, the
> > addition is needed and welcome. Otherwise, any unnecessary action, including
> > any "non-intrusive/non-risky" task, would be unwelcome.
> 
> I still do not have the complete picture. But it seems that some
> actions always make sense, even for kdump:
> 
>     + Super safe operations that might disable churn from a broken
>       system. For example, disabling watchdogs by setting a single
>       variable, see the rcu_panic() notifier
> 
>     + Actions needed to kexec the crash kernel in a safe
>       way under some hypervisor, see
>       https://lore.kernel.org/r/MWHPR21MB15933573F5C81C5250BF6A1CD75E9@MWHPR21MB1593.namprd21.prod.outlook.com

Yes, I agree with this after going through the discussion threads again.
There is a lot of room to improve panic_notifier, and with these discussions
and some clarification, now might be a good time to do it.

> 
> 
> > Surely, we don't oppose adding "non-intrusive/non-risky" or even completely
> > "intrusive/risky" actions before switching to the kdump kernel, as long as
> > they are guarded by a condition. When we handle customers' kdump support, we
> > explicitly declare that we only support normal, default kdump operation.
> > If any action done before switching into the kdump kernel is specified,
> > e.g. panic_notifier or panic_print, they need to take care of their own kdump
> > failures.
> 
> All this actually started because of kmsg_dump. It might make sense to
> allow both kmsg_dump and kdump together. The messages stored by
> kmsg_dump might be better than nothing when kdump fails.

I think this can be done later, after the panic notifiers are combed
through and tidied up.

> 
> It actually seems to be the main motivation to introduce
> "crash_kexec_post_notifier" parameter, see the commit
> f06e5153f4ae2e2f3b03 ("kernel/panic.c: add "crash_kexec_post_notifiers"
> option for kdump after panic_notifers").

From discussion with Hitachi and FJ engineers, they use
crash_kexec_post_notifiers when the 1st kernel has panicked and the kdump
kernel is not stable enough to function. In that case, the information
captured on a best-effort basis after the panic in the 1st kernel can help
analyze what happened in the 1st kernel, and might also give a hint about
why the kdump kernel is unstable. But they will not add
crash_kexec_post_notifiers to the kernel cmdline by default. An unstable
kdump kernel rarely happens, and needs to be debugged and investigated.

> 
> And this patch introduces panic_notifier_filter that tries to select
> notifiers that are helpful and harmful. IMHO, it is almost unusable.
> It seems that even kernel developers do not understand what exactly
> some notifiers do and why they are needed. Usually only the author
> and people familiar with the subsystem have some idea. It makes
> it pretty hard for anyone to create a reasonable filter.

Then people can select the notifiers they know about to execute. E.g. for
Hyper-V, they can run the related notifiers. And as HATAYAMA mentioned,
they have the same expectation. This kind of filter is not for people to
pick and run blindly, but for professionals.

I think the chaos comes from the notifiers not being reviewed as a whole.
People can freely register one when they need it. The priority and the
impact on the whole panic path are not considered by subsystem developers.
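
To make the filtering idea concrete, here is a minimal userspace sketch of a
priority-ordered notifier chain that skips callbacks named in a
comma-separated filter string. It is only an illustration under stated
assumptions: struct notifier, struct chain, chain_register(), chain_call(),
filter_matches() and the demo_* callbacks are hypothetical names made up for
this example. It is not the code from the patch, and the real kernel
notifier API looks different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NOTIFIERS 16

/* Hypothetical userspace stand-in for a kernel notifier block. */
struct notifier {
	const char *name;	/* callback name, matched by the filter */
	int priority;		/* higher priority runs first */
	void (*fn)(void);
};

struct chain {
	struct notifier entries[MAX_NOTIFIERS];
	size_t count;
};

/* Insert a callback, keeping the chain sorted by descending priority. */
static void chain_register(struct chain *c, const char *name, int priority,
			   void (*fn)(void))
{
	size_t i;

	assert(c->count < MAX_NOTIFIERS);
	i = c->count;
	while (i > 0 && c->entries[i - 1].priority < priority) {
		c->entries[i] = c->entries[i - 1];
		i--;
	}
	c->entries[i].name = name;
	c->entries[i].priority = priority;
	c->entries[i].fn = fn;
	c->count++;
}

/* Return 1 if name is one of the comma-separated entries in filter. */
static int filter_matches(const char *filter, const char *name)
{
	size_t len = strlen(name);
	const char *p = filter;

	while (p && *p) {
		const char *end = strchr(p, ',');
		size_t tok = end ? (size_t)(end - p) : strlen(p);

		if (tok == len && strncmp(p, name, len) == 0)
			return 1;
		p = end ? end + 1 : NULL;
	}
	return 0;
}

/* Run every callback not named in filter; return how many were run. */
static size_t chain_call(const struct chain *c, const char *filter)
{
	size_t i, ran = 0;

	for (i = 0; i < c->count; i++) {
		if (filter && filter_matches(filter, c->entries[i].name))
			continue;
		c->entries[i].fn();
		ran++;
	}
	return ran;
}

/* Demo callbacks (made up): each one only sets a single variable. */
static int watchdog_disabled;
static int hv_reported;

static void demo_disable_watchdog(void) { watchdog_disabled = 1; }
static void demo_hv_report(void) { hv_reported = 1; }
```

Registering both demo callbacks and calling chain_call() with the filter
"demo_hv_report" runs only demo_disable_watchdog(). Whoever writes the filter
has to know what each named callback does, which is why this is a tool for
people familiar with the notifiers involved.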

> 
> I am pretty sure that we could do better. I propose to add more
> notifier lists that will be called at various places with reasonable
> rules and restrictions. Then the subsystem maintainers could decide
> where exactly a given action must be done.

Agree that we can do something to improve.
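
As a rough illustration of what separate lists could mean for ordering, here
is a small userspace sketch of one possible reading of the pre_dump/post_dump
split discussed earlier in the thread. The list names come from the thread;
panic_path(), enum panic_step and the exact ordering rules are assumptions
made for this example, not an agreed design and not existing kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three stages modelled here; names are illustrative only. */
enum panic_step { STEP_PRE_DUMP, STEP_POST_DUMP, STEP_KDUMP };

#define MAX_STEPS 3

/*
 * Record the order in which the stages would run.
 * post_notifiers models the "crash_kexec_post_notifiers" cmdline option.
 * Returns the number of entries written to steps[].
 */
static size_t panic_path(bool post_notifiers, bool crash_kernel_loaded,
			 enum panic_step steps[MAX_STEPS])
{
	size_t n = 0;

	assert(steps != NULL);

	/* Small, low-risk list: assumed to always run first. */
	steps[n++] = STEP_PRE_DUMP;

	/* Default kdump: switch to the crash kernel as early as possible. */
	if (!post_notifiers && crash_kernel_loaded) {
		steps[n++] = STEP_KDUMP;
		return n;	/* a successful kexec never returns */
	}

	/* Remaining notifiers run before kdump only on explicit request. */
	steps[n++] = STEP_POST_DUMP;

	if (crash_kernel_loaded)
		steps[n++] = STEP_KDUMP;

	return n;
}
```

With a crash kernel loaded and the option unset, this model gives pre_dump
followed directly by kdump. With the option set, post_dump runs in between,
which is the case where the extra risk before the switch is accepted
explicitly by the user.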