linux-kernel - Re: [PATCH V4] notifier/panic: Introduce panic_notifier

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yel8WQiBn/HNQN83@alley>
Date:   Thu, 20 Jan 2022 16:14:33 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     "Guilherme G. Piccoli" <gpiccoli@...lia.com>
Cc:     kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
        dyoung@...hat.com, linux-doc@...r.kernel.org, bhe@...hat.com,
        vgoyal@...hat.com, stern@...land.harvard.edu,
        akpm@...ux-foundation.org, andriy.shevchenko@...ux.intel.com,
        corbet@....net, halves@...onical.com, kernel@...ccoli.net,
        Will Deacon <will@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        HATAYAMA Daisuke <d.hatayama@...fujitsu.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        John Ogness <john.ogness@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Juergen Gross <jgross@...e.com>
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

Adding some more people into Cc. Some modified the logic in the past.
Some are familiar with some interesting areas where the panic
notfiers are used.

On Sat 2022-01-08 12:34:51, Guilherme G. Piccoli wrote:
> The kernel notifier infrastructure allows function callbacks to be
> added in multiple lists, which are then called in the proper time,
> like in a reboot or panic event. The panic_notifier_list specifically
> contains the callbacks that are executed during a panic event. As any
> other notifier list, the panic one has no filtering and all functions
> previously registered are executed.
> 
> The kdump infrastructure, on the other hand, enables users to set
> a crash kernel that is kexec'ed in a panic event, and vmcore/logs
> are collected in such crash kernel. When kdump is set, by default
> the panic notifiers are ignored - the kexec jumps to the crash kernel
> before the list is checked and callbacks executed.
> 
> There are some cases though in which kdump users might want to
> allow panic notifier callbacks to execute _before_ the kexec to
> the crash kernel, for a variety of reasons - for example, users
> may think kexec is very prone to fail and want to give a chance
> to kmsg dumpers to run (and save logs using pstore),

Yes, this seems to be original intention for the
"crash_kexec_post_notifiers" option, see the commit
f06e5153f4ae2e2f3b0300f ("kernel/panic.c: add
"crash_kexec_post_notifiers" option for kdump after panic_notifiers")


> some panic notifier is required to properly quiesce some hardware
> that must be used to the crash kernel.

Do you have any example, please? The above mentioned commit
says "crash_kexec_post_notifiers" actually increases risk
of kdump failure.

Note that kmsg_dump() is called after the notifiers only because
some are printing more information, see the commit
6723734cdff15211bb78a ("panic: call panic handlers before kmsg_dump").
They might still increase the change that kmsg_dump() will never
be called.


> But there's a problem: currently it's an "all-or-nothing" situation,
> the kdump user choice is either to execute all panic notifiers or
> none of them. Given that panic notifiers may increase the risk of a
> kdump failure, this is a tough decision and may affect the debug of
> hard to reproduce bugs, if for some reason the user choice is to
> enable panic notifiers, but kdump then fails.
>
> So, this patch aims to ease this decision: we hereby introduce a filter
> for the panic notifier list, in which users may select specifically
> which callbacks they wish to run, allowing a safer kdump. The allowlist
> should be provided using the parameter "panic_notifier_filter=a,b,..."
> where a, b are valid callback names. Invalid symbols are discarded.

I am afraid that this is almost unusable solution:

   + requires deep knowledge of what each notifier does
   + might need debugging what notifier causes problems
   + the list might need to be updated when new notifiers are added
   + function names are implementation detail and might change
   + requires kallsyms


It is only workaround for a real problem. The problem is that
"panic_notifier_list" is used for many purposes that break
each other.

I checked some notifiers and found few groups:

   + disable watchdogs:
      + hung_task_panic()
      + rcu_panic()

   + dump information:
      + kernel_offset_notifier()
      + trace_panic_handler()     (duplicate of panic_print=0x10)

   + inform hypervisor
      + xen_panic_event()
      + pvpanic_panic_notify()
      + hyperv_panic_event()

   + misc cleanup / flush / blinking
      + panic_event()   in ipmi_msghandler.c
      + panic_happened()   in heartbeat.c
      + led_trigger_panic_notifier()


IMHO, the right solution is to split the callbacks into 2 or more
notifier list. Then we might rework panic() to do:

void panic(void)
{
	[...]

	/* stop watchdogs + extra info */
	atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
	atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
	panic_print_sys_info();

	/* crash_kexec + kmsg_dump in configurable order */
	if (!_crash_kexec_post_kmsg_dump) {
		__crash_kexec(NULL);
		smp_send_stop();
	} else {
		crash_smp_send_stop();
	}

	kmsg_dump();
	if (_crash_kexec_post_kmsg_dump)
		__crash_kexec(NULL);

	/* infinite loop or reboot */
	atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
	atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);

	console_flush_on_panic(CONSOLE_FLUSH_PENDING);

	if (panic_timeout >= 0) {
		timeout();
		emergency_restart();
	}

	for (i = 0; ; i += PANIC_TIMER_STEP) {
		if (i >= i_next) {
			i += panic_blink(state ^= 1);
			i_next = i + 3600 / PANIC_BLINK_SPD;
		}
		mdelay(PANIC_TIMER_STEP);
	}
}

Two notifier lists might be enough in the above scenario. I would call
them:

	panic_pre_dump_notifier_list
	panic_post_dump_notifier_list


It is a real solution that will help everyone. It is more complicated now
but it will makes things much easier in the long term. And it might be done
step by step:

     1. introduce the two notifier lists
     2. convert all users: one by one
     3. remove the original notifier list when there is no user

Best Regards,
Petr