linux-kernel - Re: [PATCH] panic, kexec: Don't mutex_trylock() in __crash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YqyHGxPCgiXuep3/@alley>
Date:   Fri, 17 Jun 2022 15:52:27 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Valentin Schneider <vschneid@...hat.com>
Cc:     Tao Zhou <tao.zhou@...ux.dev>, linux-kernel@...r.kernel.org,
        kexec@...ts.infradead.org, linux-rt-users@...r.kernel.org,
        Eric Biederman <ebiederm@...ssion.com>,
        Arnd Bergmann <arnd@...db.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Juri Lelli <jlelli@...hat.com>,
        "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>
Subject: Re: [PATCH] panic, kexec: Don't mutex_trylock() in __crash_kexec()

On Fri 2022-06-17 12:52:05, Valentin Schneider wrote:
> Hi Tao,
> 
> On 17/06/22 18:42, Tao Zhou wrote:
> > Hi Valentin,
> >
> > On Thu, Jun 16, 2022 at 01:37:09PM +0100, Valentin Schneider wrote:
> >> @@ -964,24 +966,31 @@ late_initcall(kexec_core_sysctl_init);
> >>   */
> >>  void __noclone __crash_kexec(struct pt_regs *regs)
> >>  {
> >> -	/* Take the kexec_mutex here to prevent sys_kexec_load
> >> -	 * running on one cpu from replacing the crash kernel
> >> -	 * we are using after a panic on a different cpu.
> >> +	/*
> >> +	 * This should be taking kexec_mutex before doing anything with the
> >> +	 * kexec_crash_image, but this code can be run in NMI context which
> >> +	 * means we can't even trylock.
> >>  	 *
> >> -	 * If the crash kernel was not located in a fixed area
> >> -	 * of memory the xchg(&kexec_crash_image) would be
> >> -	 * sufficient.  But since I reuse the memory...
> >> +	 * Pairs with smp_mb() in do_kexec_load() and sys_kexec_file_load()
> >>  	 */
> >> -	if (mutex_trylock(&kexec_mutex)) {
> >> -		if (kexec_crash_image) {
> >> -			struct pt_regs fixed_regs;
> >> -
> >> -			crash_setup_regs(&fixed_regs, regs);
> >> -			crash_save_vmcoreinfo();
> >> -			machine_crash_shutdown(&fixed_regs);
> >> -			machine_kexec(kexec_crash_image);
> >> -		}
> >> -		mutex_unlock(&kexec_mutex);
> >> +	WRITE_ONCE(panic_wants_kexec, true);
> >> +	smp_mb();
> >> +	/*
> >> +	 * If we're panic'ing while someone else is messing with the crash
> >> +	 * kernel, this isn't going to end well.
> >> +	 */
> >> +	if (READ_ONCE(kexec_loading)) {
> >> +		WRITE_ONCE(panic_wants_kexec, false);
> >> +		return;
> >> +	}
> >
> > So this is from NMI. The mutex guarantee that kexec_file_load() or 
> > do_kexec_load() just one of them beat on cpu. NMI can happen on more
> > than one cpu. That means that here be cumulativity here IMHO.
> >
> 
> If you look at __crash_kexec() in isolation yes, but if you look at panic()
> and nmi_panic() only a single NMI can get in there. I think that is also
> true for invocations via crash_kexec().

It is true that panic() could be called only once, see this code
in panic():

	 * Only one CPU is allowed to execute the panic code from here. For
	this_cpu = raw_smp_processor_id();
	old_cpu  = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);

	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
		panic_smp_self_stop();


One the other hand, the aproach with two variables is strage
and brings exactly these questions.

If a trylock is enough that the mutex can be replaced by two
simple atomic operations. The mutex would be needed only
when a user really would need to wait for another one.


	atomic_t crash_kexec_lock;

	/* trylock part */
	if (atomic_cmpxchg_acquire(&crash_kexec_lock, 0, 1) != 0)
		return -EBUSY;

	/* do anything guarded by crash_kexec_lock */

	/* release lock */
	atomic_set_release(&crash_kexec_lock, 0);

The _acquire, _release variants will do the barriers correctly.

Best Regards,
Petr