lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250410210507.GD15280@redhat.com>
Date: Thu, 10 Apr 2025 23:05:08 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Tze-nan Wu <Tze-nan.Wu@...iatek.com>
Cc: Christian Brauner <brauner@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	wsd_upstream@...iatek.com, bobule.chang@...iatek.com,
	Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	chenqiwu <chenqiwu@...omi.com>, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org,
	linux-mediatek@...ts.infradead.org
Subject: Re: [RFC PATCH] exit: Skip panic in do_exit() during poweroff

Well...

Let me repeat. I don't understand the kernel/reboot.c paths, you can
safely ignore me.

But I still think that you target the wrong goal. Quite possibly I am
wrong.

On 04/10, Tze-nan Wu wrote:
>
> If PID 1 exits due to the unreliable userspace after kernel_power_off()
> invoked,

Why. Why the global init does do_exit()? It should not, that is all.
It doesn't matter if it is single threaded or not.

As for sys_reboot(), I think that kernel_power_off() must be __noreturn,
and sys_reboot() should use BUG() after LINUX_REBOOT_CMD_POWER_OFF/_HALT
instead of do_exit().

If nothing else. do_exit() also does debug_check_no_locks_held() and
sys_reboot() calls do_exit() with system_transition_mutex held.

IOW. IMO, it is not that do_exit() needs some changes. The very fact
that the global init does do_exit() is wrong, this should be fixed.

But again, again, I can't really comment.

Oleg.

> the panic follow by the last thread of global init exited in
> do_exit() will stop the kernel_power_off() procedure, turn a shutdown
> behavior into panic flow(reboot).
>
> Add a condition check to ensure that the panic triggered by the last
> thread of the global init exiting, only occurs while:
> ( system_state != SYSTEM_POWER_OFF and system_state != SYSTEM_RESTART).
> Otherwise, WARN() instead.
>
> [On Android 16 with arm64 arch]
> Here's a scenario where the global init exits during kernel_power_off:
> If PID 1 encounters a page fault after kernel_power_off() has been
> invoked, the kernel will fail to handle the page fault because the
> disk(UFS) has already shut down.
> Consequently, the kernel will send a SIGBUS to PID 1 to indicate the
> page fault failure, and ultimately, the panic will occur after PID 1
> exits due to receiving the SIGBUS.
>
>             cpu1                           cpu2
>           ----------                     ----------
>     kernel_power_off() start
>         UFS shutdown
>             ...                	       PID 1 page fault
>             ...                    page fault handle failure
>             ...			             PID 1 received SIGBUS
>             ...                             panic
>    kernel_power_off() not done
>
> Backtrace while PID 1 received signal 7:
>    init-1 [007] d..1 41239.922385: \
>       signal_generate: sig=7 errno=0 code=2 comm=init pid=1 grp=0 res=0
>    init-1 [007] d..1 41239.922389: kernel_stack: <stack trace>
>    => __send_signal_locked
>    => send_signal_locked
>    => force_sig_info_to_task
>    => force_sig_fault
>    => arm64_force_sig_fault
>    => do_page_fault
>    => do_translation_fault
>    => do_mem_abort
>    => el0_ia
>    => el0t_64_sync_handler
>
> Simplified kernel log:
> kernel_power_off() invoked by pt_notify_thread.
> [41239.526109] pt_notify_threa: reboot set flag, old value 0x********,
> *.
> [41239.526114] pt_notify_threa: reboot set flag new value 0x********.
> UFS reject I/O after kerenl_power_off.
> [41239.686411]  scsi +scsi******** apexd: sd* ******** rejecting I/O to
> offline device.
> Lots of I/O error & erofs error happened after kernel_power_off().
> [41239.690312] apexd: I/O error, dev sdc, sector ******* op ***:(READ)
> flags 0x**** phys_seg ** prio class 0.
> [41239.690465] apexd: I/O error, dev sdc, sector ******* op ***:(READ)
> flags 0x**** phys_seg ** prio class 0.
> ...
> ...
> [41239.922265] init: erofs: (device ****): z_erofs_read_folio: read
> error * @ *** of nid ********.
> [41239.922341] init: erofs: (device ****): z_erofs_read_folio: read
> error * @ *** of nid ********.
> Finally device panic due to PID 1 received SIGBUS.
> [41239.923789] init: Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x00000007
>
> Fixes: 43cf75d96409 ("exit: panic before exit_mm() on global init exit")
> Link: https://lore.kernel.org/all/20191219104223.xvk6ppfogoxrgmw6@wittgenstein/
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@...iatek.com>
> ---
>
> I am also wondering if this patch is reasonable?
>
> From my perspective, there are two reasons not to trigger such panic
> during kernel_power_off() or kernel_restart():
>   1. It is not worthwhile to interrupt kernel_power_off() by a panic
>      resulted from userspace instability.
>   2. The panic in do_exit() was originally designed to ensure a usable
>      coredump if the last thread of the global init process exited.
> 	 However, capture a coredump triggered by userspace crash after
>      kernel_power_off() seems not particularly useful, in my opinion.
>
> In certain scenarios, a kernel module may need to directly power off
> from kernel space to protect hardware (e.g., thermal protection).
> In my opinion, rather than causing a panic during kernel_power_off(),
> it sounds better to allow the device to complete its power-off process.
>
> Appreciate for any comment on this, if there's any better way to
> handle this panic, please point me out.
>
> ---
>  kernel/exit.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 1dcddfe537ee..23cb6b42a1f1 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -901,11 +901,17 @@ void __noreturn do_exit(long code)
>  	if (group_dead) {
>  		/*
>  		 * If the last thread of global init has exited, panic
> -		 * immediately to get a useable coredump.
> +		 * immediately to get a usable coredump, except when the
> +		 * device is currently powering off or restarting.
>  		 */
> -		if (unlikely(is_global_init(tsk)))
> -			panic("Attempted to kill init! exitcode=0x%08x\n",
> -				tsk->signal->group_exit_code ?: (int)code);
> +		if (unlikely(is_global_init(tsk))) {
> +			if (system_state != SYSTEM_POWER_OFF &&
> +			    system_state != SYSTEM_RESTART)
> +				panic("Attempted to kill init! exitcode=0x%08x\n",
> +				      tsk->signal->group_exit_code ?: (int)code);
> +			WARN(1, "Attempted to kill init! exitcode=0x%08x\n",
> +			     tsk->signal->group_exit_code ?: (int)code);
> +		}
>
>  #ifdef CONFIG_POSIX_TIMERS
>  		hrtimer_cancel(&tsk->signal->real_timer);
> --
> 2.45.2
>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ