Message-ID: <082A11A1-404B-4FFE-B6DC-64A543CA61AD@infradead.org>
Date: Mon, 16 Dec 2024 17:41:27 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Thomas Gleixner <tglx@...utronix.de>, Peter Zijlstra <peterz@...radead.org>
CC: Stefan Hajnoczi <stefanha@...hat.com>, Jason Wang <jasowang@...hat.com>,
 "x86@...nel.org" <x86@...nel.org>, hpa <hpa@...or.com>,
 dyoung <dyoung@...hat.com>, kexec <kexec@...ts.infradead.org>,
 linux-ext4 <linux-ext4@...r.kernel.org>,
 "Michael S. Tsirkin" <mst@...hat.com>,
 Stefano Garzarella <sgarzare@...hat.com>, eperezma <eperezma@...hat.com>,
 Paolo Bonzini <bonzini@...hat.com>, ming.lei@...hat.com,
 Petr Mladek <pmladek@...e.com>, John Ogness <jogness@...utronix.de>
Subject: Re: [PATCH] sched: Prevent rescheduling when interrupts are disabled

On 16 December 2024 13:20:56 GMT, Thomas Gleixner <tglx@...utronix.de> wrote:
>David reported a warning observed while loop testing kexec jump:
>
>  Interrupts enabled after irqrouter_resume+0x0/0x50
>  WARNING: CPU: 0 PID: 560 at drivers/base/syscore.c:103 syscore_resume+0x18a/0x220
>   kernel_kexec+0xf6/0x180
>   __do_sys_reboot+0x206/0x250
>   do_syscall_64+0x95/0x180
>
>The corresponding interrupt flag trace:
>
>  hardirqs last  enabled at (15573): [<ffffffffa8281b8e>] __up_console_sem+0x7e/0x90
>  hardirqs last disabled at (15580): [<ffffffffa8281b73>] __up_console_sem+0x63/0x90
>
>That means __up_console_sem() was invoked with interrupts enabled. Further
>instrumentation revealed that in the interrupt disabled section of kexec
>jump one of the syscore_suspend() callbacks woke up a task, which set the
>NEED_RESCHED flag. A later callback in the resume path invoked
>cond_resched() which in turn led to the invocation of the scheduler:
>
>  __cond_resched+0x21/0x60
>  down_timeout+0x18/0x60
>  acpi_os_wait_semaphore+0x4c/0x80
>  acpi_ut_acquire_mutex+0x3d/0x100
>  acpi_ns_get_node+0x27/0x60
>  acpi_ns_evaluate+0x1cb/0x2d0
>  acpi_rs_set_srs_method_data+0x156/0x190
>  acpi_pci_link_set+0x11c/0x290
>  irqrouter_resume+0x54/0x60
>  syscore_resume+0x6a/0x200
>  kernel_kexec+0x145/0x1c0
>  __do_sys_reboot+0xeb/0x240
>  do_syscall_64+0x95/0x180
>
>This is a long standing problem, which probably got more visible with
>the recent printk changes. Something does a task wakeup and the
>scheduler sets the NEED_RESCHED flag. cond_resched() sees it set and
>invokes schedule() from a completely bogus context. The scheduler
>enables interrupts after context switching, which causes the above
>warning at the end.
>
>Quite a few of the code paths in syscore_suspend()/resume() can trigger
>a wakeup with exactly the same consequences. They might not
>have done so yet, but as they share a lot of code with normal operations
>it's just a question of time.
>
>The problem only affects the PREEMPT_NONE and PREEMPT_VOLUNTARY scheduling
>models. Full preemption is not affected as cond_resched() is disabled and
>the preemption check preemptible() takes the interrupt disabled flag into
>account.
>
>Cure the problem by adding a corresponding check into cond_resched().
>
>Reported-by: David Woodhouse <dwmw2@...radead.org>
>Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>Tested-by: David Woodhouse <dwmw2@...radead.org>
>Cc: stable@...r.kernel.org
>Closes: https://lore.kernel.org/all/7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@infradead.org
>---
> kernel/sched/core.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>--- a/kernel/sched/core.c
>+++ b/kernel/sched/core.c
>@@ -7276,7 +7276,7 @@ void rt_mutex_setprio(struct task_struct
> #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
> int __sched __cond_resched(void)
> {
>-	if (should_resched(0)) {
>+	if (should_resched(0) && !irqs_disabled()) {
> 		preempt_schedule_common();
> 		return 1;
> 	}

Thank you. Slight preference for dwmw@...zon.co.uk as the Reported-by and Tested-by addresses if it's not too late. I'm assuming you will handle this and don't want me to round it up with the kexec bits.
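
For readers following the thread, the effect of the one-line change in the quoted patch can be sketched in user space. This is only an illustrative model under stated assumptions, not kernel code: every `sim_*` name below is hypothetical, the two booleans stand in for the NEED_RESCHED flag and the CPU interrupt state, and `sim_schedule()` merely imitates the scheduler returning with interrupts enabled after a context switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state; none of this is kernel API. */
static bool sim_need_resched;   /* models the NEED_RESCHED flag */
static bool sim_irqs_off;       /* models "interrupts are disabled" */
static int  sim_schedule_calls; /* counts how often the model scheduler ran */

/* Models the scheduler: it returns with interrupts enabled. */
static void sim_schedule(void)
{
	sim_schedule_calls++;
	sim_need_resched = false;
	sim_irqs_off = false;
}

/* Models the pre-patch check: only the reschedule flag is consulted. */
static int sim_cond_resched_old(void)
{
	if (sim_need_resched) {
		sim_schedule();
		return 1;
	}
	return 0;
}

/* Models the post-patch check: do nothing while interrupts are disabled. */
static int sim_cond_resched_new(void)
{
	if (sim_need_resched && !sim_irqs_off) {
		sim_schedule();
		return 1;
	}
	return 0;
}

/*
 * Replays the reported sequence: an interrupt-disabled section, a wakeup
 * that sets the flag, then a later callback that reaches cond_resched().
 * Returns true if interrupts are still disabled afterwards.
 */
static bool sim_irqs_stay_off(int (*cond_resched_fn)(void))
{
	assert(cond_resched_fn);
	sim_irqs_off = true;
	sim_need_resched = true;
	cond_resched_fn();
	return sim_irqs_off;
}
```

In this model, `sim_irqs_stay_off(sim_cond_resched_old)` reports that interrupts ended up enabled, which corresponds to the syscore_resume() warning, while `sim_irqs_stay_off(sim_cond_resched_new)` leaves both the interrupt state and the pending reschedule flag untouched.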
