Message-ID: <20260113140234.677117-1-clm@meta.com>
Date: Tue, 13 Jan 2026 06:02:27 -0800
From: Chris Mason <clm@...a.com>
To: Breno Leitao <leitao@...ian.org>
CC: Chris Mason <clm@...a.com>, Alexander Potapenko <glider@...gle.com>,
"Marco Elver" <elver@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
<kasan-dev@...glegroups.com>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, <kernel-team@...a.com>,
<stable@...r.kernel.org>
Subject: Re: [PATCH v2] mm/kfence: add reboot notifier to disable KFENCE on shutdown
On Thu, 27 Nov 2025 06:51:54 -0800 Breno Leitao <leitao@...ian.org> wrote:
> During system shutdown, KFENCE can cause IPI synchronization issues if
> it remains active through the reboot process. To prevent this, register
> a reboot notifier that disables KFENCE and cancels any pending timer
> work early in the shutdown sequence.
>

Hi Breno,

I'm running v6.19 MM commits through the patch review automation and this
one was flagged. Is this deadlock possible?

Automated review output:

> @@ -820,6 +821,25 @@ static struct notifier_block kfence_check_canary_notifier = {
> static struct delayed_work kfence_timer;
>
> #ifdef CONFIG_KFENCE_STATIC_KEYS
> +static int kfence_reboot_callback(struct notifier_block *nb,
> +				  unsigned long action, void *data)
> +{
> +	/*
> +	 * Disable kfence to avoid static keys IPI synchronization during
> +	 * late shutdown/kexec
> +	 */
> +	WRITE_ONCE(kfence_enabled, false);
> +	/* Cancel any pending timer work */
> +	cancel_delayed_work_sync(&kfence_timer);
        ^^^^^^^^^^^^^^^^^^^^^^^^

Can cancel_delayed_work_sync() deadlock here?

If toggle_allocation_gate() is currently executing and blocked inside
wait_event_idle() (waiting for kfence_allocation_gate > 0), then
cancel_delayed_work_sync() will block forever waiting for the work to
complete.

The wait_event_idle() condition depends only on allocations occurring
to increment kfence_allocation_gate - setting kfence_enabled to false
does not wake up this wait. During shutdown when allocations may have
stopped, the work item could remain blocked indefinitely, causing the
reboot notifier to hang.

The call chain is:

  kfence_reboot_callback()
    -> cancel_delayed_work_sync(&kfence_timer)
      -> __flush_work()
        -> wait_for_completion(&barr.done)
           // waits forever because...

  toggle_allocation_gate() [currently running]
    -> wait_event_idle(allocation_wait, kfence_allocation_gate > 0)
       // never wakes up if no allocations happen

Would it be safer to use cancel_delayed_work() (non-sync) here, or add
a mechanism to wake up the wait_event_idle() when kfence_enabled becomes
false?
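
To make the second option concrete, here is a rough userspace sketch using
pthreads. It is not kernel code and not a proposed patch. The names
toggle_worker(), shutdown_demo(), allocation_gate and enabled are hypothetical
stand-ins for toggle_allocation_gate(), the reboot callback,
kfence_allocation_gate and kfence_enabled. The idea it illustrates: the
waiter's condition also checks the enabled flag, and the shutdown path wakes
the waiter after clearing the flag, so the synchronous wait that follows can
finish even when no allocation ever opens the gate.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the KFENCE state; not kernel code. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t allocation_wait = PTHREAD_COND_INITIALIZER;
static int allocation_gate;	/* stands in for kfence_allocation_gate */
static bool enabled = true;	/* stands in for kfence_enabled */

/*
 * Stand-in for the work item: sleep until the gate opens OR the
 * feature is disabled. The second half of the condition is the part
 * the review question asks about.
 */
static void *toggle_worker(void *arg)
{
	bool *saw_disable = arg;

	assert(saw_disable != NULL);

	pthread_mutex_lock(&lock);
	while (allocation_gate <= 0 && enabled)
		pthread_cond_wait(&allocation_wait, &lock);
	*saw_disable = !enabled;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/*
 * Stand-in for the reboot callback: clear the flag, wake the waiter,
 * then wait synchronously for the worker to finish. Nothing in this
 * demo ever increments allocation_gate, which models a shutdown where
 * allocations have stopped. Returns 0 if the worker exited because it
 * observed the disable, -1 otherwise.
 */
int shutdown_demo(void)
{
	pthread_t worker;
	bool saw_disable = false;

	if (pthread_create(&worker, NULL, toggle_worker, &saw_disable))
		return -1;

	pthread_mutex_lock(&lock);
	enabled = false;
	pthread_cond_broadcast(&allocation_wait);	/* the explicit wake-up */
	pthread_mutex_unlock(&lock);

	/* Loosely analogous to the _sync cancel: blocks until the worker is done. */
	if (pthread_join(worker, NULL))
		return -1;

	return saw_disable ? 0 : -1;
}
```

Calling shutdown_demo() from a small main() should return 0 regardless of
whether the worker reaches its wait before or after the flag is cleared, since
the flag is tested under the same lock that protects the wake-up. Without the
enabled check in the loop condition, or without the broadcast, the join could
block forever in the same way as the call chain above.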