[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z-B6ob0zLZr81e8i@gmail.com>
Date: Sun, 23 Mar 2025 22:18:25 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Fernando Fernandez Mancera <ffmancera@...eup.net>
Cc: x86@...nel.org, tglx@...utronix.de, linux-kernel@...r.kernel.org,
dwmw@...zon.co.uk, mhkelley@...look.com
Subject: Re: [PATCH] x86/i8253: fix possible deadlock when turning off the PIT
* Fernando Fernandez Mancera <ffmancera@...eup.net> wrote:
> As the PIT could be disabled during the init, it can possibly cause a
> deadlock when resuming from suspend due to a lock dependency created at
> pcspkr_event(). This new dependency connect a HARDIRQ-irq-safe to a
> HARDIRQ-irq-unsafe.
>
> Solve this warning by using the raw_spin_lock_irqsave() variant when
> turning off the PIT.
>
> [ 45.408952] =====================================================
> [ 45.408970] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
> [ 45.408974] 6.14.0-rc7+ #6 Not tainted
> [ 45.408978] -----------------------------------------------------
> [ 45.408980] systemd-sleep/3324 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [ 45.408986] ffffffffb2c23398 (i8253_lock){+.+.}-{2:2}, at: pcspkr_event+0x3f/0xe0 [pcspkr]
> [ 45.409004]
> and this task is already holding:
> [ 45.409006] ffff9c334d7c2230 (&dev->event_lock){-.-.}-{3:3}, at: input_dev_resume+0x21/0x50
> [ 45.409023] which would create a new lock dependency:
> [ 45.409025] (&dev->event_lock){-.-.}-{3:3} -> (i8253_lock){+.+.}-{2:2}
> [ 45.409043]
> but this new dependency connects a HARDIRQ-irq-safe lock:
> [ 45.409045] (&dev->event_lock){-.-.}-{3:3}
> [ 45.409052]
> ... which became HARDIRQ-irq-safe at:
> [ 45.409055] lock_acquire+0xd0/0x2f0
> [ 45.409062] _raw_spin_lock_irqsave+0x48/0x70
> [ 45.409067] input_event+0x3c/0x80
> [ 45.409071] atkbd_receive_byte+0x9b/0x6e0
> [ 45.409077] ps2_interrupt+0xb2/0x1d0
> [ 45.409082] serio_interrupt+0x4a/0x90
> [ 45.409087] i8042_handle_data+0xf8/0x280
> [ 45.409091] i8042_interrupt+0x11/0x40
> [ 45.409095] __handle_irq_event_percpu+0x87/0x260
> [ 45.409100] handle_irq_event+0x38/0x90
> [ 45.409105] handle_edge_irq+0x8b/0x230
> [ 45.409109] __common_interrupt+0x5c/0x120
> [ 45.409114] common_interrupt+0x80/0xa0
> [ 45.409120] asm_common_interrupt+0x26/0x40
> [ 45.409125] pv_native_safe_halt+0xf/0x20
> [ 45.409130] default_idle+0x9/0x20
> [ 45.409135] default_idle_call+0x7a/0x1d0
> [ 45.409140] do_idle+0x215/0x260
> [ 45.409144] cpu_startup_entry+0x29/0x30
> [ 45.409149] start_secondary+0x132/0x170
> [ 45.409153] common_startup_64+0x13e/0x141
> [ 45.409158]
> to a HARDIRQ-irq-unsafe lock:
> [ 45.409161] (i8253_lock){+.+.}-{2:2}
> [ 45.409167]
> ... which became HARDIRQ-irq-unsafe at:
> [ 45.409170] ...
> [ 45.409172] lock_acquire+0xd0/0x2f0
> [ 45.409177] _raw_spin_lock+0x30/0x40
> [ 45.409181] clockevent_i8253_disable+0x1c/0x60
> [ 45.409186] pit_timer_init+0x25/0x50
> [ 45.409191] hpet_time_init+0x46/0x50
> [ 45.409196] x86_late_time_init+0x1b/0x40
> [ 45.409201] start_kernel+0x962/0xa00
> [ 45.409206] x86_64_start_reservations+0x24/0x30
> [ 45.409211] x86_64_start_kernel+0xed/0xf0
> [ 45.409215] common_startup_64+0x13e/0x141
> [ 45.409220]
> other info that might help us debug this:
>
> [ 45.409222] Possible interrupt unsafe locking scenario:
>
> [ 45.409224] CPU0 CPU1
> [ 45.409226] ---- ----
> [ 45.409228] lock(i8253_lock);
> [ 45.409234] local_irq_disable();
> [ 45.409237] lock(&dev->event_lock);
> [ 45.409243] lock(i8253_lock);
> [ 45.409249] <Interrupt>
> [ 45.409251] lock(&dev->event_lock);
> [ 45.409257]
> *** DEADLOCK ***
>
> Fixes: 70e6b7d9ae3c ("x86/i8253: Disable PIT timer 0 when not in use")
> Signed-off-by: Fernando Fernandez Mancera <ffmancera@...eup.net>
> ---
> drivers/clocksource/i8253.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clocksource/i8253.c b/drivers/clocksource/i8253.c
> index 39f7c2d736d1..9a91ce66e16e 100644
> --- a/drivers/clocksource/i8253.c
> +++ b/drivers/clocksource/i8253.c
> @@ -103,7 +103,9 @@ int __init clocksource_i8253_init(void)
> #ifdef CONFIG_CLKEVT_I8253
> void clockevent_i8253_disable(void)
> {
> - raw_spin_lock(&i8253_lock);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&i8253_lock, flags);
>
> /*
> * Writing the MODE register should stop the counter, according to
> @@ -133,7 +135,7 @@ void clockevent_i8253_disable(void)
>
> outb_p(0x30, PIT_MODE);
>
> - raw_spin_unlock(&i8253_lock);
> + raw_spin_unlock_irqrestore(&i8253_lock, flags);
> }
That's a nice one - and in general the clockevent shutdown/disable
handlers are called with IRQs off: hpet_time_init()->pit_timer_init()
being the odd one out post-70e6b7d9ae3c that introduced this bug.
So the real bug is that hpet_time_init() just blindly assumes that
clockevent_i8253_disable() is IRQ-safe, which it isn't and never was.
So the slightly better fix would be to save-disable interrupts in
pit_timer_init() around the clockevent_i8253_disable() call. This would
avoid the IRQ disable/enable dance for all the other callers. (Not that
it really matters for the PIT driver, but it's the principle.)
Thanks,
Ingo
Powered by blists - more mailing lists