Message-ID: <87o6p4bgzn.ffs@tglx>
Date: Fri, 14 Nov 2025 19:34:36 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Enlin Mu <enlin.mu@...ux.dev>, anna-maria@...utronix.de,
frederic@...nel.org, linux-kernel@...r.kernel.org, enlin.mu@...soc.com,
enlin.mu@...ux.dev
Subject: Re: [PATCH V2] hrtimer: Check running timer state
On Fri, Nov 14 2025 at 19:42, Enlin Mu wrote:
> When the running timer is not NULL, print debugging information.
What for?
> The test code is roughly as follows:
>
> static struct hrtimer serial_timer;
> enum hrtimer_restart serial_timer_handler(struct hrtimer * timer)
> {
> local_irq_disable();
What is this for?
> ......
> do_something();
> copy_data_with_dma();
> ......
> hrtimer_forward_now(&serial_timer, ns_to_ktime(1000*2000));
> local_irq_enable();
And this?
> return HRTIMER_RESTART;
> }
>
> static int serial_start(struct uart_port *port)
> {
> ......
> ......
> hrtime_init(&serial_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
That function does not exist.
> ktime = ns_to_ktime(1000*2000);
> serial_timer.function = serial_timer_handler;
> hrtimer_start(&serial_timer, ktime, HRTIMER_MODE_REL);
> ......
> return 0;
> }
>
> static void serial_shutdown(struct uart_port *port)
> {
> ......
> hrtimer_cancel(&serial_timer);
> ......
> serial_release_dma(port);
> ......
> }
> CPU6 canceled serial_timer and released the DMA, but the
> serial_timer still ran many times on CPU7 until a panic occurred.
> The reason for the panic is that serial_timer accessed the
> released DMA, even though the serial_timer had been canceled
> on CPU6 some time before.
You still fail to explain how the timer can still run after being
canceled.
> CPU6 can successfully cancel the serial_timer because the
> running timer has changed and is now another timer (such as
> hrtimer_usb).
After that the timer _cannot_ be running anymore unless some other code
re-arms it afterwards.
> When the serial_timer callback enables interrupts, the next hrtimer
> (such as hrtimer_usb) on CPU7 preempts the return of the serial_timer,
> causing a change in the running timer.
Then fix your timer callback. The callback is invoked in hard interrupt
context and the callback enables interrupts, which is a NONO. You
clearly never ran your code with lockdep enabled. It would have told you
so.
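To make the failure mode concrete, here is a minimal single-threaded userspace sketch of it. This is a toy model, not kernel code: every name in it (toy_timer, toy_running, toy_try_cancel, toy_run, simulate) is hypothetical, and it only assumes what the thread describes, namely that cancellation relies on a per-CPU "running timer" pointer and that a nested timer interrupt changes that pointer when the callback enables interrupts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names are kernel APIs. */
struct toy_timer;
typedef bool (*toy_fn)(struct toy_timer *t);  /* true means "restart" */

struct toy_timer {
	toy_fn fn;
	bool enqueued;
};

static struct toy_timer *toy_running;  /* stands in for the per-CPU running timer */
static struct toy_timer serial, usb;
static bool irqs_enabled_in_callback;
static bool dma_released;

/* Returns -1 if the callback is in flight, else 1/0 for was/was not queued. */
static int toy_try_cancel(struct toy_timer *t)
{
	if (toy_running == t)
		return -1;
	bool was = t->enqueued;
	t->enqueued = false;
	return was ? 1 : 0;
}

/* Expire one timer: mark it running, call it, re-queue on restart. */
static void toy_run(struct toy_timer *t)
{
	toy_running = t;
	t->enqueued = false;
	if (t->fn(t))
		t->enqueued = true;
	toy_running = NULL;
}

/* The "other CPU": release the DMA only once cancel reports success. */
static void remote_shutdown_attempt(void)
{
	if (toy_try_cancel(&serial) >= 0)
		dma_released = true;
}

static bool usb_fn(struct toy_timer *t)
{
	(void)t;
	return false;
}

static bool serial_fn(struct toy_timer *t)
{
	(void)t;
	/* Enabling interrupts lets another timer expire inside this callback. */
	if (irqs_enabled_in_callback)
		toy_run(&usb);  /* leaves toy_running == NULL */
	/* The shutdown path races with the callback that is still in flight. */
	remote_shutdown_attempt();
	return true;  /* restart, like HRTIMER_RESTART */
}

/* Returns 1 if the timer is still armed after the DMA was released. */
int simulate(bool enable_irqs)
{
	serial = (struct toy_timer){ .fn = serial_fn, .enqueued = true };
	usb = (struct toy_timer){ .fn = usb_fn, .enqueued = true };
	toy_running = NULL;
	dma_released = false;
	irqs_enabled_in_callback = enable_irqs;

	toy_run(&serial);
	if (!dma_released)  /* cancel had to wait, so retry after the callback */
		remote_shutdown_attempt();

	return serial.enqueued && dma_released;
}
```

In this model simulate(true) returns 1: the nested expiry clears the running pointer, the cancel succeeds while the callback is still executing, and the restart re-queues the timer after the DMA is gone. simulate(false) returns 0, because the cancel sees the callback in flight, waits, and then dequeues the re-armed timer before the DMA is released.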
> Signed-off-by: Enlin Mu <enlin.mu@...ux.dev>
> Signed-off-by: Enlin Mu <enlin.mu@...soc.com>
> Signed-off-by: Enlin Mu <enlin.mu@...ux.dev>
Interesting Signed-off-by chain. Seems you're co-developing this patch
with your alter ego.
Thanks,
tglx