Message-ID: <alpine.LFD.2.02.1210261147000.2756@ionos>
Date: Fri, 26 Oct 2012 11:55:24 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: "he, bo" <bo.he@...el.com>
cc: linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, yanmin_zhang@...ux.intel.com,
yanmin.zhang@...el.com
Subject: Re: [PATCH] hrtimer:__run_hrtimer races with enqueue_hrtimer
On Fri, 26 Oct 2012, he, bo wrote:
> From: Yanmin Zhang <yanmin.zhang@...el.com>
>
> We hit a kernel panic at __run_hrtimer=>BUG_ON(timer->state != HRTIMER_STATE_CALLBACK).
> <2>[ 10.226053, 3] kernel BUG at /home/android/xiaobing/ymz/r4/hardware/intel/linux-2.6/kernel/hrtimer.c:1228!
>
> Basically, __run_hrtimer races with enqueue_hrtimer. While
> __run_hrtimer is calling the timer callback fn, another thread might
> call enqueue_hrtimer or hrtimer_start to requeue the timer, so
> timer->state becomes HRTIMER_STATE_CALLBACK|HRTIMER_STATE_ENQUEUED,
> which causes the BUG_ON(timer->state != HRTIMER_STATE_CALLBACK) check
> to fire.
>
> The patch fixes it by checking only bit HRTIMER_STATE_CALLBACK.

This does not fix it. It makes it worse.

> Signed-off-by: Yanmin Zhang <yanmin.zhang@...el.com>
> Reviewed-by: He, Bo <bo.he@...el.com>
> ---
> kernel/hrtimer.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 6db7a5e..6280184 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1235,7 +1235,7 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
>  	 * hrtimer_start_range_ns() or in hrtimer_interrupt()
>  	 */
>  	if (restart != HRTIMER_NORESTART) {
> -		BUG_ON(timer->state != HRTIMER_STATE_CALLBACK);
> +		BUG_ON(!(timer->state & HRTIMER_STATE_CALLBACK));
>  		enqueue_hrtimer(timer, base);
>  	}

What you are allowing here is enqueueing an already enqueued timer
again. I don't know why this does not explode elsewhere, but that's
probably pure luck. It's not allowed to double enqueue a timer.

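To make the difference between the two checks concrete, here is a small
userspace sketch, not kernel code. The state bit values are assumed to mirror
include/linux/hrtimer.h of that era, and the helper names (strict_bug_on,
relaxed_bug_on, allows_double_enqueue) are hypothetical, made up for this
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mirror of the hrtimer state bits. The values are an assumption
 * for illustration; the real definitions live in include/linux/hrtimer.h. */
#define HRTIMER_STATE_INACTIVE 0x00
#define HRTIMER_STATE_ENQUEUED 0x01
#define HRTIMER_STATE_CALLBACK 0x02

/* True when the original check, BUG_ON(state != CALLBACK), would fire. */
static bool strict_bug_on(unsigned long state)
{
	return state != HRTIMER_STATE_CALLBACK;
}

/* True when the proposed check, BUG_ON(!(state & CALLBACK)), would fire. */
static bool relaxed_bug_on(unsigned long state)
{
	return !(state & HRTIMER_STATE_CALLBACK);
}

/* Hypothetical helper: does the restart path get past the given check and
 * reach enqueue_hrtimer() for a timer that is already queued? */
static bool allows_double_enqueue(bool (*bug_on)(unsigned long),
				  unsigned long state)
{
	return !bug_on(state) && (state & HRTIMER_STATE_ENQUEUED);
}

/* Usage: the raced state trips the strict check but slips past the
 * relaxed one, which then leads straight into a second enqueue. */
static void demo(void)
{
	unsigned long raced = HRTIMER_STATE_CALLBACK | HRTIMER_STATE_ENQUEUED;

	assert(strict_bug_on(raced));
	assert(!relaxed_bug_on(raced));
	assert(allows_double_enqueue(relaxed_bug_on, raced));
}
```

In this model the relaxed check only hides the symptom: the state that used
to trigger the BUG_ON now proceeds to enqueue a timer that is already queued.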
So no, this is not a solution. The problem is not in the core timer
code, the problem is in the code which uses that timer.

Your code is returning HRTIMER_RESTART from the timer callback and at
the same time it starts the timer from some other context. That's what
needs to be fixed.

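The ownership problem can be sketched with a toy userspace model. This is an
assumption-laden illustration, not the kernel implementation: struct
toy_timer, toy_start(), toy_finish_callback() and toy_run() are hypothetical
stand-ins for the timer, hrtimer_start(), the tail of __run_hrtimer() and one
expiry respectively.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_restart { TOY_NORESTART, TOY_RESTART };

#define TOY_ENQUEUED 0x01
#define TOY_CALLBACK 0x02

struct toy_timer {
	unsigned long state;
	int queued;	/* how many times the timer sits in the queue */
};

/* Models another context starting the timer: dequeue if queued, then
 * enqueue, which is roughly what a start-style call does. */
static void toy_start(struct toy_timer *t)
{
	if (t->state & TOY_ENQUEUED) {
		t->state &= ~TOY_ENQUEUED;
		t->queued--;
	}
	t->state |= TOY_ENQUEUED;
	t->queued++;
}

/* Models the code that runs after the callback has returned.
 * Returns false where the strict BUG_ON would fire. */
static bool toy_finish_callback(struct toy_timer *t, enum toy_restart ret)
{
	if (ret != TOY_NORESTART) {
		if (t->state != TOY_CALLBACK)
			return false;	/* BUG_ON would fire here */
		t->state |= TOY_ENQUEUED;
		t->queued++;
	}
	t->state &= ~TOY_CALLBACK;
	return true;
}

/* One expiry: the timer is dequeued and its callback is running; another
 * context may start it before the callback returns 'ret'. */
static bool toy_run(struct toy_timer *t, bool started_elsewhere,
		    enum toy_restart ret)
{
	t->state = TOY_CALLBACK;
	t->queued = 0;
	if (started_elsewhere)
		toy_start(t);
	return toy_finish_callback(t, ret);
}

/* Usage: two owners of the restart collide; a single owner does not. */
static void toy_demo(void)
{
	struct toy_timer t;

	assert(!toy_run(&t, true, TOY_RESTART));   /* both rearm: BUG */
	assert(toy_run(&t, true, TOY_NORESTART));  /* other context owns it */
	assert(toy_run(&t, false, TOY_RESTART));   /* callback owns it */
}
```

Under these assumptions, either way of picking a single owner for the restart
leaves the timer queued exactly once; only the combination of the two hits
the check.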
Thanks,
tglx