linux-kernel - Re: [PATCH] sched: tg_set_cfs_bandwidth() causes rq->lock deadlock


Message-ID: <20140520140155.GS13658@twins.programming.kicks-ass.net>
Date:	Tue, 20 May 2014 16:01:55 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	bsegall@...gle.com
Cc:	Roman Gushchin <klamm@...dex-team.ru>,
	linux-kernel@...r.kernel.org, pjt@...gle.com,
	chris.j.arges@...onical.com, gregkh@...uxfoundation.org,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] sched: tg_set_cfs_bandwidth() causes rq->lock deadlock

On Tue, May 20, 2014 at 03:15:26PM +0200, Peter Zijlstra wrote:
> Which leads us to what I think is a BUG in the current hrtimer code (and
> one wonders why we never hit that), because we drop the cpu_base->lock
> over calling hrtimer::function, hrtimer_start_range_ns() can in fact
> come in and (re)enqueue the timer, if hrtimer::function then returns
> HRTIMER_RESTART, we'll hit that BUG_ON() before trying to enqueue the
> timer once more.
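
[Illustration added to the archived message, not part of the original mail.]
The race described above can be modelled in a few lines of single-threaded
C. This is only a sketch: the model_* names are hypothetical, the state bit
values are copied from the kernel's HRTIMER_STATE_* flags, and the concurrent
hrtimer_start_range_ns() call is simulated by having the callback itself
enqueue the timer. It reports whether the old BUG_ON() condition would have
fired, and guards the second enqueue the way the quoted patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model-only state bits (values mirror HRTIMER_STATE_*). */
enum { MODEL_INACTIVE = 0x00, MODEL_ENQUEUED = 0x01, MODEL_CALLBACK = 0x02 };
enum model_restart { MODEL_NORESTART, MODEL_RESTART };

/* Hypothetical stand-in for struct hrtimer. */
struct model_timer {
	unsigned int state;
	int enqueue_count;	/* how many times the timer was enqueued */
	enum model_restart (*function)(struct model_timer *);
};

static void model_enqueue(struct model_timer *t)
{
	t->state |= MODEL_ENQUEUED;
	t->enqueue_count++;
}

/* The old check: BUG_ON(timer->state != HRTIMER_STATE_CALLBACK). */
static bool model_old_check_would_bug(const struct model_timer *t)
{
	return t->state != MODEL_CALLBACK;
}

/*
 * Model of the callback path. Returns true if the old BUG_ON() would
 * have triggered; the enqueue itself uses the guarded condition.
 */
static bool model_run(struct model_timer *t)
{
	bool would_bug = false;
	enum model_restart restart;

	assert(t->function != NULL);
	t->state = MODEL_CALLBACK;	/* dequeued; the real code drops the lock here */
	restart = t->function(t);

	if (restart != MODEL_NORESTART) {
		would_bug = model_old_check_would_bug(t);
		if (!(t->state & MODEL_ENQUEUED))
			model_enqueue(t);
	}
	t->state &= ~MODEL_CALLBACK;
	return would_bug;
}

/* Callback with no interference: just asks to be restarted. */
static enum model_restart cb_plain(struct model_timer *t)
{
	(void)t;
	return MODEL_RESTART;
}

/* Callback during which "someone" re-arms the timer before it returns. */
static enum model_restart cb_rearmed(struct model_timer *t)
{
	model_enqueue(t);	/* stands in for a racing hrtimer_start_range_ns() */
	return MODEL_RESTART;
}
```

With cb_plain the old check passes and the timer is enqueued once; with
cb_rearmed the old check would have hit the BUG_ON(), while the guarded
version still ends up with exactly one enqueue.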

> ---
>  kernel/hrtimer.c     |  9 ++++++---
>  kernel/sched/core.c  | 10 ++++++----
>  kernel/sched/fair.c  | 42 +++---------------------------------------
>  kernel/sched/sched.h |  2 +-
>  4 files changed, 16 insertions(+), 47 deletions(-)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 3ab28993f6e0..28942c65635e 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1273,11 +1273,14 @@ static void __run_hrtimer(struct hrtimer *timer, ktime_t *now)
>  	 * Note: We clear the CALLBACK bit after enqueue_hrtimer and
>  	 * we do not reprogramm the event hardware. Happens either in
>  	 * hrtimer_start_range_ns() or in hrtimer_interrupt()
> +	 *
> +	 * Note: Because we dropped the cpu_base->lock above,
> +	 * hrtimer_start_range_ns() can have popped in and enqueued the timer
> +	 * for us already.
>  	 */
> -	if (restart != HRTIMER_NORESTART) {
> -		BUG_ON(timer->state != HRTIMER_STATE_CALLBACK);
> +	if (restart != HRTIMER_NORESTART &&
> +	    !(timer->state & HRTIMER_STATE_ENQUEUED))
>  		enqueue_hrtimer(timer, base);
> -	}
>  
>  	WARN_ON_ONCE(!(timer->state & HRTIMER_STATE_CALLBACK));
>  

Hmm, doesn't this also mean it's entirely unsafe to call
hrtimer_forward*() from the timer callback? It might be changing the
expiry time of an already enqueued timer, which would corrupt the
rb-tree order.
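
[Illustration added to the archived message, not part of the original mail.]
The ordering concern can be shown with a much simpler structure than an
rb-tree. The sketch below uses a sorted singly linked list as a stand-in;
all toy_* names are hypothetical and none of this is kernel code. Changing
the key of a node while it is still linked breaks the sort invariant, while
unlinking it first, changing the key, and re-inserting keeps it intact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued timer; not struct hrtimer. */
struct toy_timer {
	long expires;			/* sort key */
	struct toy_timer *next;		/* next timer in expiry order */
};

/* Insert into a list kept sorted by ascending expiry. */
static void toy_enqueue(struct toy_timer **head, struct toy_timer *t)
{
	while (*head && (*head)->expires <= t->expires)
		head = &(*head)->next;
	t->next = *head;
	*head = t;
}

/* Unlink a timer from the list, if it is on it. */
static void toy_dequeue(struct toy_timer **head, struct toy_timer *t)
{
	while (*head && *head != t)
		head = &(*head)->next;
	if (*head)
		*head = t->next;
	t->next = NULL;
}

/* Check the invariant every lookup relies on. */
static bool toy_is_sorted(const struct toy_timer *head)
{
	for (; head && head->next; head = head->next)
		if (head->expires > head->next->expires)
			return false;
	return true;
}

/* Unsafe: moves the expiry of a timer that may still be linked. */
static void toy_forward_in_place(struct toy_timer *t, long interval)
{
	t->expires += interval;
}

/* Safe: unlink, change the key, then insert at the right position. */
static void toy_forward_requeue(struct toy_timer **head,
				struct toy_timer *t, long interval)
{
	toy_dequeue(head, t);
	t->expires += interval;
	toy_enqueue(head, t);
	assert(toy_is_sorted(*head));	/* holds if the list was sorted before */
}
```

The same reasoning applies to any keyed, ordered container: the key of a
linked node has to stay fixed, or the node has to be removed and re-added
around the change.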

Lemme go find a nice way out of this mess, I think I'm responsible for
creating it in the first place :-(
