Message-ID: <20111018182046.GF1309@hostway.ca>
Date: Tue, 18 Oct 2011 11:20:46 -0700
From: Simon Kirby <sim@...tway.ca>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: Linux 3.1-rc9
On Tue, Oct 18, 2011 at 11:05:13AM +0200, Peter Zijlstra wrote:
> Subject: cputimer: Cure lock inversion
> From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Date: Mon Oct 17 11:50:30 CEST 2011
>
> There's a lock inversion between the cputimer->lock and rq->lock; notably
> the two callchains involved are:
>
>  update_rlimit_cpu()
>    sighand->siglock
>    set_process_cpu_timer()
>      cpu_timer_sample_group()
>        thread_group_cputimer()
>          cputimer->lock
>          thread_group_cputime()
>            task_sched_runtime()
>              ->pi_lock
>              rq->lock
>
>  scheduler_tick()
>    rq->lock
>    task_tick_fair()
>      update_curr()
>        account_group_exec_runtime()
>          cputimer->lock
>
> Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
> the second one is keeping it up-to-date.
>
> This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
> SMP accounting oddities").
>
> Cure the problem by removing the cputimer->lock and rq->lock nesting;
> this leaves concurrent enablers doing duplicate work, but the time
> wasted should be of the same order as that otherwise wasted spinning on
> the lock, and the greater-than assignment filter should ensure we
> preserve monotonicity.
>
> Reported-by: Dave Jones <davej@...hat.com>
> Reported-by: Simon Kirby <sim@...tway.ca>
> Cc: stable@...nel.org
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> ---
> kernel/posix-cpu-timers.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
> Index: linux-2.6/kernel/posix-cpu-timers.c
> ===================================================================
> --- linux-2.6.orig/kernel/posix-cpu-timers.c
> +++ linux-2.6/kernel/posix-cpu-timers.c
> @@ -274,9 +274,7 @@ void thread_group_cputimer(struct task_s
>  	struct task_cputime sum;
>  	unsigned long flags;
> 
> -	spin_lock_irqsave(&cputimer->lock, flags);
>  	if (!cputimer->running) {
> -		cputimer->running = 1;
>  		/*
>  		 * The POSIX timer interface allows for absolute time expiry
>  		 * values through the TIMER_ABSTIME flag, therefore we have
> @@ -284,8 +282,11 @@ void thread_group_cputimer(struct task_s
>  		 * it.
>  		 */
>  		thread_group_cputime(tsk, &sum);
> +		spin_lock_irqsave(&cputimer->lock, flags);
> +		cputimer->running = 1;
>  		update_gt_cputime(&cputimer->cputime, &sum);
> -	}
> +	} else
> +		spin_lock_irqsave(&cputimer->lock, flags);
>  	*times = cputimer->cputime;
>  	spin_unlock_irqrestore(&cputimer->lock, flags);
>  }
>
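
For anyone trying to picture the inversion, here is a minimal userspace
sketch; pthread mutexes stand in for the two spinlocks and all of the
names here are made up, but the A-then-B vs. B-then-A ordering matches
the two callchains quoted above:

/*
 * Minimal userspace sketch of the AB-BA inversion described above.
 * lock_a stands in for cputimer->lock and lock_b for rq->lock; the
 * lock and function names are hypothetical.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;	/* "cputimer->lock" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;	/* "rq->lock" */

/* Timer-enable path: A then B, like thread_group_cputimer() -> task_sched_runtime(). */
static void *enable_path(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		/* ... sample the group cputime ... */
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

/* Tick path: B then A, like update_curr() -> account_group_exec_runtime(). */
static void *tick_path(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&lock_b);
		pthread_mutex_lock(&lock_a);
		/* ... accumulate exec runtime ... */
		pthread_mutex_unlock(&lock_a);
		pthread_mutex_unlock(&lock_b);
	}
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, enable_path, NULL);
	pthread_create(&t2, NULL, tick_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	/* With opposite lock ordering in the two threads, this line is
	 * usually never reached -- the threads deadlock against each other. */
	printf("finished without deadlocking (got lucky)\n");
	return 0;
}

Dropping the nesting on the enable path, as the patch does, removes the
A-then-B ordering entirely, so the cycle can no longer form.
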
Tested-by: Simon Kirby <sim@...tway.ca>
Looks good: it has been running on three boxes since this morning (the
unpatched kernel hangs within ~15 minutes).
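
The "greater-than assignment filter" is what makes the duplicate work
harmless, as far as I can tell: each field of the cached group time only
ever moves forward, so a stale sample published by a racing enabler
cannot make the clock go backwards. Roughly like this (a simplified
stand-in, not the kernel's actual update_gt_cputime()):

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel's task_cputime fields. */
struct cputime_sample {
	uint64_t utime;			/* user time */
	uint64_t stime;			/* system time */
	uint64_t sum_exec_runtime;	/* scheduler runtime */
};

/* Advance *field only if the new sample would not move it backwards. */
static void update_gt(uint64_t *field, uint64_t new_val)
{
	if (new_val > *field)
		*field = new_val;
}

/* In the real code this runs under cputimer->lock. */
static void update_gt_cputime_sketch(struct cputime_sample *cached,
				     const struct cputime_sample *sample)
{
	update_gt(&cached->utime, sample->utime);
	update_gt(&cached->stime, sample->stime);
	update_gt(&cached->sum_exec_runtime, sample->sum_exec_runtime);
}

int main(void)
{
	struct cputime_sample cached = { 100, 50, 1000 };
	struct cputime_sample stale  = {  90, 60,  900 };	/* racing, partly stale */

	update_gt_cputime_sketch(&cached, &stale);
	/* utime and sum_exec_runtime keep their newer values; stime advances. */
	printf("utime=%llu stime=%llu runtime=%llu\n",
	       (unsigned long long)cached.utime,
	       (unsigned long long)cached.stime,
	       (unsigned long long)cached.sum_exec_runtime);
	return 0;
}

So even if two enablers compute their sums and publish them in either
order, the cached time stays monotonic.
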
While I have your eyes, does this hang trace (from a hang that happened a
couple of times with your previous patch applied) make any sense?
http://0x.ca/sim/ref/3.1-rc9/3.1-rc9-tcp-lockup.log
I don't see how all CPUs could be spinning on the same lock without
reentry, and I don't see any in the backtraces.
Simon-