Message-ID: <20111018182046.GF1309@hostway.ca>
Date: Tue, 18 Oct 2011 11:20:46 -0700
From: Simon Kirby <sim@...tway.ca>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: Linux 3.1-rc9
On Tue, Oct 18, 2011 at 11:05:13AM +0200, Peter Zijlstra wrote:
> Subject: cputimer: Cure lock inversion
> From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Date: Mon Oct 17 11:50:30 CEST 2011
>
> There's a lock inversion between the cputimer->lock and rq->lock; notably
> the two callchains involved are:
>
>  update_rlimit_cpu()
>    sighand->siglock
>    set_process_cpu_timer()
>      cpu_timer_sample_group()
>        thread_group_cputimer()
>          cputimer->lock
>          thread_group_cputime()
>            task_sched_runtime()
>              ->pi_lock
>              rq->lock
>
>  scheduler_tick()
>    rq->lock
>    task_tick_fair()
>      update_curr()
>        account_group_exec_runtime()
>          cputimer->lock
>
> Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
> the second one is keeping it up-to-date.
>
> This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
> SMP accounting oddities").
>
> Cure the problem by removing the cputimer->lock and rq->lock nesting;
> this leaves concurrent enablers doing duplicate work, but the time
> wasted should be of the same order as that otherwise wasted spinning on
> the lock, and the greater-than assignment filter should ensure we
> preserve monotonicity.
>
> Reported-by: Dave Jones <davej@...hat.com>
> Reported-by: Simon Kirby <sim@...tway.ca>
> Cc: stable@...nel.org
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> ---
> kernel/posix-cpu-timers.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
> Index: linux-2.6/kernel/posix-cpu-timers.c
> ===================================================================
> --- linux-2.6.orig/kernel/posix-cpu-timers.c
> +++ linux-2.6/kernel/posix-cpu-timers.c
> @@ -274,9 +274,7 @@ void thread_group_cputimer(struct task_s
>  	struct task_cputime sum;
>  	unsigned long flags;
> 
> -	spin_lock_irqsave(&cputimer->lock, flags);
>  	if (!cputimer->running) {
> -		cputimer->running = 1;
>  		/*
>  		 * The POSIX timer interface allows for absolute time expiry
>  		 * values through the TIMER_ABSTIME flag, therefore we have
> @@ -284,8 +282,11 @@ void thread_group_cputimer(struct task_s
>  		 * it.
>  		 */
>  		thread_group_cputime(tsk, &sum);
> +		spin_lock_irqsave(&cputimer->lock, flags);
> +		cputimer->running = 1;
>  		update_gt_cputime(&cputimer->cputime, &sum);
> -	}
> +	} else
> +		spin_lock_irqsave(&cputimer->lock, flags);
>  	*times = cputimer->cputime;
>  	spin_unlock_irqrestore(&cputimer->lock, flags);
>  }
>
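
For anyone trying to picture the inversion, here is a minimal userspace
sketch; pthread mutexes stand in for the two spinlocks and all of the
names here are made up, but the A-then-B vs. B-then-A ordering matches
the two callchains quoted above:

/*
 * Minimal userspace sketch of the AB-BA inversion described above.
 * lock_a stands in for cputimer->lock and lock_b for rq->lock; the
 * lock and function names are hypothetical.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;	/* "cputimer->lock" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;	/* "rq->lock" */

/* Timer-enable path: A then B, like thread_group_cputimer() -> task_sched_runtime(). */
static void *enable_path(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		/* ... sample the group cputime ... */
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

/* Tick path: B then A, like update_curr() -> account_group_exec_runtime(). */
static void *tick_path(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&lock_b);
		pthread_mutex_lock(&lock_a);
		/* ... accumulate exec runtime ... */
		pthread_mutex_unlock(&lock_a);
		pthread_mutex_unlock(&lock_b);
	}
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, enable_path, NULL);
	pthread_create(&t2, NULL, tick_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	/* With opposite lock ordering in the two threads, this line is
	 * usually never reached -- the threads deadlock against each other. */
	printf("finished without deadlocking (got lucky)\n");
	return 0;
}

Dropping the nesting on the enable path, as the patch does, removes the
A-then-B ordering entirely, so the cycle can no longer form.
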
Tested-by: Simon Kirby <sim@...tway.ca>
Looks good: it has been running on three boxes since this morning (the
unpatched kernel hangs within ~15 minutes).
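
The "greater-than assignment filter" is what makes the duplicate work
harmless, as far as I can tell: each field of the cached group time only
ever moves forward, so a stale sample published by a racing enabler
cannot make the clock go backwards. Roughly like this (a simplified
stand-in, not the kernel's actual update_gt_cputime()):

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel's task_cputime fields. */
struct cputime_sample {
	uint64_t utime;			/* user time */
	uint64_t stime;			/* system time */
	uint64_t sum_exec_runtime;	/* scheduler runtime */
};

/* Advance *field only if the new sample would not move it backwards. */
static void update_gt(uint64_t *field, uint64_t new_val)
{
	if (new_val > *field)
		*field = new_val;
}

/* In the real code this runs under cputimer->lock. */
static void update_gt_cputime_sketch(struct cputime_sample *cached,
				     const struct cputime_sample *sample)
{
	update_gt(&cached->utime, sample->utime);
	update_gt(&cached->stime, sample->stime);
	update_gt(&cached->sum_exec_runtime, sample->sum_exec_runtime);
}

int main(void)
{
	struct cputime_sample cached = { 100, 50, 1000 };
	struct cputime_sample stale  = {  90, 60,  900 };	/* racing, partly stale */

	update_gt_cputime_sketch(&cached, &stale);
	/* utime and sum_exec_runtime keep their newer values; stime advances. */
	printf("utime=%llu stime=%llu runtime=%llu\n",
	       (unsigned long long)cached.utime,
	       (unsigned long long)cached.stime,
	       (unsigned long long)cached.sum_exec_runtime);
	return 0;
}

So even if two enablers compute their sums and publish them in either
order, the cached time stays monotonic.
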
While I have your eyes, does this hang trace (from a hang that happened a
couple of times with your previous patch applied) make any sense?
http://0x.ca/sim/ref/3.1-rc9/3.1-rc9-tcp-lockup.log
I don't see how all CPUs could be spinning on the same lock without
reentry, and I don't see any in the backtraces.
Simon-