linux-kernel - Re: [patch] hrtimers debug patch

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Date:	Mon, 26 Mar 2007 19:02:25 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Michal Piotrowski <michal.k.k.piotrowski@...il.com>
Cc:	Nick Piggin <npiggin@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Mingming Cao <cmm@...ibm.com>, Adrian Bunk <bunk@...sta.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Mariusz Kozlowski <m.kozlowski@...land.pl>,
	Oliver Pinter <oliver.pntr@...il.com>,
	Sid Boyce <g3vbv@...eyonder.co.uk>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [patch] hrtimers debug patch

* Michal Piotrowski <michal.k.k.piotrowski@...il.com> wrote:

> Stardust is down, console log and config attached.

thanks! I have stared at hrtimer.c a few more hours and the good news is 
that i found a narrow SMP race. The bad news is that i dont think it 
could explain your bug symptoms: the worst-case effect of the race 
should be an incorrect timeout on the current CPU - not a KTIME_MAX 
thing like your logs show.

But maybe i didnt think through the effects of the bug well enough, and 
your box has a HT CPU, with HT CPUs being pretty good at triggering 
narrow SMP races - so maybe we are lucky? Fix attached below. Patch is 
build and boot-tested.

	Ingo

------------------>
Subject: [patch] hrtimers: fix reprogramming SMP race
From: Ingo Molnar <mingo@...e.hu>

hrtimer_start() incorrectly set the 'reprogram' flag to 
enqueue_hrtimer(), which should only be 1 if the hrtimer is queued to 
the current CPU.

doing otherwise could result in a reprogramming of the current CPU's 
clockevents device, with a timer that is not queued to it - resulting in 
a bogus next expiry value.

Signed-off-by: Ingo Molnar <mingo@...e.hu>
Needs-to-be-tested-by: Michal Piotrowski <michal.k.k.piotrowski@...il.com>
---
 kernel/hrtimer.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux/kernel/hrtimer.c
===================================================================
--- linux.orig/kernel/hrtimer.c
+++ linux/kernel/hrtimer.c
@@ -844,7 +844,12 @@ hrtimer_start(struct hrtimer *timer, kti

 	timer_stats_hrtimer_set_start_info(timer);

-	enqueue_hrtimer(timer, new_base, base == new_base);
+	/*
+	 * Only allow reprogramming if the new base is on this CPU.
+	 * (it might still be on another CPU if the timer was pending)
+	 */
+	enqueue_hrtimer(timer, new_base,
+			new_base->cpu_base == &__get_cpu_var(hrtimer_bases));

 	unlock_hrtimer_base(timer, &flags);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/