Message-ID: <20110324174803.GA18936@tsunami.ccur.com>
Date:	Thu, 24 Mar 2011 13:48:03 -0400
From:	Joe Korty <joe.korty@...r.com>
To:	paulmck@...ux.vnet.ibm.com
Cc:	fweisbec@...il.com, peterz@...radead.org, laijs@...fujitsu.com,
	mathieu.desnoyers@...icios.com, dhowells@...hat.com,
	loic.minier@...aro.org, dhaval.giani@...il.com, tglx@...utronix.de,
	josh@...htriplett.org, houston.jim@...cast.net,
	andi@...stfloor.org, linux-kernel@...r.kernel.org
Subject: [PATCH 19/24] jrcu: bugfix: init cpu wait state on every scan

jrcu: re-init the cpu wait state on every scan, not just on
scans that mark the beginning of a batch.

This fixes a hard-to-hit bug.  To have a chance of hitting
it, all of these conditions must hold: a cpu is running a
user application 100% of the time, the application makes
no system calls, and no interrupts of any type are
delivered to that cpu.

jrcu is designed to tolerate transitioning values of every
description being fuzzy for a while before they settle down.
Therefore, if a batch ends (and a new one starts) at about
the time a cpu is transitioning from a normal state to the
above-mentioned user-dedicated state, the value that cpu's
->wait state gets set to will be somewhat random.  That is,
most of the time it will be correct, but on occasion it
will take the opposite value.  This is OK and expected,
but for things to work we must periodically re-sample and
re-init the ->wait state, so that a later scan picks up
the correct value once it has become stable.

Without periodic re-sampling we could set ->wait = 1 when it
should be 0, and once it is 1 it stays 1, because a
user-dedicated cpu never crosses a quiescent point tap, and
those taps are what would set ->wait = 0.  JRCU then stops
advancing batches until the watchdog fires and tickles
the offending cpu.
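
To make the failure mode concrete, here is a minimal user-space
sketch of the two scan loops.  It is not the kernel code: every
sim_* name is hypothetical, and sim_sample_wait() is only a
stand-in for the real sampling expression used in the patch.
The scenario assumed is a cpu whose ->wait was mis-sampled as 1
and which then runs user code forever, so nothing clears it.

```c
#define SIM_NCPUS 4

/* Hypothetical stand-in for the per-cpu rcu_data entry. */
struct sim_rcu_data {
	int wait;	/* 1: the current batch must wait for this cpu */
	int in_kernel;	/* simulated: cpu is currently not quiescent */
};

static struct sim_rcu_data sim_data[SIM_NCPUS];

/* Hypothetical sampler; stands in for the expression in the patch. */
static int sim_sample_wait(int cpu)
{
	return sim_data[cpu].in_kernel;
}

/* Pre-patch behavior: a stale ->wait blocks end-of-batch forever. */
static int sim_scan_old(void)
{
	int eob = 1;

	for (int cpu = 0; cpu < SIM_NCPUS; cpu++) {
		if (sim_data[cpu].wait) {
			eob = 0;
			break;
		}
	}
	return eob;
}

/* Post-patch behavior: re-sample ->wait before trusting it. */
static int sim_scan_new(void)
{
	int eob = 1;

	for (int cpu = 0; cpu < SIM_NCPUS; cpu++) {
		struct sim_rcu_data *rd = &sim_data[cpu];

		if (rd->wait) {
			rd->wait = sim_sample_wait(cpu);
			if (rd->wait) {
				eob = 0;
				break;
			}
		}
	}
	return eob;
}

/* Set up one cpu with a stale ->wait = 1 while it sits in user mode. */
static void sim_reset_stuck_cpu(int stuck)
{
	for (int cpu = 0; cpu < SIM_NCPUS; cpu++) {
		sim_data[cpu].wait = 0;
		sim_data[cpu].in_kernel = 0;
	}
	sim_data[stuck].wait = 1;
}
```

After sim_reset_stuck_cpu(), sim_scan_old() reports "not end of
batch" on every call, while sim_scan_new() clears the stale flag
on its first pass and lets the batch end.  A cpu that really is
busy (in_kernel = 1) still holds the batch open in both versions.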

Signed-off-by: Joe Korty <joe.korty@...r.com>

Index: b/kernel/jrcu.c
===================================================================
--- a/kernel/jrcu.c
+++ b/kernel/jrcu.c
@@ -319,8 +319,11 @@ static void __rcu_delimit_batches(struct
 	for_each_online_cpu(cpu) {
 		rd = &rcu_data[cpu];
 		if (rd->wait) {
-			eob = 0;
-			break;
+			rd->wait = preempt_count_cpu(cpu) > idle_cpu(cpu);
+			if (rd->wait) {
+				eob = 0;
+				break;
+			}
 		}
 	}
 