Date:	Tue, 26 Aug 2008 06:40:30 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Christoph Lameter <cl@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Ingo Molnar <mingo@...e.hu>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Andi Kleen <andi@...stfloor.org>,
	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] smp_call_function: use rwlocks on queues
	rather than rcu

On Tue, Aug 26, 2008 at 03:13:40PM +1000, Nick Piggin wrote:
> On Tuesday 26 August 2008 01:46, Christoph Lameter wrote:
> > Peter Zijlstra wrote:
> > > If we combine these two cases, and flip the counter as soon as we've
> > > enqueued one callback, unless we're already waiting for a grace period
> > > to end - which gives us a longer window to collect callbacks.
> > >
> > > And then the rcu_read_unlock() can do:
> > >
> > >   if (dec_and_zero(my_counter) && my_index == dying)
> > >     raise_softirq(RCU)
> > >
> > > to fire off the callback stuff.
> > >
> > > /me ponders - there must be something wrong with that...
> > >
> > > Aaah, yes, the dec_and_zero is non-trivial because it's a
> > > distributed counter. Bugger..
> >
> > Then let's make it per-CPU. If we get the cpu ops in, then dec_and_zero
> > would be very cheap.
> 
> Let's be very careful before making RCU read locks costly. Any reduction
> in grace periods would be great, but IMO RCU should not be used in cases
> where performance depends on the freed data remaining in cache.

Indeed!

But if you were in a situation where read-side overhead was irrelevant
(perhaps a mythical machine with zero-cost atomics and cache misses),
then one approach would be to combine Oleg Nesterov's QRCU with the
callback processing from Andrea Arcangeli's implementation from the 2001
timeframe.  Of course, if your cache misses really were zero cost,
then you wouldn't care about the data remaining in cache.  So maybe
a machine where cache misses to other CPUs' caches are free, but misses
to main memory are horribly expensive?

Anyway, the trick would be to adapt QRCU (http://lkml.org/lkml/2007/2/25/18)
to store the index in the task structure (as opposed to returning it
from rcu_read_lock()), and have a single global queue of callbacks,
guarded by a global lock.  Then rcu_read_unlock() can initiate callback
processing if the counter decrements down to zero, and call_rcu() would
also initiate a counter switch in the case where the non-current counter
was zero -- and this operation would be guarded by the same lock that
guards the callback queue.
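
To make that concrete, here is a rough userspace model of the scheme just
described.  It is illustrative only -- not Oleg's actual QRCU and not
anything in the kernel tree -- with a __thread variable standing in for
the index stored in the task structure, a pthread mutex standing in for
the global lock, and made-up names (qrcu_cb, call_qrcu, and friends).
It also leans on C11 seq_cst atomics in place of the explicit memory
barriers and counter tricks a real implementation would need:

#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct qrcu_cb {
	void (*func)(struct qrcu_cb *cb);
	struct qrcu_cb *next;
};

/* Two reader counters; the current one carries a +1 bias so it never drains. */
static atomic_int qrcu_ctr[2] = { 1, 0 };
static atomic_int qrcu_idx;		/* which counter new readers use */

/* Single global callback queue, guarded by a single global lock. */
static pthread_mutex_t qrcu_lock = PTHREAD_MUTEX_INITIALIZER;
static struct qrcu_cb *next_cbs;	/* queued since the last counter switch */
static struct qrcu_cb *wait_cbs;	/* waiting for the retired counter to drain */

static __thread int my_index;		/* stand-in for a task-structure field */

static void qrcu_read_lock(void)
{
	for (;;) {
		int idx = atomic_load(&qrcu_idx);
		int old = atomic_load(&qrcu_ctr[idx]);

		/* Model of atomic_inc_not_zero(): only join a still-live counter. */
		while (old != 0) {
			if (atomic_compare_exchange_weak(&qrcu_ctr[idx],
							 &old, old + 1)) {
				my_index = idx;
				return;
			}
		}
	}
}

static void qrcu_process_callbacks(void)
{
	struct qrcu_cb *list;

	pthread_mutex_lock(&qrcu_lock);
	list = wait_cbs;
	wait_cbs = NULL;
	pthread_mutex_unlock(&qrcu_lock);

	while (list) {
		struct qrcu_cb *next = list->next;

		list->func(list);
		list = next;
	}
}

static void qrcu_read_unlock(void)
{
	/* Only the retired counter can reach zero; its last reader runs callbacks. */
	if (atomic_fetch_sub(&qrcu_ctr[my_index], 1) == 1)
		qrcu_process_callbacks();
}

static void call_qrcu(struct qrcu_cb *cb, void (*func)(struct qrcu_cb *))
{
	pthread_mutex_lock(&qrcu_lock);
	cb->func = func;
	cb->next = next_cbs;
	next_cbs = cb;

	/*
	 * Switch counters only if the previous batch has been processed
	 * and the non-current counter has drained to zero.
	 */
	int idx = atomic_load(&qrcu_idx);
	if (wait_cbs == NULL && atomic_load(&qrcu_ctr[!idx]) == 0) {
		wait_cbs = next_cbs;
		next_cbs = NULL;
		atomic_fetch_add(&qrcu_ctr[!idx], 1);	/* bias the new current counter */
		atomic_store(&qrcu_idx, !idx);
		pthread_mutex_unlock(&qrcu_lock);
		/* Drop the old bias; if no readers remain, run callbacks right away. */
		if (atomic_fetch_sub(&qrcu_ctr[idx], 1) == 1)
			qrcu_process_callbacks();
		return;
	}
	pthread_mutex_unlock(&qrcu_lock);
}

The single lock and the shared counters are, of course, exactly the sort
of read-side cost that Nick is objecting to above.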

But I doubt that this would be satisfactory on 4,096-CPU machines.
At least not in most cases.  ;-)

							Thanx, Paul
