Message-ID: <20100201101142.GE12759@laptop>
Date: Mon, 1 Feb 2010 21:11:42 +1100
From: Nick Piggin <npiggin@...e.de>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Linus Torvalds <torvalds@...ux-foundation.org>,
akpm@...ux-foundation.org, Ingo Molnar <mingo@...e.hu>,
linux-kernel@...r.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
dipankar@...ibm.com, josh@...htriplett.org, dvhltc@...ibm.com,
niv@...ibm.com, tglx@...utronix.de, Valdis.Kletnieks@...edu,
dhowells@...hat.com
Subject: Re: [patch 2/3] scheduler: add full memory barriers upon task
switch at runqueue lock/unlock
On Mon, Feb 01, 2010 at 10:42:30AM +0100, Peter Zijlstra wrote:
> On Mon, 2010-02-01 at 18:33 +1100, Nick Piggin wrote:
> > > Adds no overhead on x86, because LOCK-prefixed atomic operations of the spin
> > > lock/unlock already imply a full memory barrier. Combines the spin lock
> > > acquire/release barriers with the full memory barrier to diminish the
> > > performance impact on other architectures. (per-architecture spinlock-mb.h
> > > should be gradually implemented to replace the generic version)
> >
> > It does add overhead on x86, as well as most other architectures.
> >
> > This really seems like the wrong optimisation to make, especially
> > given that there's not likely to be much using librcu yet, right?
> >
> > I'd go with the simpler and safer version of sys_membarrier that does
> > not do tricky synchronisation or add overhead to the ctxsw fastpath.
> > Then if you see some actual improvement in a real program using librcu
> > one day we can discuss making it faster.
> >
> > As it is right now, the change will definitely slow down everybody
> > not using librcu (ie. nearly everything).
>
> Right, so the problem with the 'slow'/'safe' version is that it takes
> rq->lock for all relevant rqs. This turns a while (1) sys_membarrier()
> loop into a quite effective DoS.
All, but one at a time, no? How much of a DoS really is taking these
locks for a handful of cycles each, per syscall?
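Just to be concrete, the simple version I have in mind is something
like the sketch below (illustrative only -- cpu_rq() and rq->lock are
scheduler-internal, hotplug and error handling are hand-waved, and the
exact lock flavour doesn't matter for the argument):

	SYSCALL_DEFINE0(membarrier)
	{
		int cpu;

		smp_mb();	/* order caller's prior accesses */

		for_each_online_cpu(cpu) {
			struct rq *rq = cpu_rq(cpu);

			/*
			 * Bouncing through rq->lock serializes us
			 * against any context switch on this CPU,
			 * one runqueue at a time.
			 */
			raw_spin_lock_irq(&rq->lock);
			raw_spin_unlock_irq(&rq->lock);
		}

		smp_mb();	/* order caller's subsequent accesses */
		return 0;
	}

Each lock is held for only a few cycles, and never more than one at a
time.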
I mean, we have LOTS of syscalls that take locks, and for a lot longer
(look at dcache_lock).
I think we basically just have to say that as long as locking
primitives are somewhat fair and not held for too long, this should
more or less work.
If the locks are getting contended, then the threads calling
sys_membarrier are going to be spinning longer too, using more CPU time,
and will get scheduled away...
If there is some particular problem on -rt because of the rq locks,
then I guess you could consider whether to add more overhead to your
ctxsw path to reduce the problem, or simply not support sys_membarrier
for unprivileged users in the first place.
> Now, I'm not quite charmed by all this. Esp. this patch seems wrong. The
> fact is on x86 we have all the required membarriers in place.
>
> There are a number of LOCK insns before we set rq->curr and we have them
> after. Adding more, like this patch effectively does
> (smp_mb__{before,after}_unlock should be a full mb as Nick pointed out)
> doesn't seem like a good idea at all.
>
> And then there's !x86 to consider.
Yep.
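
To make the objection concrete: the generic spinlock-mb.h fallback the
changelog alludes to presumably looks something like this (my guess at
the shape, not the actual patch):

	/* asm-generic fallback; an arch overrides what it can elide */
	#ifndef smp_mb__before_unlock
	#define smp_mb__before_unlock()	smp_mb()
	#endif
	#ifndef smp_mb__after_unlock
	#define smp_mb__after_unlock()	smp_mb()
	#endif

The lock-side hints can indeed be no-ops on x86, since spin_lock is a
LOCK-prefixed op. But spin_unlock on x86 is a plain store, so the
unlock-side hints have to stay full mbs there, which is exactly the
overhead the changelog claims doesn't exist.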