[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1276764554.27822.26.camel@twins>
Date: Thu, 17 Jun 2010 10:49:14 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: paulmck@...ux.vnet.ibm.com
Cc: Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, tglx@...utronix.de,
daniel.blueman@...il.com, lizf@...fujitsu.com,
miles.lane@...il.com, manfred@...orfullife.com
Subject: Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote:
> Hello, Peter!
>
> Here is the story as I understand it:
>
> o wake_affine() calls task_group() and uses the resulting
> pointer, for example, passing it to effective_load().
>
> This pointer is to a struct task_group, which contains
> a struct rcu_head, which is passed to call_rcu in
> sched_destroy_group(). So some protection really is
> needed -- or is it enough that wake_affine seems to be
> invoked on the current task? If the latter, we would
> need to add a "task == current" check to task_subsys_state().
>
> o task_group() calls task_subsys_state(), returning a pointer to
> the enclosing task_group structure.
>
> o task_subsys_state() returns an rcu_dereference_check()ed
> pointer. The caller must either be in an RCU read-side
> critical section, hold the ->alloc_lock, or hold the
> cgroup lock.
>
> Now wake_affine() appears to be doing load calculations, so it does not
> seem reasonable to acquire the lock. Hence the use of RCU.
>
> So, what should we be doing instead? ;-)
Well, start by writing a sane changlog ;-)
I realise you didn't actually wrote these patches, but you should push
back to the people feeding you these things (esp when you get gems like:
tg = task_group();
rcu_read_unlock();
which is obvious utter garbage).
There's _two_ task_group() users in wake_affine(), at least one should
be covered by the rq->lock we're holding. It should then explain why the
other isn't covered (and which the other is).
It should also explain why using RCU read lock is the right solution,
and doesn't result in funny races. That is, the current changelog reads
like: "It whines, this makes it quiet." -- which I totally distrust
because we already found at least two actual bugs in this area
(sched-cgroup rcu usage).
That said, the two patches together might not be wrong, but its very
hard to verify without more information.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists