lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1276764554.27822.26.camel@twins>
Date:	Thu, 17 Jun 2010 10:49:14 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, tglx@...utronix.de,
	daniel.blueman@...il.com, lizf@...fujitsu.com,
	miles.lane@...il.com, manfred@...orfullife.com
Subject: Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes

On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote:

> Hello, Peter!
> 
> Here is the story as I understand it:
> 
> o	wake_affine() calls task_group() and uses the resulting
> 	pointer, for example, passing it to effective_load().
> 
> 	This pointer is to a struct task_group, which contains
> 	a struct rcu_head, which is passed to call_rcu in
> 	sched_destroy_group().  So some protection really is
> 	needed -- or is it enough that wake_affine seems to be
> 	invoked on the current task?  If the latter, we would
> 	need to add a "task == current" check to task_subsys_state().
> 
> o	task_group() calls task_subsys_state(), returning a pointer to
> 	the enclosing task_group structure.
> 
> o	task_subsys_state() returns an rcu_dereference_check()ed
> 	pointer.  The caller must either be in an RCU read-side
> 	critical section, hold the ->alloc_lock, or hold the
> 	cgroup lock.
> 
> Now wake_affine() appears to be doing load calculations, so it does not
> seem reasonable to acquire the lock.  Hence the use of RCU.
> 
> So, what should we be doing instead?  ;-)

Well, start by writing a sane changlog ;-)

I realise you didn't actually wrote these patches, but you should push
back to the people feeding you these things (esp when you get gems like:

  tg = task_group();
  rcu_read_unlock();

which is obvious utter garbage).

There's _two_ task_group() users in wake_affine(), at least one should
be covered by the rq->lock we're holding. It should then explain why the
other isn't covered (and which the other is).

It should also explain why using RCU read lock is the right solution,
and doesn't result in funny races. That is, the current changelog reads
like: "It whines, this makes it quiet." -- which I totally distrust
because we already found at least two actual bugs in this area
(sched-cgroup rcu usage).

That said, the two patches together might not be wrong, but its very
hard to verify without more information.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ