linux-kernel - Re: rcu: BUG on exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120504053331.GA16836@linux.vnet.ibm.com>
Date:	Thu, 3 May 2012 22:33:31 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Sasha Levin <levinsasha928@...il.com>
Cc:	"linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...hat.com>, yinghan@...gle.com,
	kosaki.motohiro@...fujitsu.com,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: rcu: BUG on exit_group

On Fri, May 04, 2012 at 06:08:34AM +0200, Sasha Levin wrote:
> On Thu, May 3, 2012 at 7:01 PM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
> > On Thu, May 03, 2012 at 05:55:14PM +0200, Sasha Levin wrote:
> >> On Thu, May 3, 2012 at 5:41 PM, Paul E. McKenney
> >> <paulmck@...ux.vnet.ibm.com> wrote:
> >> > On Thu, May 03, 2012 at 10:57:19AM +0200, Sasha Levin wrote:
> >> >> Hi Paul,
> >> >>
> >> >> I've hit a BUG similar to the schedule_tail() one when. It happened
> >> >> when I've started fuzzing exit_group() syscalls, and all of the traces
> >> >> are starting with exit_group() (there's a flood of them).
> >> >>
> >> >> I've verified that it indeed BUGs due to the rcu preempt count.
> >> >
> >> > Hello, Sasha,
> >> >
> >> > Which version of -next are you using?  I did some surgery on this
> >> > yesterday based on some bugs Hugh Dickins tracked down, so if you
> >> > are using something older, please move to the current -next.
> >>
> >> I'm using -next from today (3.4.0-rc5-next-20120503-sasha-00002-g09f55ae-dirty).
> >
> > Hmmm...  Looking at this more closely, it looks like there really is
> > an attempt to acquire a mutex within an RCU read-side critical section,
> > which is illegal.  Could you please bisect this?
> 
> Right, the issue is as you described, taking a mutex inside rcu_read_lock().
> 
> The offending commit is (I've cc'ed all parties from it):
> 
> commit adf79cc03092ee4aec70da10e91b05fb8116ac7b
> Author: Ying Han <yinghan@...gle.com>
> Date:   Thu May 3 15:44:01 2012 +1000
> 
>     memcg: add mlock statistic in memory.stat
> 
> With the issue there being is that in munlock_vma_page(), it now does
> a mem_cgroup_begin_update_page_stat() which takes the rcu_read_lock(),
> so when the older code that was there previously will try taking a
> mutex you'll get a BUG.

Hmmm...  One approach would be to switch from rcu_read_lock() to
srcu_read_lock(), though this means carrying the index returned from
the srcu_read_lock() to the matching srcu_read_unlock() -- and making
the update side use synchronize_srcu() rather than synchronize_rcu().
Alternatively, it might be possible to defer acquiring the lock until
after exiting the RCU read-side critical section, but I don't know enough
about mm to even guess whether this might be possible.

There are probably other approaches as well...

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/