linux-kernel - Re: rcu: BUG on exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120509135714.GC21152@linux.vnet.ibm.com>
Date:	Wed, 9 May 2012 06:57:14 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	Sasha Levin <levinsasha928@...il.com>,
	"linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...hat.com>, yinghan@...gle.com,
	kosaki.motohiro@...fujitsu.com,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: rcu: BUG on exit_group

On Wed, May 09, 2012 at 01:59:59PM +0900, KAMEZAWA Hiroyuki wrote:
> (2012/05/04 14:33), Paul E. McKenney wrote:
> 
> > On Fri, May 04, 2012 at 06:08:34AM +0200, Sasha Levin wrote:
> >> On Thu, May 3, 2012 at 7:01 PM, Paul E. McKenney
> >> <paulmck@...ux.vnet.ibm.com> wrote:
> >>> On Thu, May 03, 2012 at 05:55:14PM +0200, Sasha Levin wrote:
> >>>> On Thu, May 3, 2012 at 5:41 PM, Paul E. McKenney
> >>>> <paulmck@...ux.vnet.ibm.com> wrote:
> >>>>> On Thu, May 03, 2012 at 10:57:19AM +0200, Sasha Levin wrote:
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> I've hit a BUG similar to the schedule_tail() one when. It happened
> >>>>>> when I've started fuzzing exit_group() syscalls, and all of the traces
> >>>>>> are starting with exit_group() (there's a flood of them).
> >>>>>>
> >>>>>> I've verified that it indeed BUGs due to the rcu preempt count.
> >>>>>
> >>>>> Hello, Sasha,
> >>>>>
> >>>>> Which version of -next are you using?  I did some surgery on this
> >>>>> yesterday based on some bugs Hugh Dickins tracked down, so if you
> >>>>> are using something older, please move to the current -next.
> >>>>
> >>>> I'm using -next from today (3.4.0-rc5-next-20120503-sasha-00002-g09f55ae-dirty).
> >>>
> >>> Hmmm...  Looking at this more closely, it looks like there really is
> >>> an attempt to acquire a mutex within an RCU read-side critical section,
> >>> which is illegal.  Could you please bisect this?
> >>
> >> Right, the issue is as you described, taking a mutex inside rcu_read_lock().
> >>
> >> The offending commit is (I've cc'ed all parties from it):
> >>
> >> commit adf79cc03092ee4aec70da10e91b05fb8116ac7b
> >> Author: Ying Han <yinghan@...gle.com>
> >> Date:   Thu May 3 15:44:01 2012 +1000
> >>
> >>     memcg: add mlock statistic in memory.stat
> >>
> >> With the issue there being is that in munlock_vma_page(), it now does
> >> a mem_cgroup_begin_update_page_stat() which takes the rcu_read_lock(),
> >> so when the older code that was there previously will try taking a
> >> mutex you'll get a BUG.
> > 
> > Hmmm...  One approach would be to switch from rcu_read_lock() to
> > srcu_read_lock(), though this means carrying the index returned from
> > the srcu_read_lock() to the matching srcu_read_unlock() -- and making
> > the update side use synchronize_srcu() rather than synchronize_rcu().
> > Alternatively, it might be possible to defer acquiring the lock until
> > after exiting the RCU read-side critical section, but I don't know enough
> > about mm to even guess whether this might be possible.
> > 
> > There are probably other approaches as well...
> 
> 
> How about this ?

That looks to me to avoid acquiring the mutex within an RCU read-side
critical section, so good.  I have to defer to you guys on whether the
placement of the mem_cgroup_end_update_page_stat() works.

							Thanx, Paul

> ==
> [PATCH] memcg: fix taking mutex under rcu at munlock
> 
> Following bug was reported because mutex is held under rcu_read_lock().
> 
> [   83.820976] BUG: sleeping function called from invalid context at
> kernel/mutex.c:269
> [   83.827870] in_atomic(): 0, irqs_disabled(): 0, pid: 4506, name: trinity
> [   83.832154] 1 lock held by trinity/4506:
> [   83.834224]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff811a7d87>]
> munlock_vma_page+0x197/0x200
> [   83.839310] Pid: 4506, comm: trinity Tainted: G        W
> 3.4.0-rc5-next-20120503-sasha-00002-g09f55ae-dirty #108
> [   83.849418] Call Trace:
> [   83.851182]  [<ffffffff810e7218>] __might_sleep+0x1f8/0x210
> [   83.854076]  [<ffffffff82d9540a>] mutex_lock_nested+0x2a/0x50
> [   83.857120]  [<ffffffff811b0830>] try_to_unmap_file+0x40/0x2f0
> [   83.860242]  [<ffffffff82d984bb>] ? _raw_spin_unlock_irq+0x2b/0x80
> [   83.863423]  [<ffffffff810e7ffe>] ? sub_preempt_count+0xae/0xf0
> [   83.866347]  [<ffffffff82d984e9>] ? _raw_spin_unlock_irq+0x59/0x80
> [   83.869570]  [<ffffffff811b0caa>] try_to_munlock+0x6a/0x80
> [   83.872667]  [<ffffffff811a7cc6>] munlock_vma_page+0xd6/0x200
> [   83.875646]  [<ffffffff811a7d87>] ? munlock_vma_page+0x197/0x200
> [   83.878798]  [<ffffffff811a7e7f>] munlock_vma_pages_range+0x8f/0xd0
> [   83.882235]  [<ffffffff811a8b8a>] exit_mmap+0x5a/0x160
> 
> This bug was introduced by mem_cgroup_begin/end_update_page_stat()
> which uses rcu_read_lock(). This patch fixes the bug by modifying
> the range of rcu_read_lock().
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> ---
>  mm/mlock.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 2fd967a..05ac10d1 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -123,6 +123,7 @@ void munlock_vma_page(struct page *page)
>  	if (TestClearPageMlocked(page)) {
>  		dec_zone_page_state(page, NR_MLOCK);
>  		mem_cgroup_dec_page_stat(page, MEMCG_NR_MLOCK);
> +		mem_cgroup_end_update_page_stat(page, &locked, &flags);
>  		if (!isolate_lru_page(page)) {
>  			int ret = SWAP_AGAIN;
> 
> @@ -154,8 +155,8 @@ void munlock_vma_page(struct page *page)
>  			else
>  				count_vm_event(UNEVICTABLE_PGMUNLOCKED);
>  		}
> -	}
> -	mem_cgroup_end_update_page_stat(page, &locked, &flags);
> +	} else
> +		mem_cgroup_end_update_page_stat(page, &locked, &flags);
>  }
> 
>  /**
> -- 
> 1.7.4.1
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/