Message-ID: <alpine.LNX.2.00.1304151425540.1030@eggly.anvils>
Date: Mon, 15 Apr 2013 14:47:25 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Vivek Goyal <vgoyal@...hat.com>
cc: Michel Lespinasse <walken@...gle.com>,
linux kernel mailing list <linux-kernel@...r.kernel.org>,
Rik van Riel <riel@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on
CPU due to 09a9f1d27
On Mon, 15 Apr 2013, Vivek Goyal wrote:
> On Mon, Apr 15, 2013 at 01:59:29PM -0400, Vivek Goyal wrote:
> > On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> > > On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > >
> > > [..]
> > > > > My first guess would be that mmap_sem is held during exec, so you
> > > > > can't have __mm_populate() try holding it recursively.
> > > >
> > > > I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> > > > and things are fine.
> > > >
> > > > So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
> > > > VM_POPULATE specified). I will do git bisect and try to figure out which
> > > > is first commit which has the issue.
> > >
> > > Ok, following seems to be first bad commit.
> > >
> > > commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
> > > Author: Michel Lespinasse <walken@...gle.com>
> > > Date: Fri Feb 22 16:32:37 2013 -0800
> > >
> > > mm: introduce mm_populate() for populating new vmas
> > >
>
> Michel,
>
> An interesting observation. After this commit looks like simple
> mmap(MAP_LOCKED) of a file was broken and it would hang and give RCU stall
> warning similar to my patch of locking /sbin/kexec.
>
> But in latest kernel mmap(MAP_LOCKED) does not hang. So looks like
> this problem got fixed in a patch after this first bad commit. But
> locking /sbin/kexec issue still remains.

I haven't tried to understand that. But I did just try your
def_flags |= VM_LOCKED hack to fs/binfmt_elf.c, and CONFIG_DEBUG_VM=y
quickly suggested the patch below - without the BUG, yes, __mm_populate
might well loop forever trying to populate 0 pages.

Whether a fix is actually needed, and whether it should be fixed here
or elsewhere, I'll leave to Michel.

Hugh

--- 3.9-rc7/mm/mlock.c	2013-04-01 09:08:05.736012852 -0700
+++ linux/mm/mlock.c	2013-04-15 14:20:24.454773245 -0700
@@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
 	long ret = 0;
 
 	VM_BUG_ON(start & ~PAGE_MASK);
-	VM_BUG_ON(len != PAGE_ALIGN(len));
-	end = start + len;
+	end = start + PAGE_ALIGN(len);
 
 	for (nstart = start; nstart < end; nstart = nend) {
 		/*
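
To illustrate why an unaligned len could make the loop spin on "0 pages",
here is a minimal user-space sketch. It is a simplified model, not the kernel
code: PAGE_SIZE is fixed at 4096, simulate_populate() and its max_iters cap
are hypothetical names invented for the illustration, and the model assumes
that each pass populates only whole pages (integer division of the remaining
range) and then advances nstart by that many pages.

```c
/* Simplified model; the real kernel macros depend on the architecture. */
#define PAGE_SIZE	4096UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & PAGE_MASK)

/*
 * Hypothetical stand-in for the populate loop.
 * Returns the number of passes needed to reach 'end', or -1 if the loop
 * is still running after max_iters passes (i.e. it stopped making progress).
 */
static int simulate_populate(unsigned long start, unsigned long end,
			     int max_iters)
{
	unsigned long nstart, nend;
	int iters = 0;

	for (nstart = start; nstart < end; nstart = nend) {
		/* Assumption: only whole pages can be populated per pass. */
		unsigned long nr_pages = (end - nstart) / PAGE_SIZE;

		if (++iters > max_iters)
			return -1;	/* no forward progress: would stall */

		/* With nr_pages == 0, nend == nstart and we never advance. */
		nend = nstart + nr_pages * PAGE_SIZE;
	}
	return iters;
}
```

In this model, simulate_populate(0x1000, 0x1000 + 0x1800, 1000) populates one
page, is then left with a half-page tail that rounds down to 0 pages, and
hits the iteration cap. Passing 0x1000 + PAGE_ALIGN(0x1800) as the end
instead completes in a single pass.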