Message-ID: <20120106104658.GH27881@csn.ul.ie>
Date: Fri, 6 Jan 2012 10:46:58 +0000
From: Mel Gorman <mel@....ul.ie>
To: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Russell King - ARM Linux <linux@....linux.org.uk>,
KOSAKI Motohiro <kosaki.motohiro@...il.com>,
Gilad Ben-Yossef <gilad@...yossef.com>,
linux-kernel@...r.kernel.org, Chris Metcalf <cmetcalf@...era.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Frederic Weisbecker <fweisbec@...il.com>, linux-mm@...ck.org,
Pekka Enberg <penberg@...nel.org>,
Matt Mackall <mpm@...enic.com>,
Sasha Levin <levinsasha928@...il.com>,
Rik van Riel <riel@...hat.com>,
Andi Kleen <andi@...stfloor.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Greg KH <gregkh@...e.de>, linux-fsdevel@...r.kernel.org,
Avi Kivity <avi@...hat.com>
Subject: Re: [PATCH v5 7/8] mm: Only IPI CPUs to drain local pages if they exist
On Fri, Jan 06, 2012 at 11:36:11AM +0530, Srivatsa S. Bhat wrote:
> On 01/06/2012 03:51 AM, Mel Gorman wrote:
>
> > (Adding Greg to cc to see if he recalls seeing issues with sysfs dentry
> > suffering from recursive locking recently)
> >
> > On Thu, Jan 05, 2012 at 10:35:04AM -0800, Paul E. McKenney wrote:
> >> On Thu, Jan 05, 2012 at 04:35:29PM +0000, Russell King - ARM Linux wrote:
> >>> On Thu, Jan 05, 2012 at 04:17:39PM +0000, Mel Gorman wrote:
> >>>> Link please?
> >>>
> >>> Forwarded, as it's still in my mailbox.
> >>>
> >>>> I'm including a patch below under development that is
> >>>> intended to only cope with the page allocator case under heavy memory
> >>>> pressure. Currently it does not pass testing because eventually RCU
> >>>> gets stalled with the following trace:
> >>>>
> >>>> [ 1817.176001] [<ffffffff810214d7>] arch_trigger_all_cpu_backtrace+0x87/0xa0
> >>>> [ 1817.176001] [<ffffffff810c4779>] __rcu_pending+0x149/0x260
> >>>> [ 1817.176001] [<ffffffff810c48ef>] rcu_check_callbacks+0x5f/0x110
> >>>> [ 1817.176001] [<ffffffff81068d7f>] update_process_times+0x3f/0x80
> >>>> [ 1817.176001] [<ffffffff8108c4eb>] tick_sched_timer+0x5b/0xc0
> >>>> [ 1817.176001] [<ffffffff8107f28e>] __run_hrtimer+0xbe/0x1a0
> >>>> [ 1817.176001] [<ffffffff8107f581>] hrtimer_interrupt+0xc1/0x1e0
> >>>> [ 1817.176001] [<ffffffff81020ef3>] smp_apic_timer_interrupt+0x63/0xa0
> >>>> [ 1817.176001] [<ffffffff81449073>] apic_timer_interrupt+0x13/0x20
> >>>> [ 1817.176001] [<ffffffff8116c135>] vfsmount_lock_local_lock+0x25/0x30
> >>>> [ 1817.176001] [<ffffffff8115c855>] path_init+0x2d5/0x370
> >>>> [ 1817.176001] [<ffffffff8115eecd>] path_lookupat+0x2d/0x620
> >>>> [ 1817.176001] [<ffffffff8115f4ef>] do_path_lookup+0x2f/0xd0
> >>>> [ 1817.176001] [<ffffffff811602af>] user_path_at_empty+0x9f/0xd0
> >>>> [ 1817.176001] [<ffffffff81154e7b>] vfs_fstatat+0x4b/0x90
> >>>> [ 1817.176001] [<ffffffff81154f4f>] sys_newlstat+0x1f/0x50
> >>>> [ 1817.176001] [<ffffffff81448692>] system_call_fastpath+0x16/0x1b
> >>>>
> >>>> It might be a separate bug, don't know for sure.
> >>
> >
> > I rebased the patch on top of 3.2 and tested again with a bunch of
> > debugging options set (PROVE_RCU, PROVE_LOCKING etc). Same results. CPU
> > hotplug is a lot more reliable and less likely to hang but eventually
> > gets into trouble.
> >
>
> I was running some CPU hotplug stress tests recently and found it to be
> problematic too. Mel, I have some logs from those tests which appear very
> relevant to the "IPI to offline CPU" issue that has been discussed in this
> thread.
>
> Kernel: 3.2-rc7
> Here is the log:
> (Unfortunately I couldn't capture the log intact, due to some annoying
> serial console issues, but I hope this log is good enough to analyze.)
>
Ok, it looks vaguely similar to what I'm seeing. I think I spotted
the sysfs problem as well and am testing a series. I'll add you to
the cc if it passes tests locally.
Thanks.
--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/