Message-ID: <BANLkTi=c=+JnU9dQsVhgjZbOtArhHQz+EQ@mail.gmail.com>
Date: Thu, 21 Apr 2011 12:24:58 +0200
From: Sedat Dilek <sedat.dilek@...glemail.com>
To: paulmck@...ux.vnet.ibm.com
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
linux-next@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: Tree for April 14 (Call-traces: RCU/ACPI/WQ related?)
On Thu, Apr 21, 2011 at 11:07 AM, Sedat Dilek
<sedat.dilek@...glemail.com> wrote:
> On Thu, Apr 21, 2011 at 7:08 AM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
>> On Thu, Apr 14, 2011 at 03:44:11PM -0700, Paul E. McKenney wrote:
>>> On Fri, Apr 15, 2011 at 12:19:34AM +0200, Sedat Dilek wrote:
>>> > On Thu, Apr 14, 2011 at 12:19 PM, Sedat Dilek
>>> > <sedat.dilek@...glemail.com> wrote:
>>> > > On Thu, Apr 14, 2011 at 11:16 AM, Sedat Dilek
>>> > > <sedat.dilek@...glemail.com> wrote:
>>> > >> [ Adding CC to RCU maintainer (Hi Paul :-)) ]
>>> > >>
>>> > >> What helps me for now (see also Documentation/RCU/stallwarn.txt):
>>> > >>
>>> > >> # cat /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>> > >> 0
>>> > >>
>>> > >> # echo "1" > /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>> > >>
>>> > >> # cat /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>> > >> 1
>>> > >>
>>> > >> - Sedat -
>>> > >>
>>> > >
>>> > > That workaround helped until the system froze while I was generating
>>> > > a tarball of my current kernel tree.
>>> > > I switched back to yesterday's linux-next kernel.
>>> > >
>>> > > - Sedat -
>>> > >
>>> >
>>> > I isolated the culprit so far:
>>> >
>>> > commit 900507fc62d5ba0164c07878dbc36ac97866a858
>>> > "rcu: move TREE_RCU from softirq to kthread"
>>> >
>>> > With this commit reverted, my system does not show the symptoms I reported.
>>>
>>> Hmmm... I never was able to reproduce this, but did find a workload
>>> that slowed up the grace periods. I fixed that (which turned out to
>>> be a wakeup problem), but my hopes that it would also fix your problem
>>> were clearly unfounded. I have once again stopped exporting this commit
>>> to -next.
>>
>> I have added some debug tracing, which is available at branch
>> "sedat.2011.04.19a" in the git repository at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
>>
>> Alternatively, if it is easier, the patch shown below can be used. FWIW,
>> this patch is against 2.6.39-rc3.
>>
>> Either way, if you get a chance to run your tests on this, could you
>> please run the attached script (collectdebugfs.sh) and capture its output?
>> Sample output is attached as well (collectdebugfs.sh.out): the script
>> should output something vaguely like the sample output every 15 seconds
>> or so.
>>
>> The script assumes that debugfs is enabled (along with CONFIG_RCU_TRACE=y)
>> and mounted as follows:
>>
>> mount -t debugfs none /sys/kernel/debug/
>>
>> Or if you mount debugfs somewhere else, please set the script's DEBUGFS_MP
>> variable accordingly.
>>
>> Thanx, Paul
>>
>> ------------------------------------------------------------------------
>>
>
> Welcome to operation "Kill that RCU brainbug" (Starship troopers part X)!
>
> Of course I can help with testing.
>
> Paul, did you see recent RCU-related fixes to fs between rc3 and rc4?
>
> commit c1530019e311c91d14b24d8e74d233152d806e45
> vfs: Fix absolute RCU path walk failures due to uninitialized seq number
>
> fff3e5ade4455a4b42a19c95dd7a167a3cb7956a
> fs: synchronize_rcu when unregister_filesystem success not failure
>
> IIRC, Jens has pending block/plugging patches in his for-linus tree.
> Especially this one (CONFIG_PREEMPT):
>
> 5f45c69589b7d2953584e6cd0b31e35dbe960ad0
> cfq-iosched: read_lock() does not always imply rcu_read_lock()
>
> Some questions about the test scenario:
>
> Shall I test from the linux-2.6-rcu.git#sedat.2011.04.19a Git tree?
> I think that's the ideal solution.
> Or shall I pull the sedat.2011.04.19a branch into the "BROKEN" linux-next
> (next-20110414)?
>
> Also, which RCU/HZ/PREEMPT kernel-config options shall I test with?
> This is from yesterday's linux-next build:
>
> # egrep 'RCU|_HZ |PREEMPT' /boot/config-2.6.39-rc4-next20110420.4-686-small
> # RCU Subsystem
> CONFIG_TREE_RCU=y
> # CONFIG_PREEMPT_RCU is not set
> CONFIG_RCU_TRACE=y
> CONFIG_RCU_FANOUT=32
> # CONFIG_RCU_FANOUT_EXACT is not set
> CONFIG_RCU_FAST_NO_HZ=y
> CONFIG_TREE_RCU_TRACE=y
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set
> # CONFIG_SPARSE_RCU_POINTER is not set
> CONFIG_RCU_TORTURE_TEST=m
> CONFIG_RCU_CPU_STALL_TIMEOUT=60
>
> Regards,
> - Sedat -
>
Looks like you want me to test with RCU_BOOST and RCU_TORTURE_TEST :-).
Attached are collectdebugfs-dileks.log, my current kernel config, and a
build script that generates Debian packages. The log was captured with:
$ LANG=C ./collectdebugfs.sh 2>&1 | tee collectdebugfs-dileks.log
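
For anyone repeating this run: the script expects debugfs to be mounted, so a check along these lines may be worth doing first. This is only a sketch. find_debugfs_mp is a hypothetical helper name (it is not part of collectdebugfs.sh), and it assumes /proc/mounts uses the usual "device mountpoint fstype ..." column layout.

```shell
#!/bin/sh
# Print the mount point of the first debugfs entry in a mounts table.
# $1: mounts file to read (defaults to /proc/mounts).
# Hypothetical helper; returns non-zero if no debugfs entry is found.
find_debugfs_mp() {
    awk '$3 == "debugfs" { print $2; found = 1; exit }
         END { exit !found }' "${1:-/proc/mounts}"
}

if mp=$(find_debugfs_mp); then
    echo "debugfs is mounted at $mp"
else
    # Mount command as given earlier in this thread.
    echo "debugfs not mounted; try: mount -t debugfs none /sys/kernel/debug/" >&2
fi
```

If the reported mount point is not /sys/kernel/debug, the script's DEBUGFS_MP variable would need to be set to match.
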
I will do a 2nd run with PREEMPT_RCU enabled.
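
Before that second run, a quick check that the booted kernel really has the intended options set could look like the following. Again just a sketch: config_enabled is a hypothetical helper, and the /boot/config-$(uname -r) path is an assumption based on the Debian-style layout used above.

```shell
#!/bin/sh
# Succeed if an option is set to y or m in a kernel config file.
# $1: config file, $2: option name without the CONFIG_ prefix.
# Hypothetical helper for illustration only.
config_enabled() {
    grep -Eq "^CONFIG_$2=(y|m)\$" "$1"
}

# Assumed location of the running kernel's config (Debian-style).
cfg="/boot/config-$(uname -r)"
for opt in TREE_RCU PREEMPT_RCU RCU_TRACE; do
    if [ -r "$cfg" ] && config_enabled "$cfg" "$opt"; then
        echo "CONFIG_$opt is enabled"
    else
        echo "CONFIG_$opt is not enabled (or $cfg is unreadable)"
    fi
done
```
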
- Sedat -
View attachment "collectdebugfs-dileks.log" of type "text/x-log" (1015 bytes)
Download attachment "config-2.6.39-rc3-rcu-sedat.2011.04.19a+" of type "application/octet-stream" (87796 bytes)
Download attachment "build_linux-2.6-rcu.sh" of type "application/x-sh" (577 bytes)