Message-ID: <BANLkTimEWqBVdq8MWbjAAfXCLPQdiHkuNg@mail.gmail.com>
Date: Thu, 21 Apr 2011 14:49:37 +0200
From: Sedat Dilek <sedat.dilek@...glemail.com>
To: paulmck@...ux.vnet.ibm.com
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
linux-next@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: Tree for April 14 (Call-traces: RCU/ACPI/WQ related?)
On Thu, Apr 21, 2011 at 12:24 PM, Sedat Dilek
<sedat.dilek@...glemail.com> wrote:
> On Thu, Apr 21, 2011 at 11:07 AM, Sedat Dilek
> <sedat.dilek@...glemail.com> wrote:
>> On Thu, Apr 21, 2011 at 7:08 AM, Paul E. McKenney
>> <paulmck@...ux.vnet.ibm.com> wrote:
>>> On Thu, Apr 14, 2011 at 03:44:11PM -0700, Paul E. McKenney wrote:
>>>> On Fri, Apr 15, 2011 at 12:19:34AM +0200, Sedat Dilek wrote:
>>>> > On Thu, Apr 14, 2011 at 12:19 PM, Sedat Dilek
>>>> > <sedat.dilek@...glemail.com> wrote:
>>>> > > On Thu, Apr 14, 2011 at 11:16 AM, Sedat Dilek
>>>> > > <sedat.dilek@...glemail.com> wrote:
>>>> > >> [ Adding CC to RCU maintainer (Hi Paul :-)) ]
>>>> > >>
>>>> > >> This helps me for now (see also Documentation/RCU/stallwarn.txt):
>>>> > >>
>>>> > >> # cat /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>>> > >> 0
>>>> > >>
>>>> > >> # echo "1" > /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>>> > >>
>>>> > >> # cat /sys/module/rcutree/parameters/rcu_cpu_stall_suppress
>>>> > >> 1
>>>> > >>
>>>> > >> - Sedat -
>>>> > >>
>>>> > >
>>>> > > That workaround helped until the system froze while generating a
>>>> > > tarball from my current kernel tree.
>>>> > > I switched back to yesterday's linux-next kernel.
>>>> > >
>>>> > > - Sedat -
>>>> > >
>>>> >
>>>> > I isolated the culprit so far:
>>>> >
>>>> > commit 900507fc62d5ba0164c07878dbc36ac97866a858
>>>> > "rcu: move TREE_RCU from softirq to kthread"
>>>> >
>>>> > With this revert my system does not show the symptoms I have reported.
>>>>
>>>> Hmmm... I never was able to reproduce this, but did find a workload
>>>> that slowed up the grace periods. I fixed that (which turned out to
>>>> be a wakeup problem), but my hopes that it would also fix your problem
>>>> were clearly unfounded. I have once again stopped exporting this commit
>>>> to -next.
>>>
>>> I have added some debug tracing, which is available at branch
>>> "sedat.2011.04.19a" in the git repository at:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
>>>
>>> Alternatively, if it is easier, the patch shown below can be used. FWIW,
>>> this patch is against 2.6.39-rc3.
>>>
>>> Either way, if you get a chance to run your tests on this, could you
>>> please run the attached script (collectdebugfs.sh) and capture its output?
>>> Sample output is attached as well (collectdebugfs.sh.out): the script
>>> should output something vaguely like the sample output every 15 seconds
>>> or so.
>>>
>>> The script assumes that debugfs is enabled (along with CONFIG_RCU_TRACE=y)
>>> and mounted as follows:
>>>
>>> mount -t debugfs none /sys/kernel/debug/
>>>
>>> Or if you mount debugfs somewhere else, please set the script's DEBUGFS_MP
>>> variable accordingly.
>>>
>>> Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>
>> Welcome to operation "Kill that RCU brainbug" (Starship troopers part X)!
>>
>> Of course I can help with testing.
>>
>> Paul, did you see recent RCU-related fixes to fs between rc3 and rc4?
>>
>> commit c1530019e311c91d14b24d8e74d233152d806e45
>> vfs: Fix absolute RCU path walk failures due to uninitialized seq number
>>
>> fff3e5ade4455a4b42a19c95dd7a167a3cb7956a
>> fs: synchronize_rcu when unregister_filesystem success not failure
>>
>> IIRC, Jens has pending block/plugging patches in his for-linus tree.
>> Especially this one (CONFIG_PREEMPT):
>>
>> 5f45c69589b7d2953584e6cd0b31e35dbe960ad0
>> cfq-iosched: read_lock() does not always imply rcu_read_lock()
>>
>> Some questions about the test scenario:
>>
>> Shall I test from linux-2.6-rcu.git#sedat.2011.04.19a GIT tree?
>> I think that's the ideal solution.
>> Or shall I pull sedat.2011.04.19a GIT branch into "BROKEN" linux-next
>> (next-20110414)?
>>
>> Again, with which RCU/HZ/PREEMPT kernel-config options shall I test?
>> This is from my yesterday's linux-next:
>>
>> # egrep 'RCU|_HZ |PREEMPT' /boot/config-2.6.39-rc4-next20110420.4-686-small
>> # RCU Subsystem
>> CONFIG_TREE_RCU=y
>> # CONFIG_PREEMPT_RCU is not set
>> CONFIG_RCU_TRACE=y
>> CONFIG_RCU_FANOUT=32
>> # CONFIG_RCU_FANOUT_EXACT is not set
>> CONFIG_RCU_FAST_NO_HZ=y
>> CONFIG_TREE_RCU_TRACE=y
>> # CONFIG_PREEMPT_NONE is not set
>> CONFIG_PREEMPT_VOLUNTARY=y
>> # CONFIG_PREEMPT is not set
>> # CONFIG_SPARSE_RCU_POINTER is not set
>> CONFIG_RCU_TORTURE_TEST=m
>> CONFIG_RCU_CPU_STALL_TIMEOUT=60
>>
>> Regards,
>> - Sedat -
>>
>
> Looks like you want me to test with RCU_BOOST and RCU_TORTURE_TEST :-).
>
> Attached is collectdebugfs-dileks.log, my current kernel-config and a
> build-script to generate Debian packages.
>
> $ LANG=C ./collectdebugfs.sh 2>&1 | tee collectdebugfs-dileks.log
>
> I will do a 2nd run with PREEMPT_RCU enabled.
>
> - Sedat -
>
Here are the results from the 2nd run (PREEMPT_RCU enabled).
- Sedat -
View attachment "collectdebugfs-dileks_preempt-rcu.log" of type "text/x-log" (33753 bytes)
Download attachment "config-2.6.39-rc3-preempt-rcu-sedat.2011.04.19a+" of type "application/octet-stream" (87936 bytes)
Download attachment "build_linux-2.6-rcu_v2.sh" of type "application/x-sh" (740 bytes)