linux-kernel - Re: rcu_preempt self-detected stall on CPU from 4.4-rc4, since 3.17

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20160116003954.GH3818@linux.vnet.ibm.com>
Date:	Fri, 15 Jan 2016 16:39:54 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Ross Green <rgkernel@...il.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	jiangshanlai@...il.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
	rostedt@...dmis.org, dhowells@...hat.com,
	Eric Dumazet <edumazet@...gle.com>, dvhart@...ux.intel.com,
	Frédéric Weisbecker <fweisbec@...il.com>,
	oleg@...hat.com, pranith kumar <bobby.prani@...il.com>
Subject: Re: rcu_preempt self-detected stall on CPU from 4.4-rc4, since 3.17

On Thu, Jan 14, 2016 at 04:39:31PM +1100, Ross Green wrote:
> Just for completeness I have tested the kernel 4.4 and have seen stall
> warnings similar to the others reported.
> 
> I have included the dmesg dump with stack trace for peoples further
> understanding.

Thank you!  Interestingly enough, although the dump looks similar, I
don't see the usual message about the RCU grace-period kthread being
starved.  So this might be something different.

Could you please try current -rcu, booting with rcutree.rcu_kick_kthreads?

							Thanx, Paul

> Regards,
> 
> Ross
> 
> On Sat, Jan 9, 2016 at 5:35 AM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
> > On Thu, Jan 07, 2016 at 10:10:54AM +1100, Ross Green wrote:
> >> Just for completeness I have attached another rcu_preempt stall
> >> warning from a kinux-3.17.1 kernel.
> >>
> >> So it looks like these stall warning go back that far.
> >>
> >> It could be that the kernel instrumentation improved enough to detect
> >> this situation around the 3.17 kernel. I can't find any earlier
> >> records of these stall warnings before these kernels. I guess someone
> >> must have suspected there was a potential for these stalls and hence
> >> the detection facility.
> >
> > Thank you for testing this, Ross!  For whatever it is worth, the
> > stall-detection code was added early on, back in the 2.6 days.
> >
> > On the workaround end of things, I am having limited success by forcing
> > RCU's grace-period kthread off of outgoing CPUs at down-prepare time.
> > I really don't like this workaround because I would rather give the
> > sysadm full control of where these kthreads run, but workarounds are
> > workarounds...
> >
> > See 3263d1f49276 (rcu: Migrate grace-period kthread off of outgoing CPU)
> > for the current version in -rcu.
> >
> >                                                         Thanx, Paul
> >
> >> Regards,
> >>
> >> Ross
> >>
> >> On Tue, Jan 5, 2016 at 1:21 AM, Ross Green <rgkernel@...il.com> wrote:
> >> > Well with the release of 4.4-rc8 I have built and
> >> >  started testing the kernel.
> >> >
> >> > With some luck I managed to get a rcu_preempt stall within a few hours
> >> > of testing.
> >> >
> >> > Upon booting, I ran a small series of bench marks to make sure
> >> > everything is running as expected. limited regression testing and then
> >> > just left the system to idle away, with periodic monitoring from the
> >> > network.
> >> >
> >> >
> >> > please find attached two stack traces from linux-4.4-rc8 and also linux-4.4-rc7.
> >> >
> >> > The interesting thing with the rc7 trace is that there are multiple
> >> > stalls that have occurred over a 6 day period.
> >> >
> >> > I realise Paul you have a number of changes pending for the the RCU
> >> > code. It would be good to try and establish what is happening with
> >> > these stalls before the impact of those changes given that timings and
> >> > dynamics might change whats happening in the current environment.
> >> >
> >> > As reported earlier, i have never been able to induce these stalls
> >> > with heavy loading of the system. The only method I can be sure of is
> >> > to leave the system quiet and let time go by till a stall occurs.
> >> >
> >> > Regards,
> >> >
> >> > Ross
> >> >
> >> > On Mon, Jan 4, 2016 at 9:00 AM, Ross Green <rgkernel@...il.com> wrote:
> >> >> Thanks Paul for your analysis and investigation,
> >> >>
> >> >> I understand your patches are designed not to "fix" the problem, but
> >> >> more to move the problem ahead.
> >> >>
> >> >> I will include a few more stack traces from various kernels. I can go
> >> >> back to around 3.17, with similar trace results.
> >> >>
> >> >> My observation is that the problem can occur at various times and with
> >> >> out any "bad" effect other than more stalls could happen afterwards.
> >> >>
> >> >> At first I wondered if they might actually be a false positive as the
> >> >> kernel seemed to carry on and run quite happily.  It is rare that I
> >> >> find a kernel just locks up after observing such a stall, or a
> >> >> complete kernel splat!
> >> >>
> >> >> Unfortunately with my testing I have never been able to induce the
> >> >> problem under any heavy load that would immediately trigger the
> >> >> problem. Indeed most heavy cpu utilisation seemed to just sail on
> >> >> quite nicely.
> >> >>
> >> >> The time for a fault, seems so far, to be non-deterministic with
> >> >> quiescent systems taking anywhere from a few hours through to some six
> >> >> days before showing the problem.
> >> >>
> >> >> More recent kernels seem to have richer stall and back trace
> >> >> information so I was hoping that might shed some light on how they
> >> >> might be occurring..
> >> >>
> >> >> I usually only run a kernel till I get a new -rc release to test with,
> >> >> so a run of 1 week is a typical cycle.
> >> >>
> >> >> I just wish I could find a sure fire method to trigger the problem!!
> >> >>
> >> >> I have included a few more traces of various kernels all showing the problem.
> >> >>
> >> >> Regards,
> >> >>
> >> >> Ross
> >> >>
> >> >> On Mon, Jan 4, 2016 at 5:15 AM, Paul E. McKenney
> >> >> <paulmck@...ux.vnet.ibm.com> wrote:
> >> >>> On Sun, Jan 03, 2016 at 07:27:17PM +1100, Ross Green wrote:
> >> >>>> I would not describe the load on this test machine as high or real time.
> >> >>>>
> >> >>>> Apart from a number of standard daemons not much more is running at all!
> >> >>>>
> >> >>>> I normally build a release kernel as soon as possible and set it running.
> >> >>>> Typically I run a series of benchmarks to confirm most things appear
> >> >>>> to be working and then just leave it running. During a normal day i
> >> >>>> will check on the machine 4/5 times just to see how its going!
> >> >>>> Typically I will logon remotely via wifi network connection.
> >> >>>>
> >> >>>> just for your information i will include a few other stack traces from
> >> >>>> previous kernels that may show some trend!
> >> >>>>
> >> >>>>
> >> >>>> Please see the attachments.
> >> >>>
> >> >>> Thank you for the additional details.  This does look similar to some
> >> >>> problems I am seeing, though only in heavy rcutorture workloads with
> >> >>> CPU hotplugging.
> >> >>>
> >> >>> I have some crude workarounds in progress, see for example
> >> >>> 2da26818e515 (rcu: Awaken grace-period kthread when stalled) at
> >> >>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git.
> >> >>> This workaround kicks the RCU grace-period kthread on every stall warning.
> >> >>> In my testing, this workaround results in slow but real forward progress.
> >> >>>
> >> >>> I have a better workaround in progress, however, please note:
> >> >>>
> >> >>> 1.      I have no intention of sending these workarounds upstream.
> >> >>>
> >> >>> 2.      The workarounds will splat when they take effect.  In other words,
> >> >>>         the idea is not to paper over the problem, but instead to allow
> >> >>>         me to separate testing concerns.
> >> >>>
> >> >>> 3.      A fix is needed for the underlying bug, wherever it might be.
> >> >>>
> >> >>>                                                         Thanx, Paul
> >> >>>
> >> >>>> Regards,
> >> >>>>
> >> >>>> Ross
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> On Sun, Jan 3, 2016 at 5:17 PM, Paul E. McKenney
> >> >>>> <paulmck@...ux.vnet.ibm.com> wrote:
> >> >>>> > On Sun, Jan 03, 2016 at 04:29:11PM +1100, Ross Green wrote:
> >> >>>> >> Still seeing these rcu_preempt stalls on kernels through to 4.4-rc7
> >> >>>> >>
> >> >>>> >> Still have not found a sure fire method to evoke this stall, but have
> >> >>>> >> found that it will normally occur within a week of running a kernel -
> >> >>>> >> usually when it is quiet with light load.
> >> >>>> >>
> >> >>>> >> Have seen similar self detected stalls all the way back to 3.17.
> >> >>>> >> Most recent kernels have included 4.4-rc5 4.4-rc6 and 4.4-rc7
> >> >>>> >>
> >> >>>> >> Regards,
> >> >>>> >>
> >> >>>> >> Ross
> >> >>>> >>
> >> >>>> >> On Fri, Dec 11, 2015 at 10:17 PM, Ross Green <rgkernel@...il.com> wrote:
> >> >>>> >> > I have been getting these stalls in kernels going back to 3.17.
> >> >>>> >> >
> >> >>>> >> > This stall occurs usually under light load but often requires several
> >> >>>> >> > days to show itself. I have not found any simple way to trigger the
> >> >>>> >> > stall. Indeed heavy workloads seems not to show the fault.
> >> >>>> >> >
> >> >>>> >> > Anyone have any thoughts here?
> >> >>>> >> >
> >> >>>> >> > The recent patch by peterz with kernel/sched/wait.c I thought might
> >> >>>> >> > help the situation, but alas after a few days of running 4.4-rc4 the
> >> >>>> >> > following turned up.
> >> >>>> >> >
> >> >>>> >> > [179922.003570] INFO: rcu_preempt self-detected stall on CPU
> >> >>>> >> > [179922.008178] INFO: rcu_preempt detected stalls on CPUs/tasks:
> >> >>>> >> > [179922.008178]         0-...: (1 ticks this GP) idle=a91/1/0
> >> >>>> >
> >> >>>> > CPU 0 is non-idle from an RCU perspective.
> >> >>>> >
> >> >>>> >> > softirq=1296733/1296733 fqs=0
> >> >>>> >> > [179922.008178]
> >> >>>> >> > [179922.008209] (detected by 1, t=8775 jiffies, g=576439, c=576438, q=102)
> >> >>>> >> > [179922.008209] Task dump for CPU 0:
> >> >>>> >> > [179922.008209] swapper/0       R [179922.008209]  running [179922.008209]     0     0      0 0x00000000
> >> >>>> >> > [179922.008209] Backtrace:
> >> >>>> >> >
> >> >>>> >> > [179922.008239] Backtrace aborted due to bad frame pointer <c0907f54>
> >> >>>> >
> >> >>>> > Can't have everything, I guess...
> >> >>>> >
> >> >>>> >> > [179922.008239] rcu_preempt kthread starved for 8775 jiffies! g576439 c576438 f0x0 s3 ->state=0x1
> >> >>>> >
> >> >>>> > Something is keeping the rcu_preempt grace-period kthread from
> >> >>>> > running.  This far into the grace period, it should have a
> >> >>>> > timer event waking it every few jiffies.  It is currently
> >> >>>> > in TASK_INTERRUPTIBLE state.
> >> >>>> >
> >> >>>> >> > [179922.060302]         0-...: (1 ticks this GP) idle=a91/1/0 softirq=1296733/1296733 fqs=0
> >> >>>> >> > [179922.068023]          (t=8775 jiffies g=576439 c=576438 q=102)
> >> >>>> >> > [179922.073913] rcu_preempt kthread starved for 8775 jiffies! g576439 c576438 f0x2 s3 ->state=0x100
> >> >>>> >
> >> >>>> > Same story, same grace period, pretty much same time.  Now there is an FQS
> >> >>>> > request (f0x2) and the state is now TASK_WAKING (->state=0x100 == 256).
> >> >>>> >
> >> >>>> >> > [179922.083587] Task dump for CPU 0:
> >> >>>> >> > [179922.087097] swapper/0       R running      0     0      0 0x00000000
> >> >>>> >> > [179922.093292] Backtrace:
> >> >>>> >> > [179922.096313] [<c0013ea8>] (dump_backtrace) from [<c00140a4>] (show_stack+0x18/0x1c)
> >> >>>> >> > [179922.104675]  r7:c0908514 r6:80080193 r5:00000000 r4:c090aca8
> >> >>>> >> > [179922.110809] [<c001408c>] (show_stack) from [<c005a858>] (sched_show_task+0xbc/0x110)
> >> >>>> >> > [179922.119049] [<c005a79c>] (sched_show_task) from [<c005ccd4>] (dump_cpu_task+0x40/0x44)
> >> >>>> >> > [179922.127624]  r5:c0917680 r4:00000000
> >> >>>> >> > [179922.131042] [<c005cc94>] (dump_cpu_task) from [<c0082268>] (rcu_dump_cpu_stacks+0x9c/0xdc)
> >> >>>> >> > [179922.140350]  r5:c0917680 r4:00000001
> >> >>>> >> > [179922.143157] [<c00821cc>] (rcu_dump_cpu_stacks) from [<c008637c>] (rcu_check_callbacks+0x504/0x8e4)
> >> >>>> >> > [179922.153808]  r9:c0908514 r8:c0917680 r7:00000066 r6:2eeab000
> >> >>>> >> > r5:c0904300 r4:ef7af300
> >> >>>> >> > [179922.161499] [<c0085e78>] (rcu_check_callbacks) from [<c00895d0>] (update_process_times+0x40/0x6c)
> >> >>>> >> > [179922.170898]  r10:c009a584 r9:00000001 r8:ef7abc4c r7:0000a3a3
> >> >>>> >> > r6:4ec3391c r5:00000000
> >> >>>> >> > [179922.179901]  r4:c090aca8
> >> >>>> >> > [179922.182708] [<c0089590>] (update_process_times) from [<c009a580>]
> >> >>>> >> > (tick_sched_handle+0x50/0x54)
> >> >>>> >> > [179922.192108]  r5:c0907f10 r4:ef7abe40
> >> >>>> >> > [179922.195983] [<c009a530>] (tick_sched_handle) from [<c009a5d4>]
> >> >>>> >> > (tick_sched_timer+0x50/0x94)
> >> >>>> >> > [179922.204895] [<c009a584>] (tick_sched_timer) from [<c0089fe4>]
> >> >>>> >> > (__hrtimer_run_queues+0x110/0x1a0)
> >> >>>> >> > [179922.214324]  r7:00000000 r6:ef7abc40 r5:ef7abe40 r4:ef7abc00
> >> >>>> >> > [179922.220428] [<c0089ed4>] (__hrtimer_run_queues) from [<c008a674>]
> >> >>>> >> > (hrtimer_interrupt+0xac/0x1f8)
> >> >>>> >> > [179922.227111]  r10:ef7abc78 r9:ef7abc98 r8:ef7abc14 r7:ef7abcb8
> >> >>>> >> > r6:ffffffff r5:00000003
> >> >>>> >> > [179922.238220]  r4:ef7abc00
> >> >>>> >> > [179922.238220] [<c008a5c8>] (hrtimer_interrupt) from [<c00170ec>]
> >> >>>> >> > (twd_handler+0x38/0x48)
> >> >>>> >> > [179922.238220]  r10:c09084e8 r9:fa241100 r8:00000011 r7:ef028780
> >> >>>> >> > r6:c092574c r5:ef005cc0
> >> >>>> >
> >> >>>> > All interrupt stack up to this point.
> >> >>>> >
> >> >>>> > It is quite possible that the stuff below here is at fault as well.
> >> >>>> > That said, the grace-period should actually get to execute at some
> >> >>>> > point.  Do you have a heavy real-time load that might be starving
> >> >>>> > things?
> >> >>>> >
> >> >>>> >                                                         Thanx, Paul
> >> >>>> >
> >> >>>> >> > [179922.257110]  r4:00000001
> >> >>>> >> > [179922.257110] [<c00170b4>] (twd_handler) from [<c007c8f8>] (handle_percpu_devid_irq+0x74/0x8c)
> >> >>>> >> > [179922.269683]  r5:ef005cc0 r4:ef7b1740
> >> >>>> >> > [179922.269683] [<c007c884>] (handle_percpu_devid_irq) from [<c0078454>] (generic_handle_irq+0x2c/0x3c)
> >> >>>> >> > [179922.283233]  r9:fa241100 r8:ef008000 r7:00000001 r6:00000000
> >> >>>> >> > r5:00000000 r4:c09013e8
> >> >>>> >> > [179922.290985] [<c0078428>] (generic_handle_irq) from [<c007872c>] (__handle_domain_irq+0x64/0xbc)
> >> >>>> >> > [179922.300842] [<c00786c8>] (__handle_domain_irq) from [<c00094c0>]
> >> >>>> >> > (gic_handle_irq+0x50/0x90)
> >> >>>> >> > [179922.303222]  r9:fa241100 r8:fa240100 r7:c0907f10 r6:fa24010c
> >> >>>> >> > r5:c09087a8 r4:c0925748
> >> >>>> >> > [179922.315216] [<c0009470>] (gic_handle_irq) from [<c0014bd4>]
> >> >>>> >> > (__irq_svc+0x54/0x90)
> >> >>>> >> > [179922.319000] Exception stack(0xc0907f10 to 0xc0907f58)
> >> >>>> >> > [179922.331542] 7f00:                                     00000000
> >> >>>> >> > ef7ab390 fe600000 00000000
> >> >>>> >> > [179922.331542] 7f20: c0906000 c090849c c0900364 c06a8124 c0907f80
> >> >>>> >> > c0944563 c09084e8 c0907f6c
> >> >>>> >> > [179922.349029] 7f40: c0907f4c c0907f60 c00287ac c0010ba8 60080113 ffffffff
> >> >>>> >> > [179922.349029]  r9:c0944563 r8:c0907f80 r7:c0907f44 r6:ffffffff
> >> >>>> >> > r5:60080113 r4:c0010ba8
> >> >>>> >> > [179922.357116] [<c0010b80>] (arch_cpu_idle) from [<c006f034>]
> >> >>>> >> > (default_idle_call+0x28/0x34)
> >> >>>> >> > [179922.368926] [<c006f00c>] (default_idle_call) from [<c006f154>]
> >> >>>> >> > (cpu_startup_entry+0x114/0x18c)
> >> >>>> >> > [179922.368926] [<c006f040>] (cpu_startup_entry) from [<c069fc6c>]
> >> >>>> >> > (rest_init+0x90/0x94)
> >> >>>> >> > [179922.385284]  r7:ffffffff r4:00000002
> >> >>>> >> > [179922.393463] [<c069fbdc>] (rest_init) from [<c08bbcec>]
> >> >>>> >> > (start_kernel+0x370/0x37c)
> >> >>>> >> > [179922.400421]  r5:c0947000 r4:00000000
> >> >>>> >> > [179922.400421] [<c08bb97c>] (start_kernel) from [<8000807c>] (0x8000807c)
> >> >>>> >> > $
> >> >>>> >
> >> >>>> >
> >> >>>
> >> >>>
> >> >>>
> >
> >