Date:	Mon, 16 May 2016 05:03:29 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	"santosh.shilimkar@...cle.com" <santosh.shilimkar@...cle.com>
Cc:	linux-kernel@...r.kernel.org, Sasha Levin <sasha.levin@...cle.com>
Subject: Re: [rcu_sched stall] regression/miss-config ?

On Sun, May 15, 2016 at 09:35:40PM -0700, santosh.shilimkar@...cle.com wrote:
> On 5/15/16 2:18 PM, Santosh Shilimkar wrote:
> >Hi Paul,
> >
> >I was asking Sasha about [1] since other folks at Oracle
> >have also stumbled upon similar RCU stalls with the v4.1
> >kernel in different workloads. A similar issue with RDS
> >was reported to me as well, and looking at [1], [2], [3]
> >and [4], I thought of reaching out to see if you can help
> >us understand this issue better.
> >
> >I have also included the RCU-specific config used in these
> >test(s). It's very hard to reproduce the issue, but one
> >data point is that it reproduces on systems with larger
> >CPU counts (64+). The same workload with fewer than 64
> >CPUs doesn't show the issue. Someone also told me that
> >using the SLAB allocator instead of SLUB makes a
> >difference, but I haven't verified that part for RDS.
> >
> >Let me know your thoughts. Thanks in advance !!
> >
> One of my colleagues told me the pastebin server I used
> is Oracle-internal only, so I am adding the relevant logs
> to this email.
> 
> Regards,
> Santosh
> >
> >[1] https://lkml.org/lkml/2014/12/14/304
> 
> 
> [2]  Log 1 snippet:
> -----------------------------------------------------------------
>  INFO: rcu_sched self-detected stall on CPU
>  INFO: rcu_sched self-detected stall on CPU { 54}  (t=60000 jiffies
> g=66023 c=66022 q=0)
>  Task dump for CPU 54:
>  ksoftirqd/54    R  running task        0   389      2 0x00000008
>   0000000000000007 ffff88ff7f403d38 ffffffff810a8621 0000000000000036
>   ffffffff81ab6540 ffff88ff7f403d58 ffffffff810a86cf 0000000000000086
>   ffffffff81ab6940 ffff88ff7f403d88 ffffffff810e3ad3 ffffffff81ab6540
>  Call Trace:
>   <IRQ>  [<ffffffff810a8621>] sched_show_task+0xb1/0x120
>   [<ffffffff810a86cf>] dump_cpu_task+0x3f/0x50
>   [<ffffffff810e3ad3>] rcu_dump_cpu_stacks+0x83/0xc0
>   [<ffffffff810e490c>] print_cpu_stall+0xfc/0x170
>   [<ffffffff810e5eeb>] __rcu_pending+0x2bb/0x2c0
>   [<ffffffff810e5f8d>] rcu_check_callbacks+0x9d/0x170
>   [<ffffffff810e9772>] update_process_times+0x42/0x70
>   [<ffffffff810fb589>] tick_sched_handle+0x39/0x80
>   [<ffffffff810fb824>] tick_sched_timer+0x44/0x80
>   [<ffffffff810ebc04>] __run_hrtimer+0x74/0x1d0
>   [<ffffffff810fb7e0>] ? tick_nohz_handler+0xa0/0xa0
>   [<ffffffff810ebf92>] hrtimer_interrupt+0x102/0x240
>   [<ffffffff810521f9>] local_apic_timer_interrupt+0x39/0x60
>   [<ffffffff816c47b5>] smp_apic_timer_interrupt+0x45/0x59
>   [<ffffffff816c263e>] apic_timer_interrupt+0x6e/0x80
>   <EOI>  [<ffffffff8118db64>] ? free_one_page+0x164/0x380
>   [<ffffffff8118de43>] ? __free_pages_ok+0xc3/0xe0
>   [<ffffffff8118e775>] __free_pages+0x25/0x40
>   [<ffffffffa21054f0>] rds_message_purge+0x60/0x150 [rds]
>   [<ffffffffa2105624>] rds_message_put+0x44/0x80 [rds]
>   [<ffffffffa21535b4>] rds_ib_send_cqe_handler+0x134/0x2d0 [rds_rdma]
>   [<ffffffff816c102b>] ? _raw_spin_unlock_irqrestore+0x1b/0x50
>   [<ffffffffa18c0273>] ? mlx4_ib_poll_cq+0xb3/0x2a0 [mlx4_ib]
>   [<ffffffffa214c6f1>] poll_cq+0xa1/0xe0 [rds_rdma]
>   [<ffffffffa214d489>] rds_ib_tasklet_fn_send+0x79/0xf0 [rds_rdma]

The most likely possibility is that there is a 60-second-long loop in
one of the above functions.  This is within bottom-half execution, so
unfortunately the usual trick of placing cond_resched_rcu_qs() within
this loop (but outside of any RCU read-side critical section) does not
work.

Therefore, if there really is a loop here, one fix would be to
periodically unwind back out to run_ksoftirqd(), while setting things
up so that the work can be continued later.  Another fix might be to
move this from tasklet context to workqueue context, where
cond_resched_rcu_qs()
can be used -- however, this looks a bit like networking code, which
does not always take kindly to being run in process context (though
careful use of local_bh_disable() and local_bh_enable() can sometimes
overcome this issue).  A third fix, which works only if this code does
not use RCU and does not invoke any code that does use RCU, is to tell
RCU that it should ignore this code (which will require a little work
on RCU, as it currently does not tolerate this sort of thing aside from
the idle threads).  In this last approach, event-tracing calls must use
the _nonidle suffix.

I am not familiar with the RDS code, so I cannot be more specific.
That said, given the copyrights on those files, I bet you have a
colleague or two who understands the code quite deeply.  ;-)

							Thanx, Paul

>   [<ffffffff81084b31>] tasklet_action+0xb1/0xc0
>   [<ffffffff810854a2>] __do_softirq+0xf2/0x2f0
>   [<ffffffff810a4930>] ? smpboot_create_threads+0x80/0x80
>   [<ffffffff810856cd>] run_ksoftirqd+0x2d/0x50
>   [<ffffffff810a4a46>] smpboot_thread_fn+0x116/0x170
>   [<ffffffff810a11ae>] kthread+0xce/0xf0
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>   [<ffffffff816c1b62>] ret_from_fork+0x42/0x70
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>  Task dump for CPU 66:
>  ksoftirqd/66    R  running task        0   474      2 0x00000008
>   04208040664afe58 ffff88ff664ac008 000000010005d97a 0000000a39b2e200
>   0000000000000018 0000000000000006 ffff88ff664a1c00 ffff88ff669a5260
>   ffff88ff664a1c00 ffffffff81aa4ec0 ffffffff810a4930 0000000000000000
>  Call Trace:
>   [<ffffffff810a4930>] ? smpboot_create_threads+0x80/0x80
>   [<ffffffff810856cd>] ? run_ksoftirqd+0x2d/0x50
>   [<ffffffff810a4a46>] ? smpboot_thread_fn+0x116/0x170
>   [<ffffffff810a11ae>] ? kthread+0xce/0xf0
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>   [<ffffffff816c1b62>] ? ret_from_fork+0x42/0x70
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>   { 66}  (t=60011 jiffies g=66023 c=66022 q=0)
>  Task dump for CPU 54:
>  ksoftirqd/54    R  running task        0   389      2 0x00000008
>   04208040669e3e58 ffff88ff669e0008 000000010005d981 0000000a39b28e00
>   0000000000000000 0000000000000006 ffff88ff669d1c00 ffff88ff669a5020
>   ffff88ff669d1c00 ffffffff81aa4ec0 ffffffff810a4930 0000000000000000
>  Call Trace:
>   [<ffffffff810a4930>] ? smpboot_create_threads+0x80/0x80
>   [<ffffffff810856cd>] ? run_ksoftirqd+0x2d/0x50
>   [<ffffffff810a4a46>] ? smpboot_thread_fn+0x116/0x170
>   [<ffffffff810a11ae>] ? kthread+0xce/0xf0
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>   [<ffffffff816c1b62>] ? ret_from_fork+0x42/0x70
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>  Task dump for CPU 66:
>  ksoftirqd/66    R  running task        0   474      2 0x00000008
>   0000000000000003 ffff88ff7f703d38 ffffffff810a8621 0000000000000042
>   ffffffff81ab6540 ffff88ff7f703d58 ffffffff810a86cf 0000000000000086
>   ffffffff81ab6a40 ffff88ff7f703d88 ffffffff810e3ad3 ffffffff81ab6540
>  Call Trace:
>   <IRQ>  [<ffffffff810a8621>] sched_show_task+0xb1/0x120
>   [<ffffffff810a86cf>] dump_cpu_task+0x3f/0x50
>   [<ffffffff810e3ad3>] rcu_dump_cpu_stacks+0x83/0xc0
>   [<ffffffff810e490c>] print_cpu_stall+0xfc/0x170
>   [<ffffffff810e5eeb>] __rcu_pending+0x2bb/0x2c0
>   [<ffffffff810e5f8d>] rcu_check_callbacks+0x9d/0x170
>   [<ffffffff810e9772>] update_process_times+0x42/0x70
>   [<ffffffff810fb589>] tick_sched_handle+0x39/0x80
>   [<ffffffff810fb824>] tick_sched_timer+0x44/0x80
>   [<ffffffff810ebc04>] __run_hrtimer+0x74/0x1d0
>   [<ffffffff810fb7e0>] ? tick_nohz_handler+0xa0/0xa0
>   [<ffffffff810ebf92>] hrtimer_interrupt+0x102/0x240
>   [<ffffffff810521f9>] local_apic_timer_interrupt+0x39/0x60
>   [<ffffffff816c47b5>] smp_apic_timer_interrupt+0x45/0x59
>   [<ffffffff816c263e>] apic_timer_interrupt+0x6e/0x80
>   <EOI>  [<ffffffff8118db64>] ? free_one_page+0x164/0x380
>   [<ffffffff8118de43>] ? __free_pages_ok+0xc3/0xe0
>   [<ffffffff8118e775>] __free_pages+0x25/0x40
>   [<ffffffffa21054f0>] rds_message_purge+0x60/0x150 [rds]
>   [<ffffffffa2105624>] rds_message_put+0x44/0x80 [rds]
>   [<ffffffffa21535b4>] rds_ib_send_cqe_handler+0x134/0x2d0 [rds_rdma]
>   [<ffffffff816c102b>] ? _raw_spin_unlock_irqrestore+0x1b/0x50
>   [<ffffffffa18c0273>] ? mlx4_ib_poll_cq+0xb3/0x2a0 [mlx4_ib]
>   [<ffffffffa214c6f1>] poll_cq+0xa1/0xe0 [rds_rdma]
>   [<ffffffffa214d489>] rds_ib_tasklet_fn_send+0x79/0xf0 [rds_rdma]
>   [<ffffffff81084b31>] tasklet_action+0xb1/0xc0
>   [<ffffffff810854a2>] __do_softirq+0xf2/0x2f0
>   [<ffffffff810a4930>] ? smpboot_create_threads+0x80/0x80
>   [<ffffffff810856cd>] run_ksoftirqd+0x2d/0x50
>   [<ffffffff810a4a46>] smpboot_thread_fn+0x116/0x170
>   [<ffffffff810a11ae>] kthread+0xce/0xf0
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
>   [<ffffffff816c1b62>] ret_from_fork+0x42/0x70
>   [<ffffffff810a10e0>] ? kthread_freezable_should_stop+0x70/0x70
> -----------------------------------------------------------------
> 
> [3] Log 2 snippet
> -------------------------------------------------------------------
>  INFO: rcu_sched self-detected stall on CPU
>  INFO: rcu_sched detected stalls on CPUs/tasks: { 77} (detected by
> 207, t=60002 jiffies, g=40822, c=40821, q=0)
>  Task dump for CPU 77:
>  rds-stress      R  running task        0  2930   2867 0x0000008a
>   0000000000000018 0000000000000286 ffff893e00000000 ffffffff810ec828
>   ffff883f5628d160 0000000000000282 ffff89407ffda440 00000000f4d139b8
>   0000000100000000 ffff89407ffdae08 0000000000000000 000000004dcc9f40
>  Call Trace:
>   [<ffffffff810ec828>] ? hrtimer_try_to_cancel+0x48/0xe0
>   [<ffffffff8118d7b3>] ? __alloc_pages_nodemask+0x193/0x2b0
>   [<ffffffff810c52ac>] ? remove_wait_queue+0x3c/0x50
>   [<ffffffff811d747f>] ? alloc_pages_current+0xaf/0x170
>   [<ffffffffa2135298>] ? rds_page_remainder_alloc+0x48/0x2a0 [rds]
>   [<ffffffffa212f06d>] ? rds_message_copy_from_user+0x6d/0x130 [rds]
>   [<ffffffffa2133040>] ? rds_sendmsg+0x470/0x990 [rds]
>   [<ffffffff811ffa00>] ? rw_copy_check_uvector+0xa0/0x130
>   [<ffffffff815c9ffd>] ? sock_sendmsg+0x4d/0x60
>   [<ffffffff815cc93a>] ? ___sys_sendmsg+0x30a/0x330
>   [<ffffffff811b77fc>] ? handle_pte_fault+0x20c/0x230
>   [<ffffffff815ccb69>] ? __sys_sendmsg+0x49/0x90
>   [<ffffffff81023611>] ? syscall_trace_leave+0xf1/0x140
>   [<ffffffff815ccbc9>] ? SyS_sendmsg+0x19/0x20
>   [<ffffffff816c176e>] ? system_call_fastpath+0x12/0x71
>   { 77}  (t=60011 jiffies g=40822 c=40821 q=0)
>  Task dump for CPU 77:
>  rds-stress      R  running task        0  2930   2867 0x0000008a
>   000000000000000e ffff893f7f5439c0 ffffffff810a8621 000000000000004d
>   ffffffff81ab6540 ffff893f7f5439e0 ffffffff810a86cf 0000000000000082
>   ffffffff81ab6a40 ffff893f7f543a10 ffffffff810e3ad3 ffffffff81ab6540
>  Call Trace:
>   <IRQ>  [<ffffffff810a8621>] sched_show_task+0xb1/0x120
>   [<ffffffff810a86cf>] dump_cpu_task+0x3f/0x50
>   [<ffffffff810e3ad3>] rcu_dump_cpu_stacks+0x83/0xc0
>   [<ffffffff810e490c>] print_cpu_stall+0xfc/0x170
>   [<ffffffff810e5eeb>] __rcu_pending+0x2bb/0x2c0
>   [<ffffffff810e5f8d>] rcu_check_callbacks+0x9d/0x170
>   [<ffffffff810e9772>] update_process_times+0x42/0x70
>   [<ffffffff810fb589>] tick_sched_handle+0x39/0x80
>   [<ffffffff810fb824>] tick_sched_timer+0x44/0x80
>   [<ffffffff810ebc04>] __run_hrtimer+0x74/0x1d0
>   [<ffffffff810fb7e0>] ? tick_nohz_handler+0xa0/0xa0
>   [<ffffffff811df679>] ? __free_slab+0xd9/0x1b0
>   [<ffffffff810ebf92>] hrtimer_interrupt+0x102/0x240
>   [<ffffffff810521f9>] local_apic_timer_interrupt+0x39/0x60
>   [<ffffffff816c47b5>] smp_apic_timer_interrupt+0x45/0x59
>   [<ffffffff816c263e>] apic_timer_interrupt+0x6e/0x80
>   [<ffffffff8118db64>] ? free_one_page+0x164/0x380
>   [<ffffffff8118de43>] ? __free_pages_ok+0xc3/0xe0
>   [<ffffffff8118e775>] __free_pages+0x25/0x40
>   [<ffffffffa212f4f0>] rds_message_purge+0x60/0x150 [rds]
>   [<ffffffffa212f624>] rds_message_put+0x44/0x80 [rds]
>   [<ffffffffa217d5b4>] rds_ib_send_cqe_handler+0x134/0x2d0 [rds_rdma]
>   [<ffffffff816c102b>] ? _raw_spin_unlock_irqrestore+0x1b/0x50
>   [<ffffffffa18fe273>] ? mlx4_ib_poll_cq+0xb3/0x2a0 [mlx4_ib]
>   [<ffffffffa21766f1>] poll_cq+0xa1/0xe0 [rds_rdma]
>   [<ffffffffa2177489>] rds_ib_tasklet_fn_send+0x79/0xf0 [rds_rdma]
>   [<ffffffff81084b31>] tasklet_action+0xb1/0xc0
>   [<ffffffff810854a2>] __do_softirq+0xf2/0x2f0
>   [<ffffffff810594f4>] ? native_apic_msr_eoi_write+0x14/0x20
>   [<ffffffff81085815>] irq_exit+0x125/0x130
>   [<ffffffff816c45f5>] do_IRQ+0x65/0x110
>   [<ffffffff816c22ae>] common_interrupt+0x6e/0x6e
>   <EOI>  [<ffffffff8118bf28>] ? get_page_from_freelist+0x2b8/0x750
>   [<ffffffff810ec828>] ? hrtimer_try_to_cancel+0x48/0xe0
>   [<ffffffff8118d7b3>] __alloc_pages_nodemask+0x193/0x2b0
>   [<ffffffff810c52ac>] ? remove_wait_queue+0x3c/0x50
>   [<ffffffff811d747f>] alloc_pages_current+0xaf/0x170
>   [<ffffffffa2135298>] rds_page_remainder_alloc+0x48/0x2a0 [rds]
>   [<ffffffffa212f06d>] rds_message_copy_from_user+0x6d/0x130 [rds]
>   [<ffffffffa2133040>] rds_sendmsg+0x470/0x990 [rds]
>   [<ffffffff811ffa00>] ? rw_copy_check_uvector+0xa0/0x130
>   [<ffffffff815c9ffd>] sock_sendmsg+0x4d/0x60
>   [<ffffffff815cc93a>] ___sys_sendmsg+0x30a/0x330
>   [<ffffffff811b77fc>] ? handle_pte_fault+0x20c/0x230
>   [<ffffffff815ccb69>] __sys_sendmsg+0x49/0x90
>   [<ffffffff81023611>] ? syscall_trace_leave+0xf1/0x140
>   [<ffffffff815ccbc9>] SyS_sendmsg+0x19/0x20
>   [<ffffffff816c176e>] system_call_fastpath+0x12/0x71
> ------------------------------------------------------
> 
> [4] Log 3:
> ------------------------------------------------------
>  INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by
> 0, t=240007 jiffies, g=449043, c=449042, q=0)
>  All QSes seen, last rcu_sched kthread activity 240007
> (4299336825-4299096818), jiffies_till_next_fqs=3, root ->qsmask 0x0
>  ora_lms0_orcltw R  running task        0 22303      1
> 0x00000080
>   ffff8800c4651c00 00000000e7110062 ffff88010e803c18
> ffffffff810b038f
>   ffff88010e8185c0 ffffffff81b22c40 ffff88010e803c98
> ffffffff810e8f96
>   0000000000000000 ffffffff810b4deb ffff88010e817800
> ffff88010e8185c0
>  Call Trace:
>   <IRQ>  [<ffffffff810b038f>] sched_show_task+0xaf/0x120
>   [<ffffffff810e8f96>] rcu_check_callbacks+0x7e6/0x7f0
>   [<ffffffff810b4deb>] ? account_user_time+0x8b/0xa0
>   [<ffffffff810ee582>] update_process_times+0x42/0x70
>   [<ffffffff810fea65>] tick_sched_handle.isra.18+0x25/0x60
>   [<ffffffff810feae4>] tick_sched_timer+0x44/0x80
>   [<ffffffff810ef307>] __run_hrtimer+0x77/0x1d0
>   [<ffffffff810feaa0>] ? tick_sched_handle.isra.18+0x60/0x60
>   [<ffffffff810ef6e3>] hrtimer_interrupt+0x103/0x230
>   [<ffffffff8100a64d>] xen_timer_interrupt+0x3d/0x170
>   [<ffffffff814469a9>] ? add_interrupt_randomness+0x49/0x200
>   [<ffffffff810dd1ae>] handle_irq_event_percpu+0x3e/0x1a0
>   [<ffffffff810e0cfd>] handle_percpu_irq+0x3d/0x60
>   [<ffffffff810dc7ab>] generic_handle_irq+0x2b/0x40
>   [<ffffffff813fe91f>] evtchn_2l_handle_events+0x26f/0x280
>   [<ffffffff81085bba>] ? __do_softirq+0x18a/0x2d0
>   [<ffffffff813fbe8f>] __xen_evtchn_do_upcall+0x4f/0x90
>   [<ffffffff813fddf4>] xen_evtchn_do_upcall+0x34/0x50
>   [<ffffffff8170f35e>] xen_hvm_callback_vector+0x6e/0x80
>   <EOI>
>  rcu_sched kthread starved for 240007 jiffies!
>  INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by
> 0, t=420012 jiffies, g=449043, c=449042, q=0)
>  All QSes seen, last rcu_sched kthread activity 420012
> (4299516830-4299096818), jiffies_till_next_fqs=3, root ->qsmask 0x0
>  ora_lms0_orcltw R  running task        0 22303      1
> 0x00000080
>   ffff8800c4651c00 00000000e7110062 ffff88010e803c18
> ffffffff810b038f
>   ffff88010e8185c0 ffffffff81b22c40 ffff88010e803c98
> ffffffff810e8f96
>   0000000000000000 ffff88010e817800 0000000000017800
> ffff88010e8185c0
>  Call Trace:
>   <IRQ>  [<ffffffff810b038f>] sched_show_task+0xaf/0x120
>   [<ffffffff810e8f96>] rcu_check_callbacks+0x7e6/0x7f0
>   [<ffffffff810ee582>] update_process_times+0x42/0x70
>   [<ffffffff810fea65>] tick_sched_handle.isra.18+0x25/0x60
>   [<ffffffff810feae4>] tick_sched_timer+0x44/0x80
>   [<ffffffff810ef307>] __run_hrtimer+0x77/0x1d0
>   [<ffffffff810feaa0>] ? tick_sched_handle.isra.18+0x60/0x60
>   [<ffffffff810ef6e3>] hrtimer_interrupt+0x103/0x230
>   [<ffffffff8100a64d>] xen_timer_interrupt+0x3d/0x170
>   [<ffffffff814469a9>] ? add_interrupt_randomness+0x49/0x200
>   [<ffffffff810dd1ae>] handle_irq_event_percpu+0x3e/0x1a0
>   [<ffffffff810e0cfd>] handle_percpu_irq+0x3d/0x60
>   [<ffffffff810dc7ab>] generic_handle_irq+0x2b/0x40
>   [<ffffffff813fe91f>] evtchn_2l_handle_events+0x26f/0x280
>   [<ffffffff813fbe8f>] __xen_evtchn_do_upcall+0x4f/0x90
>   [<ffffffff813fddf4>] xen_evtchn_do_upcall+0x34/0x50
>   [<ffffffff8170f35e>] xen_hvm_callback_vector+0x6e/0x80
>   <EOI>  [<ffffffff81024847>] ?
> do_audit_syscall_entry+0x67/0x70
>   [<ffffffff81025eb3>] syscall_trace_enter_phase1+0x143/0x1a0
>   [<ffffffff8112ac86>] ? __audit_syscall_exit+0x1e6/0x280
>   [<ffffffff81026216>] ? syscall_trace_leave+0xc6/0x120
>   [<ffffffff8170d69a>] tracesys+0xd/0x44
>  rcu_sched kthread starved for 420012 jiffies!
>  INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by
> 0, t=600017 jiffies, g=449043, c=449042, q=0)
>  All QSes seen, last rcu_sched kthread activity 600017
> (4299696835-4299096818), jiffies_till_next_fqs=3, root ->qsmask 0x0
>  ora_lms0_orcltw R  running task        0 22303      1
> 0x00000080
>   ffff8800c4651c00 00000000e7110062 ffff88010e803c18
> ffffffff810b038f
>   ffff88010e8185c0 ffffffff81b22c40 ffff88010e803c98
> ffffffff810e8f96
>   0000000000000000 ffffffff810b4deb ffff88010e817800
> ffff88010e8185c0
> Call Trace:
>  <IRQ>  [<ffffffff810b038f>] sched_show_task+0xaf/0x120
>  [<ffffffff810e8f96>] rcu_check_callbacks+0x7e6/0x7f0
>  [<ffffffff810b4deb>] ? account_user_time+0x8b/0xa0
>  [<ffffffff810ee582>] update_process_times+0x42/0x70
>  [<ffffffff810fea65>] tick_sched_handle.isra.18+0x25/0x60
>  [<ffffffff810feae4>] tick_sched_timer+0x44/0x80
>  [<ffffffff810ef307>] __run_hrtimer+0x77/0x1d0
>  [<ffffffff810feaa0>] ? tick_sched_handle.isra.18+0x60/0x60
>  [<ffffffff810ef6e3>] hrtimer_interrupt+0x103/0x230
>  [<ffffffff8100a64d>] xen_timer_interrupt+0x3d/0x170
>  [<ffffffff814469a9>] ? add_interrupt_randomness+0x49/0x200
>  [<ffffffff810dd1ae>] handle_irq_event_percpu+0x3e/0x1a0
>  [<ffffffff810e0cfd>] handle_percpu_irq+0x3d/0x60
>  [<ffffffff810dc7ab>] generic_handle_irq+0x2b/0x40
>  [<ffffffff813fe91f>] evtchn_2l_handle_events+0x26f/0x280
>  [<ffffffff813fbe8f>] __xen_evtchn_do_upcall+0x4f/0x90
>  [<ffffffff813fddf4>] xen_evtchn_do_upcall+0x34/0x50
>  [<ffffffff8170f35e>] xen_hvm_callback_vector+0x6e/0x80
>  <EOI>
> rcu_sched kthread starved for 600017 jiffies!
> ------------------------------------------------------
> 
> [5] RCU config
> -----------------------------------------------
> # RCU Subsystem
> CONFIG_TREE_RCU=y
> CONFIG_SRCU=y
> # CONFIG_TASKS_RCU is not set
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_USER_QS=y
> CONFIG_RCU_FANOUT=64
> CONFIG_RCU_FANOUT_LEAF=16
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_RCU_FAST_NO_HZ is not set
> # CONFIG_TREE_RCU_TRACE is not set
> CONFIG_RCU_KTHREAD_PRIO=0
> CONFIG_RCU_NOCB_CPU=y
> # CONFIG_RCU_NOCB_CPU_NONE is not set
> # CONFIG_RCU_NOCB_CPU_ZERO is not set
> CONFIG_RCU_NOCB_CPU_ALL=y
> # CONFIG_RCU_EXPEDITE_BOOT is not set
> # RCU Debugging
> # CONFIG_PROVE_RCU is not set
> # CONFIG_SPARSE_RCU_POINTER is not set
> # CONFIG_RCU_TORTURE_TEST is not set
> CONFIG_RCU_CPU_STALL_TIMEOUT=60
> # CONFIG_RCU_CPU_STALL_INFO is not set
> # CONFIG_RCU_TRACE is not set
> --------------------------------------------------
> 
