lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150803213723.GN27280@linux.vnet.ibm.com>
Date:	Mon, 3 Aug 2015 14:37:23 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Dave Jones <davej@...emonkey.org.uk>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Josh Triplett <josh@...htriplett.org>
Cc:	sasha.levin@...cle.com, fweisbec@...il.com
Subject: Re: 4.2-rc5 rcu stalls.

On Mon, Aug 03, 2015 at 05:08:35PM -0400, Dave Jones wrote:
>  Content analysis details:   (-2.9 points, 5.0 required)
> 
>   pts rule name              description
>  ---- ---------------------- --------------------------------------------------
>  -1.0 ALL_TRUSTED            Passed through trusted hosts only via SMTP
>  -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
>                              [score: 0.0000]
> X-Authenticated-User: davej@...emonkey.org.uk
> X-ZLA-Header: unknown; 0
> X-ZLA-DetailInfo: BA=6.00003574; NDR=6.00000001; ZLA=6.00000002; ZF=6.00000004; ZB=6.00041153; ZH=6.00102541; ZP=6.00079473; ZU=6.00000001; UDB=6.00244655; UTC=2015-08-03 21:08:45
> x-cbid: 15080321-8236-0000-0000-00000DB1C0C1
> X-IBM-ISS-SpamDetectors: Score=0.405233; BY=0; FL=0; FP=0; FZ=0; HX=0; KW=0;
>  PH=0; RB=0; SC=0.405233; ST=0; TS=0; UL=0; ISC=
> X-IBM-ISS-DetailInfo:  BY=3.00004248; HX=3.00000236; KW=3.00000007;
>  PH=3.00000003; SC=3.00000115; SDB=6.00568899; UDB=6.00244655; UTC=2015-08-03
>  21:08:47
> X-TM-AS-MML: disable
> 
> [ 2120.854974] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2120.855128] 	Tasks blocked on level-0 rcu_node (CPUs 0-3): P1497
> [ 2120.855263] 	(detected by 2, t=65002 jiffies, g=78835, c=78834, q=0)
> [ 2120.855403] trinity-watchdo R  running task    14336  1497   1496 0x00080000
> [ 2120.855563]  ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8804f58b51c0
> [ 2120.855728]  ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000
> [ 2120.855893]  0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8
> [ 2120.856062] Call Trace:
> [ 2120.856116]  [<ffffffffa17fa0b0>] ? preempt_schedule_irq+0x40/0xa0
> [ 2120.856252]  [<ffffffffa17fa0b6>] preempt_schedule_irq+0x46/0xa0

Sasha Levin (CCed) was having roughly similar stalls, and found that
reverting b30f0e3ffedf (sched/preempt: Optimize preemption operations
on __schedule() callers) made the stalls go away.  Does that help in
your case?

							Thanx, Paul

> [ 2120.856387]  [<ffffffffa1801129>] retint_kernel+0x1b/0x2d
> [ 2120.856507]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.856627]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.856752]  [<ffffffffa10d675e>] ? lock_release+0x33e/0x700
> [ 2120.856878]  [<ffffffffa108be29>] ? group_send_sig_info+0x79/0x120
> [ 2120.857015]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.857137]  [<ffffffffa108c07b>] ? kill_pid_info+0x9b/0x150
> [ 2120.857259]  [<ffffffffa108c098>] kill_pid_info+0xb8/0x150
> [ 2120.857380]  [<ffffffffa108c000>] ? kill_pid_info+0x20/0x150
> [ 2120.857508]  [<ffffffffa108c244>] SYSC_kill+0xf4/0x2b0
> [ 2120.857620]  [<ffffffffa108c1ed>] ? SYSC_kill+0x9d/0x2b0
> [ 2120.857736]  [<ffffffffa10d354b>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> [ 2120.857882]  [<ffffffffa10d35ed>] ? trace_hardirqs_on+0xd/0x10
> [ 2120.858012]  [<ffffffffa1192ed3>] ? context_tracking_user_exit+0x13/0x20
> [ 2120.858161]  [<ffffffffa1012db7>] ? syscall_trace_enter_phase1+0xf7/0x150
> [ 2120.858314]  [<ffffffffa1001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> [ 2120.858456]  [<ffffffffa108e64e>] SyS_kill+0xe/0x10
> [ 2120.858563]  [<ffffffffa1800597>] entry_SYSCALL_64_fastpath+0x12/0x6f
> [ 2120.858706] trinity-watchdo R  running task    14336  1497   1496 0x00080000
> [ 2120.858863]  ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8804f58b51c0
> [ 2120.859028]  ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000
> [ 2120.859194]  0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8
> [ 2120.859358] Call Trace:
> [ 2120.859410]  [<ffffffffa17fa0b0>] ? preempt_schedule_irq+0x40/0xa0
> [ 2120.859545]  [<ffffffffa17fa0b6>] preempt_schedule_irq+0x46/0xa0
> [ 2120.859676]  [<ffffffffa1801129>] retint_kernel+0x1b/0x2d
> [ 2120.859794]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.859916]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.860037]  [<ffffffffa10d675e>] ? lock_release+0x33e/0x700
> [ 2120.860163]  [<ffffffffa108be29>] ? group_send_sig_info+0x79/0x120
> [ 2120.860297]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2120.860422]  [<ffffffffa108c07b>] ? kill_pid_info+0x9b/0x150
> [ 2120.860548]  [<ffffffffa108c098>] kill_pid_info+0xb8/0x150
> [ 2120.860670]  [<ffffffffa108c000>] ? kill_pid_info+0x20/0x150
> [ 2120.860793]  [<ffffffffa108c244>] SYSC_kill+0xf4/0x2b0
> [ 2120.860906]  [<ffffffffa108c1ed>] ? SYSC_kill+0x9d/0x2b0
> [ 2120.861022]  [<ffffffffa10d354b>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> [ 2120.861172]  [<ffffffffa10d35ed>] ? trace_hardirqs_on+0xd/0x10
> [ 2120.861301]  [<ffffffffa1192ed3>] ? context_tracking_user_exit+0x13/0x20
> [ 2120.861449]  [<ffffffffa1012db7>] ? syscall_trace_enter_phase1+0xf7/0x150
> [ 2120.866641]  [<ffffffffa1001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> [ 2120.871850]  [<ffffffffa108e64e>] SyS_kill+0xe/0x10
> [ 2120.877038]  [<ffffffffa1800597>] entry_SYSCALL_64_fastpath+0x12/0x6f
> [ 2316.020180] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2316.025395] 	Tasks blocked on level-0 rcu_node (CPUs 0-3): P1497
> [ 2316.030662] 	(detected by 2, t=260007 jiffies, g=78835, c=78834, q=0)
> [ 2316.035890] trinity-watchdo R  running task    14336  1497   1496 0x00080000
> [ 2316.041153]  ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8800cd2db680
> [ 2316.046552]  ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000
> [ 2316.052039]  0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8
> [ 2316.057560] Call Trace:
> [ 2316.063066]  [<ffffffffa17fa0b0>] ? preempt_schedule_irq+0x40/0xa0
> [ 2316.068594]  [<ffffffffa17fa0b6>] preempt_schedule_irq+0x46/0xa0
> [ 2316.074034]  [<ffffffffa1801129>] retint_kernel+0x1b/0x2d
> [ 2316.079419]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.084778]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.090041]  [<ffffffffa10d675e>] ? lock_release+0x33e/0x700
> [ 2316.095219]  [<ffffffffa108be29>] ? group_send_sig_info+0x79/0x120
> [ 2316.100359]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.105489]  [<ffffffffa108c07b>] ? kill_pid_info+0x9b/0x150
> [ 2316.110584]  [<ffffffffa108c098>] kill_pid_info+0xb8/0x150
> [ 2316.115594]  [<ffffffffa108c000>] ? kill_pid_info+0x20/0x150
> [ 2316.120568]  [<ffffffffa108c244>] SYSC_kill+0xf4/0x2b0
> [ 2316.125476]  [<ffffffffa108c1ed>] ? SYSC_kill+0x9d/0x2b0
> [ 2316.130391]  [<ffffffffa10d354b>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> [ 2316.135348]  [<ffffffffa10d35ed>] ? trace_hardirqs_on+0xd/0x10
> [ 2316.140351]  [<ffffffffa1192ed3>] ? context_tracking_user_exit+0x13/0x20
> [ 2316.145399]  [<ffffffffa1012db7>] ? syscall_trace_enter_phase1+0xf7/0x150
> [ 2316.150413]  [<ffffffffa1001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> [ 2316.155461]  [<ffffffffa108e64e>] SyS_kill+0xe/0x10
> [ 2316.160499]  [<ffffffffa1800597>] entry_SYSCALL_64_fastpath+0x12/0x6f
> [ 2316.165609] trinity-watchdo R  running task    14336  1497   1496 0x00080000
> [ 2316.170851]  ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8800cd2db680
> [ 2316.176131]  ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000
> [ 2316.181443]  0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8
> [ 2316.186735] Call Trace:
> [ 2316.192021]  [<ffffffffa17fa0b0>] ? preempt_schedule_irq+0x40/0xa0
> [ 2316.197426]  [<ffffffffa17fa0b6>] preempt_schedule_irq+0x46/0xa0
> [ 2316.202818]  [<ffffffffa1801129>] retint_kernel+0x1b/0x2d
> [ 2316.208181]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.213541]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.218823]  [<ffffffffa10d675e>] ? lock_release+0x33e/0x700
> [ 2316.224042]  [<ffffffffa108be29>] ? group_send_sig_info+0x79/0x120
> [ 2316.229215]  [<ffffffffa10ad5e1>] ? get_parent_ip+0x11/0x50
> [ 2316.234346]  [<ffffffffa108c07b>] ? kill_pid_info+0x9b/0x150
> [ 2316.239491]  [<ffffffffa108c098>] kill_pid_info+0xb8/0x150
> [ 2316.244538]  [<ffffffffa108c000>] ? kill_pid_info+0x20/0x150
> [ 2316.249525]  [<ffffffffa108c244>] SYSC_kill+0xf4/0x2b0
> [ 2316.254503]  [<ffffffffa108c1ed>] ? SYSC_kill+0x9d/0x2b0
> [ 2316.259492]  [<ffffffffa10d354b>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> [ 2316.264565]  [<ffffffffa10d35ed>] ? trace_hardirqs_on+0xd/0x10
> [ 2316.269594]  [<ffffffffa1192ed3>] ? context_tracking_user_exit+0x13/0x20
> [ 2316.274681]  [<ffffffffa1012db7>] ? syscall_trace_enter_phase1+0xf7/0x150
> [ 2316.279772]  [<ffffffffa1001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> [ 2316.284873]  [<ffffffffa108e64e>] SyS_kill+0xe/0x10
> [ 2316.289981]  [<ffffffffa1800597>] entry_SYSCALL_64_fastpath+0x12/0x6f
> 
> some time later..
> 
> [66859.981825] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [66859.987582] 	Tasks blocked on level-0 rcu_node (CPUs 0-3): P20837
> [66859.993375] 	(detected by 2, t=65013 jiffies, g=5129960, c=5129959, q=0)
> [66859.999146] trinity-c255    R  running task    13728 20837  20581 0x00080002
> [66860.004980]  ffff8804d1417cb8 ffffffffa17fa0b0 ffff8804e4dab680 ffff8804ba2951c0
> [66860.010831]  ffff8804e4dab680 ffff8804d1417ca8 0000000000000000 ffff8804d1418000
> [66860.016666]  ffff88045f49a440 ffff8805047a0620 ffff880480fdf528 ffff8804d1417cd8
> [66860.022423] Call Trace:
> [66860.028116]  [<ffffffffa17fa0b0>] ? preempt_schedule_irq+0x40/0xa0
> [66860.033912]  [<ffffffffa17fa0b6>] preempt_schedule_irq+0x46/0xa0
> [66860.039663]  [<ffffffffa1801129>] retint_kernel+0x1b/0x2d
> [66860.045352]  [<ffffffffa10d6299>] ? lock_acquire+0xd9/0x260
> [66860.051044]  [<ffffffffa10ed3b1>] ? rcu_is_watching+0x1/0x60
> [66860.056725]  [<ffffffffa1212abc>] ? dput+0x10c/0x390
> [66860.062377]  [<ffffffffa12129d9>] ? dput+0x29/0x390
> [66860.068103]  [<ffffffffa11fb6f1>] __fput+0x181/0x200
> [66860.073797]  [<ffffffffa11fb576>] ? __fput+0x6/0x200
> [66860.079444]  [<ffffffffa11fb7be>] ____fput+0xe/0x10
> [66860.085070]  [<ffffffffa109dfdd>] task_work_run+0x8d/0xc0
> [66860.090699]  [<ffffffffa107cef6>] do_exit+0x416/0xc60
> [66860.096304]  [<ffffffffa1192ed3>] ? context_tracking_user_exit+0x13/0x20
> [66860.101978]  [<ffffffffa107ec64>] do_group_exit+0x54/0xe0
> [66860.107684]  [<ffffffffa107ed04>] SyS_exit_group+0x14/0x20
> [66860.113360]  [<ffffffffa1800597>] entry_SYSCALL_64_fastpath+0x12/0x6f
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ