Date:	Thu, 16 Jul 2009 15:12:44 -0400
From:	Valdis.Kletnieks@...edu
To:	Stefani Seibold <stefani@...bold.net>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: 2.6.31-rc1-mmotm0702 - ps command hangs inside kernel

On Tue, 14 Jul 2009 07:31:19 +0200, Stefani Seibold said:
> On Monday, 13.07.2009, at 14:38 -0700, Andrew Morton wrote:
> > On Mon, 13 Jul 2009 16:54:51 -0400
> > Valdis.Kletnieks@...edu wrote:
> > 
> > > Several times recently, I've had the 'ps' command hang inside the kernel
> > > for extended periods of time - usually around 1100 seconds, but today I
> > > had one that hung there for 2351 seconds.

> I am the author of the get_stack_usage_bytes() code. Because I currently
> have no 64-bit machine running, I am not able to analyse your problem. Does
> it only happen for 32-bit applications on a 64-bit kernel? Does it only
> affect pcscd?

I've only seen it happen to pcscd.  However, pcscd is usually one of the very
few 32-bit apps running on my laptop (I've got exactly *one* legacy app for a
secure-token that is stuck in 32-bit land), so I can't tell whether it's a
generic 32-bit issue or something specific to pcscd.

> Can you give me more accurate info about what exactly the problem is?


> 
> > OK, thanks for the analysis.
> > 
> > > Here's the traceback of the ps command, as reported by 2 alt-sysrq-t
> > > dumps several minutes apart:
> > > 
> > > 
> > > ps            R  running task     3936 26646  26580 0x00000080
> > >  ffff88005a599bd8 ffffffff81499842 ffff88000213bf80 0000000000000000
> > >  ffff88005a599b78 ffffffff8103589b ffffffff81035805 ffff88007e71ea40
> > >  ffff88000213bf80 ffff8800657d8fe0 000000000000df78 ffff8800657d8fe8
> > > Call Trace:
> > >  [<ffffffff81499842>] ? thread_return+0xb6/0xfa
> > >  [<ffffffff8103589b>] ? finish_task_switch+0xd1/0xf4
> > >  [<ffffffff81035805>] ? finish_task_switch+0x3b/0xf4
> > >  [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
> > >  [<ffffffff8149b2a6>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> > >  [<ffffffff8101f76b>] smp_apic_timer_interrupt+0x81/0x8f
> > >  [<ffffffff81046231>] ? irq_exit+0xaf/0xb4
> > >  [<ffffffff8100bd80>] ? restore_args+0x0/0x30
> > >  [<ffffffff810b8222>] ? IS_ERR+0x25/0x2c
> > >  [<ffffffff810b8222>] ? IS_ERR+0x25/0x2c
> > >  [<ffffffff810b8f22>] ? follow_page+0x28/0x2e3
> > >  [<ffffffff810b91d9>] ? follow_page+0x2df/0x2e3
> > >  [<ffffffff811219d8>] ? proc_pid_status+0x5e0/0x694
> > >  [<ffffffff81066072>] ? trace_hardirqs_on+0xd/0xf
> > >  [<ffffffff8111e42c>] ? proc_single_show+0x57/0x74
> > >  [<ffffffff810e90b4>] ? seq_read+0x249/0x49b
> > >  [<ffffffff8116c840>] ? security_file_permission+0x11/0x13
> > >  [<ffffffff810d05e1>] ? vfs_read+0xe0/0x141
> > >  [<ffffffff810d8191>] ? path_put+0x1d/0x21
> > >  [<ffffffff810d06f8>] ? sys_read+0x45/0x69
> > >  [<ffffffff8100b2ab>] ? system_call_fastpath+0x16/0x1b
> > 
> The double follow_page looks strange to me... I will have a look at it.

It's possible that one of the two follow_page() entries is stale and just
happened to be left on the stack.  A large chunk of proc_pid_status() is
inlined, so it's possible that two calls were made and left their return
addresses in different locations on the stack.

I am pretty sure that follow_page+0x28 is the correct one, as I see it
in 2 more tracebacks today (see below)...

Got a better idea than just sticking a printk() into get_stack_usage_in_bytes()
to dump the starting values of stkpage and vma->vm_start? I suspect that for a
32-bit process those two values aren't lined up the way we think they are, and
as a result the for loop ends up walking across a *lot* of pages unintentionally.
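
Absent a better idea, I'll probably just wedge something like this into the
function (hand-written here and untested; 'stkpage' and 'vma' are whatever
the mmotm patch actually calls them):

	/* debug: dump the walk bounds before the page loop starts */
	printk(KERN_DEBUG "get_stack_usage_in_bytes: stkpage=%#lx "
	       "vm_start=%#lx vm_end=%#lx\n",
	       (unsigned long)stkpage, vma->vm_start, vma->vm_end);

If those values don't line up for a compat task, the bogus delta should show
up in dmesg on the very first hang.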

> > Another possibility is that pcscd has gone off on a long in-kernel sulk
> > (polling hardware?) while holding some lock which ps needs (eg, mmap_sem).
> > 
> > It would be useful if you can grab a pcscd backtrace during the stall.
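
(For reference: anything that walks another task's VMAs with follow_page()
has to hold that task's mmap_sem, so the ps side of this presumably does
roughly the usual pattern. The sketch below uses the stock mm API rather
than the actual mmotm code, with 'task' standing for the task ps is reading:

	struct mm_struct *mm = get_task_mm(task);	/* pin the target's mm */
	if (mm) {
		down_read(&mm->mmap_sem);	/* blocks while a writer holds it */
		/* ... locate the stack vma, walk its pages with follow_page() ... */
		up_read(&mm->mmap_sem);
		mmput(mm);
	}

So if pcscd were sitting on its mmap_sem for write while off sulking, every
ps reading its /proc/<pid>/status would pile up right there.)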

Got bit again this morning - here are the relevant tracebacks. Looks like pcscd
is off sulking in select() and nanosleep(), which are pretty normal places for
programs to go sulk.

(pcscd has 2 threads, apparently, and I had 2 ps commands hung up, hence the
two entries for each below.)

pcscd         S 0000000000000000  4656  2100      1 0x00020080
 ffff88007da3f948 0000000000000046 ffff88007da3f8b8 ffffffff81052b2e
 ffff88007f90e340 ffff88007e08db30 0000000000000000 0000000000000001
 ffff88007da3f8d8 ffff880074ee4fa0 000000000000df78 ffff880074ee4fa0
Call Trace:
 [<ffffffff81052b2e>] ? queue_work_on+0x5e/0x6c
 [<ffffffff81056bec>] ? add_wait_queue+0x1b/0x42
 [<ffffffff8149a70b>] schedule_hrtimeout_range+0x3f/0x11f
 [<ffffffff8149bb91>] ? _spin_unlock_irqrestore+0x72/0x80
 [<ffffffff81056c0b>] ? add_wait_queue+0x3a/0x42
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff81066072>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff810de040>] poll_schedule_timeout+0x33/0x4c
 [<ffffffff810deb82>] do_select+0x4b7/0x4f5
 [<ffffffff810df064>] ? __pollwait+0x0/0xc7
 [<ffffffff810df12b>] ? pollwake+0x0/0x51
 [<ffffffff810a8d3a>] ? get_page_from_freelist+0x38a/0x63c
 [<ffffffff811bb5c4>] ? list_del+0xbc/0xea
 [<ffffffff810a7561>] ? __rmqueue+0x124/0x2bf
 [<ffffffff81034de2>] ? get_parent_ip+0x11/0x42
 [<ffffffff8149e770>] ? sub_preempt_count+0x35/0x49
 [<ffffffff810a8ed8>] ? get_page_from_freelist+0x528/0x63c
 [<ffffffff8105b908>] ? __update_sched_clock+0x2f/0x8e
 [<ffffffff8105bb72>] ? sched_clock_cpu+0x20b/0x219
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff81066072>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff810542bf>] ? alloc_pid+0x2ce/0x3e3
 [<ffffffff810542bf>] ? alloc_pid+0x2ce/0x3e3
 [<ffffffff81034de2>] ? get_parent_ip+0x11/0x42
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff81107bf7>] compat_core_sys_select+0x183/0x23d
 [<ffffffff8149e770>] ? sub_preempt_count+0x35/0x49
 [<ffffffff8149bb91>] ? _spin_unlock_irqrestore+0x72/0x80
 [<ffffffff81030930>] ? task_rq_unlock+0xc/0xe
 [<ffffffff8103ab25>] ? wake_up_new_task+0x169/0x174
 [<ffffffff8103eaf6>] ? do_fork+0x37a/0x424
 [<ffffffff8105e2e2>] ? current_kernel_time+0x28/0x50
 [<ffffffff81107f48>] compat_sys_select+0x96/0xbe
 [<ffffffff8107d8b0>] ? audit_syscall_entry+0x170/0x19c
 [<ffffffff8102b538>] sysenter_dispatch+0x7/0x2c

pcscd         S ffff88007e1fbf48  5640  2106      1 0x00020080
 ffff88007e1fbe28 0000000000000046 ffff88007e1fbd88 ffffffff8149bb91
 ffff88007e1fbe78 0000000000000001 ffff88007e1fbe18 ffffffff81059d91
 0000000000000000 ffff880074d6ac20 000000000000df78 ffff880074d6ac20
Call Trace:
 [<ffffffff8149bb91>] ? _spin_unlock_irqrestore+0x72/0x80
 [<ffffffff81059d91>] ? __hrtimer_start_range_ns+0x35b/0x36d
 [<ffffffff8149a882>] do_nanosleep+0x88/0xee
 [<ffffffff81059e75>] hrtimer_nanosleep+0xac/0x121
 [<ffffffff8105935d>] ? hrtimer_wakeup+0x0/0x21
 [<ffffffff81075da2>] compat_sys_nanosleep+0x7b/0xe1
 [<ffffffff8102b538>] sysenter_dispatch+0x7/0x2c
 [<ffffffff8149b2a6>] ? trace_hardirqs_on_thunk+0x3a/0x3f

ps            R  running task     3936 45836  45832 0x00000080
 ffff88004dc09b98 ffffffff81065f3f 0000000000000001 00000388525af000
 ffff88004dc09bb8 ffffffff81065f3f 0000000000000000 000003886bc2b000
 ffff88004dc09ce8 ffffffff8149b2a6 0000000000000000 ffff88000212cf68
Call Trace:
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff81065f3f>] trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff8149b2a6>] trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff8149b2a6>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8101f76b>] ? smp_apic_timer_interrupt+0x81/0x8f
 [<ffffffff810b8222>] ? IS_ERR+0x25/0x2c
 [<ffffffff810b8f22>] ? follow_page+0x28/0x2e3
 [<ffffffff811219d8>] proc_pid_status+0x5e0/0x694
 [<ffffffff81066072>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8111e42c>] proc_single_show+0x57/0x74
 [<ffffffff810e90b4>] seq_read+0x249/0x49b
 [<ffffffff8116c840>] ? security_file_permission+0x11/0x13
 [<ffffffff810d05e1>] vfs_read+0xe0/0x141
 [<ffffffff810d8191>] ? path_put+0x1d/0x21
 [<ffffffff810d06f8>] sys_read+0x45/0x69
 [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b

ps            R  running task     5584 45948  45868 0x00000080
 ffff88004de81bb8 0000000000000046 ffff88004de81b18 ffffffff81035142
 ffff880002120f80 0000000000000000 ffff88004de81b68 ffffffff8103589b
 ffffffff81035805 ffff8800653fc460 000000000000df78 ffff8800653fc468
Call Trace:
 [<ffffffff81035142>] ? mmdrop+0x2b/0x3d
 [<ffffffff8103589b>] ? finish_task_switch+0xd1/0xf4
 [<ffffffff81035805>] ? finish_task_switch+0x3b/0xf4
 [<ffffffff81499842>] ? thread_return+0xb6/0xfa
 [<ffffffff814999d1>] preempt_schedule_irq+0x56/0x74
 [<ffffffff8100be96>] retint_kernel+0x26/0x30
 [<ffffffff81094996>] ? ftrace_likely_update+0x12/0x14
 [<ffffffff8100bd80>] ? restore_args+0x0/0x30
 [<ffffffff810b8222>] IS_ERR+0x25/0x2c
 [<ffffffff810b8f22>] follow_page+0x28/0x2e3
 [<ffffffff81094990>] ? ftrace_likely_update+0xc/0x14
 [<ffffffff811219d8>] proc_pid_status+0x5e0/0x694
 [<ffffffff81066072>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8111e42c>] proc_single_show+0x57/0x74
 [<ffffffff810e90b4>] seq_read+0x249/0x49b
 [<ffffffff8116c840>] ? security_file_permission+0x11/0x13
 [<ffffffff810d05e1>] vfs_read+0xe0/0x141
 [<ffffffff810d8191>] ? path_put+0x1d/0x21
 [<ffffffff810d06f8>] sys_read+0x45/0x69
 [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b

Just for comparison, here are the pcscd stack traces right now, when things
are working just fine. One thread is in select and one in nanosleep, just like
when it was broken.

pcscd         S 0000000000000000  4656  2000      1 0x00020080
 ffff880073037948 0000000000000046 ffff8800730378a8 ffffffff81030461
 0000000000000000 0000000000000000 ffff880000000000 ffffffff81499842
 ffff8800730378d8 ffff88007ebf9120 000000000000df78 ffff88007ebf9120
Call Trace:
 [<ffffffff81030461>] ? need_resched+0x3a/0x40
 [<ffffffff81499842>] ? thread_return+0xb6/0xfa
 [<ffffffff81056bec>] ? add_wait_queue+0x1b/0x42
 [<ffffffff8149a70b>] schedule_hrtimeout_range+0x3f/0x11f
 [<ffffffff8149bb91>] ? _spin_unlock_irqrestore+0x72/0x80
 [<ffffffff81056c0b>] ? add_wait_queue+0x3a/0x42
 [<ffffffff810df122>] ? __pollwait+0xbe/0xc7
 [<ffffffff810de040>] poll_schedule_timeout+0x33/0x4c
 [<ffffffff810deb82>] do_select+0x4b7/0x4f5
 [<ffffffff810df064>] ? __pollwait+0x0/0xc7
 [<ffffffff810df12b>] ? pollwake+0x0/0x51
 [<ffffffff810a8d3a>] ? get_page_from_freelist+0x38a/0x63c
 [<ffffffff811bb5c4>] ? list_del+0xbc/0xea
 [<ffffffff810a7561>] ? __rmqueue+0x124/0x2bf
 [<ffffffff81034de2>] ? get_parent_ip+0x11/0x42
 [<ffffffff8149e770>] ? sub_preempt_count+0x35/0x49
 [<ffffffff810a8ed8>] ? get_page_from_freelist+0x528/0x63c
 [<ffffffff81035805>] ? finish_task_switch+0x3b/0xf4
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff8149bc01>] ? _spin_unlock_irq+0x62/0x6f
 [<ffffffff8103589b>] ? finish_task_switch+0xd1/0xf4
 [<ffffffff81035805>] ? finish_task_switch+0x3b/0xf4
 [<ffffffff81030461>] ? need_resched+0x3a/0x40
 [<ffffffff81499842>] ? thread_return+0xb6/0xfa
 [<ffffffff81065f3f>] ? trace_hardirqs_on_caller+0x1f/0x145
 [<ffffffff81107bf7>] compat_core_sys_select+0x183/0x23d
 [<ffffffff81499a4d>] ? preempt_schedule+0x5e/0x67
 [<ffffffff8149bb9a>] ? _spin_unlock_irqrestore+0x7b/0x80
 [<ffffffff81030930>] ? task_rq_unlock+0xc/0xe
 [<ffffffff8103ab25>] ? wake_up_new_task+0x169/0x174
 [<ffffffff8103eaf6>] ? do_fork+0x37a/0x424
 [<ffffffff81107f48>] compat_sys_select+0x96/0xbe
 [<ffffffff8107d80b>] ? audit_syscall_entry+0xcb/0x19c
 [<ffffffff8102b538>] sysenter_dispatch+0x7/0x2c

pcscd         S ffff88007cf99f48  4896  2006      1 0x00020080
 ffff88007cf99e28 0000000000000046 ffff88007cf99d88 ffffffff8149bb91
 ffff88007cf99e78 0000000000000001 ffff88007cf99e18 ffffffff81059d91
 0000000000000000 ffff88007e4da3e0 000000000000df78 ffff88007e4da3e0
Call Trace:
 [<ffffffff8149bb91>] ? _spin_unlock_irqrestore+0x72/0x80
 [<ffffffff81059d91>] ? __hrtimer_start_range_ns+0x35b/0x36d
 [<ffffffff8149a882>] do_nanosleep+0x88/0xee
 [<ffffffff81059e75>] hrtimer_nanosleep+0xac/0x121
 [<ffffffff8105935d>] ? hrtimer_wakeup+0x0/0x21
 [<ffffffff81075da2>] compat_sys_nanosleep+0x7b/0xe1
 [<ffffffff8102b538>] sysenter_dispatch+0x7/0x2c
 [<ffffffff8149b2a6>] ? trace_hardirqs_on_thunk+0x3a/0x3f


