lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Wed, 17 Aug 2016 16:23:51 -0400
From:	Robert Foss <robert.foss@...labora.com>
To:	Josh Poimboeuf <jpoimboe@...hat.com>
Cc:	Peter Zijlstra <peterz@...radead.org>, tglx@...utronix.de,
	mingo@...hat.com, hpa@...or.com, x86@...nel.org,
	davem@...emloft.net, yinghan@...gle.com, ast@...com,
	linux@...musvillemoes.dk, borntraeger@...ibm.com,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Bresticker <abrestic@...omium.org>,
	Sameer Nanda <snanda@...omium.org>,
	Filipe Brandenburger <filbranden@...omium.org>
Subject: Re: [PACTH v9] stacktrace: Eliminate task stack trace duplication



On 2016-08-17 03:57 PM, Josh Poimboeuf wrote:
> On Wed, Aug 17, 2016 at 02:41:30PM -0400, Robert Foss wrote:
>> On 2016-08-17 02:06 PM, Josh Poimboeuf wrote:
>>> On Wed, Aug 17, 2016 at 01:40:33PM -0400, Robert Foss wrote:
>>>> On 2016-08-17 12:58 PM, Josh Poimboeuf wrote:
>>>>> On Wed, Aug 17, 2016 at 09:51:45AM -0400, Robert Foss wrote:
>>>>>> On 2016-08-17 02:50 AM, Peter Zijlstra wrote:
>>>>>>> On Tue, Aug 16, 2016 at 07:12:36PM -0400, robert.foss@...labora.com wrote:
>>>>>>>> From: Ying Han <yinghan@...gle.com>
>>>>>>>>
>>>>>>>> The problem with small dmesg ring buffer like 512k is that only limited number
>>>>>>>> of task traces will be logged. Sometimes we lose important information only
>>>>>>>> because of too many duplicated stack traces. This problem occurs when dumping
>>>>>>>> lots of stacks in a single operation, such as sysrq-T.
>>>>>>>>
>>>>>>>> This patch tries to reduce the duplication of task stack trace in the dump
>>>>>>>> message by hashing the task stack. The hashtable is a 32k pre-allocated buffer
>>>>>>>> during bootup. Each time if we find the identical task trace in the task stack,
>>>>>>>> we dump only the pid of the task which has the task trace dumped. So it is easy
>>>>>>>> to back track to the full stack with the pid.
>>>>>>>>
>>>>>>>> When we do the hashing, we eliminate garbage entries from stack traces. Those
>>>>>>>> entries are still being printed in the dump to provide more debugging
>>>>>>>> informations.
>>>>>>>>
>>>>>>>> [   53.510162] kworker/0:0     S ffffffff8161d820     0     4      2 0x00000000
>>>>>>>> [   53.517237]  ffff88027547de60 0000000000000046 ffffffff812ab840 0000000000000000
>>>>>>>> [   53.524663]  ffff880275460080 ffff88027547dfd8 ffff88027547dfd8 ffff88027547dfd8
>>>>>>>> [   53.532092]  ffffffff81813020 ffff880275460080 0000000000000000 ffff8808758670c0
>>>>>>>> [   53.539521] Call Trace:
>>>>>>>> [   53.541974]  [<ffffffff812ab840>] ? cfq_init_queue+0x350/0x350
>>>>>>>> [   53.547791]  [<ffffffff81524d49>] schedule+0x29/0x70
>>>>>>>> [   53.552761]  [<ffffffff810945a3>] worker_thread+0x233/0x380
>>>>>>>> [   53.558318]  [<ffffffff81094370>] ? manage_workers.isra.28+0x230/0x230
>>>>>>>> [   53.564839]  [<ffffffff81099a73>] kthread+0x93/0xa0
>>>>>>>> [   53.569714]  [<ffffffff8152e6d4>] kernel_thread_helper+0x4/0x10
>>>>>>>> [   53.575628]  [<ffffffff810999e0>] ? kthread_worker_fn+0x140/0x140
>>>>>>>> [   53.581714]  [<ffffffff8152e6d0>] ? gs_change+0xb/0xb
>>>>>>>> [   53.586762] kworker/u:0     S ffffffff8161d820     0     5      2 0x00000000
>>>>>>>> [   53.593858]  ffff88027547fe60 0000000000000046 ffffffffa005cc70 0000000000000000
>>>>>>>> [   53.601307]  ffff8802754627d0 ffff88027547ffd8 ffff88027547ffd8 ffff88027547ffd8
>>>>>>>> [   53.608788]  ffffffff81813020 ffff8802754627d0 0000000000011fc0 ffff8804758670c0
>>>>>>>> [   53.616232] Call Trace:
>>>>>>>> [   53.618676] <Same stack as pid 4>
>>>>>>>>
>>>>>>>
>>>>>>> You might want to wait a bit and have a look at this:
>>>>>>>
>>>>>>>   https://lkml.kernel.org/r/cover.1471011425.git.jpoimboe@redhat.com
>>>>>>>
>>>>>>
>>>>>> I'll have a look through that series!
>>>>>> Thanks!
>>>>>
>>>>> Yeah, those patches replace dump_trace() with a new unwinder interface,
>>>>> so if they get merged, this will need to be rewritten a little bit.
>>>>>
>>>>> As for the patch itself, I'm not crazy about how it pushes the decision
>>>>> of whether to print the stack of a given task down to the stack dump
>>>>> code in show_trace_log_lvl().
>>>>>
>>>>> I think I'd prefer to instead change the implementation of sysrq-T so
>>>>> that it uses save_stack_trace_tsk(), and then uses
>>>>> printk_stack_address() to print the stack.  Then the stack dump code in
>>>>> dumpstack*.c would be completely unaffected.
>>>>>
>>>>> Or, even better, instead of sysrq-T, can the user just read
>>>>> /proc/*/{comm,stack} and /proc/sched_debug?  That gives basically the
>>>>> same information without flooding printk.
>>>>>
>>>>
>>>> Thanks for the feedback Josh!
>>>>
>>>> I think the save_stack_trace_tsk() changes you are suggesting sound very
>>>> reasonable. However requiring the user to read /proc/*/{comm,stack} sort of
>>>> circumnavigates the goal of the patch, which is to reduce clutter in the
>>>> default stack traces that one encounters.
>>>
>>> Yes, but maybe the hashing and deduplication of stacks could also be
>>> done in user space?
>>>
>>
>> What would that look like in practice? A user space daemon running in the
>> background?
>
> The idea was for the user of sysrq-T to instead get the stack
> information from /proc.  Then they wouldn't have problems with the
> printk buffer wrapping.  If after doing that, the dedupe is still needed
> for some reason, the application which does the reads from /proc could
> also do the dedupe.

Ah! Now I understand. Thanks for spelling it out :)

The dedup would be helpful for as long as there is duplication in stack 
traces. And the dedup would be helpful to any application. Making every 
application that interacts with stack traces implement deduplication 
seems to me like going about things the wrong way,

But given that stack traces currently are being re-worked it may not be 
helpful after that series has landed.

>
> Or am I misunderstanding the problem?  If so, it might help to spell out
> the problem you're trying to solve with more specifics and explain why
> it can't be solved in user space.
>

Nope, you are on point.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ