lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <086e8f3e4a924849fb7f057961df29c2@codeaurora.org>
Date:   Thu, 02 Jul 2020 12:50:50 -0700
From:   rananta@...eaurora.org
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     mingo@...hat.com, psodagud@...eaurora.org,
        linux-kernel@...r.kernel.org
Subject: Re: Tracing: rb_head_page_deactivate() caught in an infinite loop

Hi Steven,

Thank you for responding.

On 2020-07-01 19:03, Steven Rostedt wrote:
> On Wed, 01 Jul 2020 10:07:06 -0700
> rananta@...eaurora.org wrote:
> 
>> Hi Steven and Mingo,
>> 
> 
> Hi Raghavendra,
> 
> 
>> While trying to adjust the buffer size (echo <size> >
>> /sys/kernel/debug/tracing/buffer_size_kb), we see that the kernel gets
>> caught up in an infinite loop
>> while traversing the "cpu_buffer->pages" list in
>> rb_head_page_deactivate().
>> 
>> Looks like the last node of the list could be uninitialized, thus
>> leading to infinite traversal. From the data that we captured:
>> 000|rb_head_page_deactivate(inline)
>>      |  cpu_buffer = 0xFFFFFF8000671600 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE600 -> (
>> ...
>>      |    pages = 0xFFFFFF80A909D980 =
>> kernel_size_le_lo32+0xFFFFFF65D811A980 -> (
>>      |      next = 0xFFFFFF80A909D200 =
>> kernel_size_le_lo32+0xFFFFFF65D811A200 -> (
>>      |        next = 0xFFFFFF80A909D580 =
>> kernel_size_le_lo32+0xFFFFFF65D811A580 -> (
>>      |          next = 0xFFFFFF8138D1CD00 =
>> kernel_size_le_lo32+0xFFFFFF6667D99D00 -> (
>>      |            next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>>      |              next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>>      |                next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>>      |                  next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0,
>> 
>> Wanted to check with you if there's any scenario that could lead us 
>> into
>> this state.
>> 
>> Test details:
>> -- Arch: arm64
>> -- Kernel version 5.4.30; running on Andriod
>> -- Test case: Running the following set of commands across reboot will
>> lead us to the scenario
>> 
>>    atrace --async_start -z -c -b 120000 sched audio irq idle freq
>>    < Run any workload here >
>>    atrace --async_dump -z -c -b 1200000 sched audio irq idle freq >
>> mytrace.trace
>>    atrace --async_stop > /dev/null
>>    echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb
>>    echo 200000 > /sys/kernel/debug/tracing/buffer_size_kb
>>    reboot
>> 
>> Repeating the above lines across reboots would reproduce the issue.
>> The "atrace" or "echo" would just get stuck while resizing the buffer
>> size.
> 
> What do you mean repeat across reboots? If it doesn't happen it wont
> ever happen, but if you reboot it may have it happen again?
> 
Actually, ignore the reboot part. I was able to reproduce with the 
reboot.
>> I'll try to reproduce the issue without atrace as well, but wondering
>> what could be the reason for leading us to this state.
> 
> I haven't used arm lately, and I'm unfamiliar with atrace. So I don't
> really know what is going on. If you can reproduce this with just a
> shell script accessing the ftrace files, that would be much more 
> useful.
> 
I tried to reproduce it without atrace, but couldn't. It's a simple 
wrapper utility
to collect ftraces. You can find the source here:
https://android.googlesource.com/platform/system/extras/+/jb-mr1-dev-plus-aosp/atrace/atrace.c
> Thanks,
> 
> -- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ