[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <086e8f3e4a924849fb7f057961df29c2@codeaurora.org>
Date: Thu, 02 Jul 2020 12:50:50 -0700
From: rananta@...eaurora.org
To: Steven Rostedt <rostedt@...dmis.org>
Cc: mingo@...hat.com, psodagud@...eaurora.org,
linux-kernel@...r.kernel.org
Subject: Re: Tracing: rb_head_page_deactivate() caught in an infinite loop
Hi Steven,
Thank you for responding.
On 2020-07-01 19:03, Steven Rostedt wrote:
> On Wed, 01 Jul 2020 10:07:06 -0700
> rananta@...eaurora.org wrote:
>
>> Hi Steven and Mingo,
>>
>
> Hi Raghavendra,
>
>
>> While trying to adjust the buffer size (echo <size> >
>> /sys/kernel/debug/tracing/buffer_size_kb), we see that the kernel gets
>> caught up in an infinite loop
>> while traversing the "cpu_buffer->pages" list in
>> rb_head_page_deactivate().
>>
>> Looks like the last node of the list could be uninitialized, thus
>> leading to infinite traversal. From the data that we captured:
>> 000|rb_head_page_deactivate(inline)
>> | cpu_buffer = 0xFFFFFF8000671600 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE600 -> (
>> ...
>> | pages = 0xFFFFFF80A909D980 =
>> kernel_size_le_lo32+0xFFFFFF65D811A980 -> (
>> | next = 0xFFFFFF80A909D200 =
>> kernel_size_le_lo32+0xFFFFFF65D811A200 -> (
>> | next = 0xFFFFFF80A909D580 =
>> kernel_size_le_lo32+0xFFFFFF65D811A580 -> (
>> | next = 0xFFFFFF8138D1CD00 =
>> kernel_size_le_lo32+0xFFFFFF6667D99D00 -> (
>> | next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>> | next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>> | next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
>> | next = 0xFFFFFF80006716F0 =
>> kernel_size_le_lo32+0xFFFFFF652F6EE6F0,
>>
>> Wanted to check with you if there's any scenario that could lead us
>> into
>> this state.
>>
>> Test details:
>> -- Arch: arm64
>> -- Kernel version 5.4.30; running on Andriod
>> -- Test case: Running the following set of commands across reboot will
>> lead us to the scenario
>>
>> atrace --async_start -z -c -b 120000 sched audio irq idle freq
>> < Run any workload here >
>> atrace --async_dump -z -c -b 1200000 sched audio irq idle freq >
>> mytrace.trace
>> atrace --async_stop > /dev/null
>> echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb
>> echo 200000 > /sys/kernel/debug/tracing/buffer_size_kb
>> reboot
>>
>> Repeating the above lines across reboots would reproduce the issue.
>> The "atrace" or "echo" would just get stuck while resizing the buffer
>> size.
>
> What do you mean repeat across reboots? If it doesn't happen it wont
> ever happen, but if you reboot it may have it happen again?
>
Actually, ignore the reboot part. I was able to reproduce with the
reboot.
>> I'll try to reproduce the issue without atrace as well, but wondering
>> what could be the reason for leading us to this state.
>
> I haven't used arm lately, and I'm unfamiliar with atrace. So I don't
> really know what is going on. If you can reproduce this with just a
> shell script accessing the ftrace files, that would be much more
> useful.
>
I tried to reproduce it without atrace, but couldn't. It's a simple
wrapper utility
to collect ftraces. You can find the source here:
https://android.googlesource.com/platform/system/extras/+/jb-mr1-dev-plus-aosp/atrace/atrace.c
> Thanks,
>
> -- Steve
Powered by blists - more mailing lists