Message-ID: <56CDF609.7010506@fb.com>
Date: Wed, 24 Feb 2016 13:27:21 -0500
From: Josef Bacik <jbacik@...com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: <linux-kernel@...r.kernel.org>, <kernel-team@...com>
Subject: Re: [PATCH] trace-cmd: use nonblocking reads for streaming
On 02/23/2016 06:17 PM, Steven Rostedt wrote:
> On Thu, 17 Dec 2015 12:01:52 -0500
> Josef Bacik <jbacik@...com> wrote:
>
>> I noticed while using the streaming infrastructure in trace-cmd that I was
>> seemingly missing events. Using other tracing methods I got these events, and
>> record->missed_events was never being set. This is because the streaming
>> infrastructure uses blocking reads on the per-CPU trace pipes, which means
>> we'll wait for an entire page's worth of data to be ready before passing it along
>> to the recorder. This makes it impossible to do long-term tracing that requires
>> coupling two different events that could occur on different CPUs, and I imagine
>> it is what has been screwing up my trace-cmd profile runs on our giant 40-CPU
>> boxes. Fix trace-cmd to instead use a nonblocking read with select to wait for
>> data on the pipe so we don't burn CPU unnecessarily. With this patch I'm no
>> longer seeing missed events in my app. Thanks,
>
> I just want to make sure I understand what is happening here.
>
> This wasn't trace-cmd's default code, right? This was your own app. And
> I'm guessing you were matching events perhaps. That is, after seeing
> some event, you looked for the other event. But if that event happened
> on a CPU that isn't very active, it would wait forever, as the read
> was waiting for a full page?
>
> Or is there something else?
>
> I don't have a problem with the patch. I just want to understand the
> issue.
Yup, I had an app that was watching block request issue and completion
events, and occasionally a completion event would happen on some mostly
idle CPU, so I wouldn't get the completion event until several hours
later (the app runs all the time), when we finally had a full page to
read from that CPU's buffer. Thanks,
Josef
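
For reference, a minimal sketch of the select()-plus-nonblocking-read loop the
patch describes might look like the following. This is not trace-cmd's actual
recorder code; the trace_pipe_raw path, the single-CPU handling, and the buffer
size are assumptions for illustration only.

	/*
	 * Sketch: open one per-CPU raw trace pipe nonblocking, sleep in
	 * select() until the kernel reports data, then read whatever is
	 * available rather than blocking until a full page accumulates.
	 */
	#include <stdio.h>
	#include <errno.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/select.h>

	int main(void)
	{
		/* Hypothetical path; trace-cmd opens one pipe per CPU. */
		const char *path =
			"/sys/kernel/debug/tracing/per_cpu/cpu0/trace_pipe_raw";
		char buf[4096];
		int fd = open(path, O_RDONLY | O_NONBLOCK);

		if (fd < 0) {
			perror("open");
			return 1;
		}

		for (;;) {
			fd_set rfds;
			ssize_t n;
			int ret;

			FD_ZERO(&rfds);
			FD_SET(fd, &rfds);

			/* Wait here instead of spinning, so we don't burn CPU. */
			ret = select(fd + 1, &rfds, NULL, NULL, NULL);
			if (ret < 0) {
				if (errno == EINTR)
					continue;
				perror("select");
				break;
			}

			/* Nonblocking read: returns whatever data is available. */
			n = read(fd, buf, sizeof(buf));
			if (n < 0) {
				if (errno == EAGAIN || errno == EINTR)
					continue;
				perror("read");
				break;
			}
			if (n > 0) {
				/* Hand the data to the recorder/consumer here. */
				fprintf(stderr, "read %zd bytes\n", n);
			}
		}

		close(fd);
		return 0;
	}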