[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250603120049.692d48b4@gandalf.local.home>
Date: Tue, 3 Jun 2025 12:00:49 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/2] tracing: ring_buffer: Rewind persistent ring
buffer when reboot
On Fri, 23 May 2025 00:54:44 +0900
"Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:
> @@ -6642,7 +6739,6 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
> cpu_buffer->read_bytes += rb_page_size(reader);
>
> /* swap the pages */
> - rb_init_page(bpage);
> bpage = reader->page;
> reader->page = data_page->data;
> local_set(&reader->write, 0);
Here's the bug we were looking for. We definitely need to keep the
rb_init_page() here!
That's because this is the } else { part of the if condition that on true
will do a copy and not a swap of the reader page.
ring_buffer_read_page() has:
if (read || (len < (commit - read)) ||
cpu_buffer->reader_page == cpu_buffer->commit_page ||
cpu_buffer->mapped) {
// Copy the buffer to the passed in data_page
} else {
// swap the data_page with the reader page
}
The else part is here, and it's swapping the passed in data_page with the
current reader_page. We have to initialize the data_page here.
What we see happening is:
info->spare = ring_buffer_alloc_read_page(); <-- this is the data_page
[..]
ret = ring_buffer_read_page(..., info->spare, ...);
Since this is a normal buffer, we swap the reader_page with info->spare,
where info->spare now has the reader_page.
It consumes the data via:
trace_data = ring_buffer_read_data(info->spare);
Then reads the buffer again:
ret = ring_buffer_read_page(..., info->spare, ...);
Now it hits the if statement again:
if (read || (len < (commit - read)) ||
cpu_buffer->reader_page == cpu_buffer->commit_page ||
cpu_buffer->mapped) {
// Copy the buffer to the passed in data_page
} else {
// swap the data_page with the reader page
---->> here it just swaps the last data_page with the current one.
}
As we never clear the "commit" part of the page, it still thinks it has
content on the page as the "read" was set to zero, and it reads the old
content again.
There's no reason not to clear the "commit" of the data_page passed in. It
is not old data that is about to be lost. We most definitely need to call
rb_init_page() on it.
After adding that back, trace-cmd record works properly again.
Care to send a v4 without removing this rb_init_page(bpage); ?
-- Steve
Powered by blists - more mailing lists