[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20250604065814.a9df1d1944390d4c262852bd@kernel.org>
Date: Wed, 4 Jun 2025 06:58:14 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/2] tracing: ring_buffer: Rewind persistent ring
buffer when reboot
On Tue, 3 Jun 2025 12:00:49 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:
> On Fri, 23 May 2025 00:54:44 +0900
> "Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:
>
> > @@ -6642,7 +6739,6 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
> > cpu_buffer->read_bytes += rb_page_size(reader);
> >
> > /* swap the pages */
> > - rb_init_page(bpage);
> > bpage = reader->page;
> > reader->page = data_page->data;
> > local_set(&reader->write, 0);
>
> Here's the bug we were looking for. We definitely need to keep the
> rb_init_page() here!
Ah, thanks for finding the bug!
>
> That's because this is the } else { part of the if condition that on true
> will do a copy and not a swap of the reader page.
>
> ring_buffer_read_page() has:
>
> if (read || (len < (commit - read)) ||
> cpu_buffer->reader_page == cpu_buffer->commit_page ||
> cpu_buffer->mapped) {
>
> // Copy the buffer to the passed in data_page
>
> } else {
>
> // swap the data_page with the reader page
>
> }
>
> The else part is here, and it's swapping the passed in data_page with the
> current reader_page. We have to initialize the data_page here.
Oops, I confused the "swap" meant reader <-> head, but that should be
done in rb_get_reader_page(), not here.
>
> What we see happening is:
>
> info->spare = ring_buffer_alloc_read_page(); <-- this is the data_page
>
> [..]
>
> ret = ring_buffer_read_page(..., info->spare, ...);
>
> Since this is a normal buffer, we swap the reader_page with info->spare,
> where info->spare now has the reader_page.
>
> It consumes the data via:
>
> trace_data = ring_buffer_read_data(info->spare);
>
> Then reads the buffer again:
>
> ret = ring_buffer_read_page(..., info->spare, ...);
>
> Now it hits the if statement again:
>
> if (read || (len < (commit - read)) ||
> cpu_buffer->reader_page == cpu_buffer->commit_page ||
> cpu_buffer->mapped) {
>
> // Copy the buffer to the passed in data_page
So all mapped buffers (including persistent ring buffer) pass
this block.
>
> } else {
>
> // swap the data_page with the reader page
>
> ---->> here it just swaps the last data_page with the current one.
So the data_page should not have any data. its commit should be 0.
>
> }
>
>
> As we never clear the "commit" part of the page, it still thinks it has
> content on the page as the "read" was set to zero, and it reads the old
> content again.
Got it.
>
> There's no reason not to clear the "commit" of the data_page passed in. It
> is not old data that is about to be lost. We most definitely need to call
> rb_init_page() on it.
Yeah, that data should be considered as cleared.
>
> After adding that back, trace-cmd record works properly again.
>
> Care to send a v4 without removing this rb_init_page(bpage); ?
OK, let me remove that (also remove timestamp check).
Thank you!
>
> -- Steve
--
Masami Hiramatsu (Google) <mhiramat@...nel.org>
Powered by blists - more mailing lists