[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240308202402.234176464@goodmis.org>
Date: Fri, 08 Mar 2024 15:24:02 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
joel@...lfernandes.org,
linke li <lilinke99@...com>,
Rabin Vincent <rabin@....in>
Subject: [PATCH v2 0/6] tracing/ring-buffer: Fix wakeup of ring buffer waiters
A patch was sent to "fix" the wait_index variable that is used to help with
waking of waiters on the ring buffer. The patch was rejected, but I started
looking at associated code. Discussing it on IRC with Mathieu Desnoyers
we discovered a design flaw.
The waiter reads "wait_index" then enters a "wait loop". After adding
itself to the wait queue, if the buffer is filled or a signal is pending
it will exit out. If wait_index is different, it will also exit out.
(Note, I noticed that the check for wait_index was after the schedule()
call and should have been before it, but that's besides the point).
The race is what happens if the waker updates the wait_index,
a new waiter comes in and reads the updated wait_index before it adds
itself to the queue, it will miss the update!
This is a series of changes to fix the design, and other bugs found
along the way.
1) Remove the wait loop in the ring_buffer_wait() code. It doesn't
have enough context to know if it should continue to wait.
2) Fix "shortest_full" accounting. When zero, it gets set to the
smallest percentage requested of the buffer before the writers need to
start waking up waiters. But after waiters are all woken up,
it never gets reset, so the smallest value will always be the
smallest value until the buffer itself is reset.
3) The wake up of readers on close was incorrectly added to the
.release() callback and not the .flush(). That is, waiters
in the .read() call, will never be woken up because .release()
isn't called until all .read()s have finished. Move the wakeup
to .flush() which is called immediately by close().
4) Add a "waking" field to the trace iterator that gets set when a
waker wants to wake up the readers. For the .flush() callback,
it is set and never cleared to make sure new readers do not block.
5/6) Break up ring_buffer_wait() into three functions:
ring_buffer_prepare_to_wait()
ring_buffer_wait()
ring_buffer_finish_wait()
This allows the caller to add itself to the wait queue, check
if its own condition has been set (in this case: iter->waking)
and then sleep. Follows the same semantics as any other wait logic.
Changes since v1: https://lore.kernel.org/lkml/20240308183816.676883229@goodmis.org/
- My tests triggered a warning about calling a mutex_lock() after a
prepare_to_wait() that changed the task's state. Convert the affected
mutex over to a spinlock.
Steven Rostedt (Google) (6):
ring-buffer: Fix waking up ring buffer readers
ring-buffer: Fix resetting of shortest_full
tracing: Use .flush() call to wake up readers
tracing: Fix waking up tracing readers
ring-buffer: Restructure ring_buffer_wait() to prepare for updates
tracing/ring-buffer: Fix wait_on_pipe() race
----
include/linux/ring_buffer.h | 4 +
include/linux/trace_events.h | 3 +-
kernel/trace/ring_buffer.c | 251 +++++++++++++++++++++++++++----------------
kernel/trace/trace.c | 131 ++++++++++++++++++----
4 files changed, 272 insertions(+), 117 deletions(-)
Powered by blists - more mailing lists