Message-ID: <alpine.DEB.2.22.394.2309300850320.3082@hadrien>
Date: Sat, 30 Sep 2023 08:51:45 +0200 (CEST)
From: Julia Lawall <julia.lawall@...ia.fr>
To: Steven Rostedt <rostedt@...dmis.org>
cc: LKML <linux-kernel@...r.kernel.org>,
Linux trace kernel <linux-trace-kernel@...r.kernel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH] ring-buffer: Update "shortest_full" in polling
On Fri, 29 Sep 2023, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@...dmis.org>
>
> It was discovered that the ring buffer polling was incorrectly stating
> that read would not block, but that's because polling did not take into
> account that reads will block if the "buffer-percent" was set. Instead,
> the ring buffer polling would say reads would not block if there was any
> data in the ring buffer. This was incorrect behavior from a user space
> point of view. This was fixed by commit 42fb0a1e84ff by having the polling
> code check whether the ring buffer had more data than the user-specified
> "buffer percent" threshold.
>
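
(For anyone following along, the user-visible semantics described above look
roughly like the sketch below. This is not part of the patch; it assumes
tracefs is mounted at /sys/kernel/tracing and trims error handling.)

/*
 * Minimal user-space sketch: with buffer_percent set, poll() on a per-cpu
 * trace_pipe_raw file should only report readable data once the ring
 * buffer reaches that watermark, not on the first event.
 */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int bp = open("/sys/kernel/tracing/buffer_percent", O_WRONLY);

	/* Ask to be woken only when the buffer is at least 50% full. */
	if (bp < 0 || write(bp, "50", 2) != 2)
		return 1;
	close(bp);

	int fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw",
		      O_RDONLY | O_NONBLOCK);
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	/* Blocks until the 50% watermark is reached. */
	poll(&pfd, 1, -1);
	printf("cpu0 ring buffer reached the watermark\n");
	close(fd);
	return 0;
}
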
> The problem now is that the polling code did not register itself to the
> writer that it wanted to wait for a specific "full" value of the ring
> buffer. The result was that the writer would wake the polling waiter
> whenever there was a new event. The polling waiter would then wake up, see
> that there's not enough data in the ring buffer to notify user space and
> then go back to sleep. The next event would wake it up again.
>
> Before the polling fix was added, the code would wake up around 100 times
> for a hackbench 30 benchmark. After the "fix", because the writer now woke
> the polling waiter on every new event, it would wake up over 11,000 times!
> It would never leave the kernel, so the user space behavior was still
> "correct", but this is definitely not the desired effect.
>
> To fix this, have the polling code add what it's waiting for to the
> "shortest_full" variable, to tell the writer not to wake it up if the
> buffer is not as full as it expects to be.
>
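
(For readers of the archive, the effect of the new hunk can be modeled
roughly as follows. This is a simplified, self-contained sketch with made-up
names -- cpu_buffer_model, register_waiter, should_wake -- not the kernel's
actual types or wakeup path: the waiter records the smallest watermark
anyone is waiting for, and the writer skips the wakeup while the buffer is
below it.)

#include <stdbool.h>
#include <stdio.h>

struct cpu_buffer_model {
	int shortest_full;		/* smallest registered watermark, 0 = none */
	bool full_waiters_pending;	/* a poll()/read() waiter is parked */
};

/* Waiter side: what the patch adds to ring_buffer_poll_wait(). */
static void register_waiter(struct cpu_buffer_model *cb, int full)
{
	cb->full_waiters_pending = true;
	if (!cb->shortest_full || cb->shortest_full > full)
		cb->shortest_full = full;
}

/* Writer side: decide whether a new event should wake the waiter. */
static bool should_wake(const struct cpu_buffer_model *cb, int percent_filled)
{
	if (!cb->full_waiters_pending)
		return false;
	if (cb->shortest_full && percent_filled < cb->shortest_full)
		return false;	/* below the watermark: let the waiter sleep */
	return true;
}

int main(void)
{
	struct cpu_buffer_model cb = { 0 };

	register_waiter(&cb, 50);	/* poller wants the buffer 50% full */
	printf("10%% full -> wake? %d\n", should_wake(&cb, 10));	/* 0 */
	printf("60%% full -> wake? %d\n", should_wake(&cb, 60));	/* 1 */
	return 0;
}

Running this prints "wake? 0" at 10% and "wake? 1" at 60%, which is the
behavior the patch restores for poll().
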
> Note, after this fix, it appears that the waiter is now woken up around
> twice as often as it was before (~200 times). This is a tremendous
> improvement over the 11,000 times, but I will need to spend some time to
> see why polling is more aggressive in its wakeups than the read blocking
> code.
Actually, in my test it has gone from 276 wakeups in 6.0 to only 3 with
this patch.
I can do some more tests.
>
> Cc: stable@...r.kernel.org
> Fixes: 42fb0a1e84ff ("tracing/ring-buffer: Have polling block on watermark")
> Reported-by: Julia Lawall <julia.lawall@...ia.fr>
> Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
Tested-by: Julia Lawall <julia.lawall@...ia.fr>
julia
> ---
> kernel/trace/ring_buffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 28daf0ce95c5..515cafdb18d9 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1137,6 +1137,9 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
> if (full) {
> poll_wait(filp, &work->full_waiters, poll_table);
> work->full_waiters_pending = true;
> + if (!cpu_buffer->shortest_full ||
> + cpu_buffer->shortest_full > full)
> + cpu_buffer->shortest_full = full;
> } else {
> poll_wait(filp, &work->waiters, poll_table);
> work->waiters_pending = true;
> --
> 2.40.1
>
>