Message-ID: <alpine.DEB.2.22.394.2309300850320.3082@hadrien>
Date: Sat, 30 Sep 2023 08:51:45 +0200 (CEST)
From: Julia Lawall <julia.lawall@...ia.fr>
To: Steven Rostedt <rostedt@...dmis.org>
cc: LKML <linux-kernel@...r.kernel.org>,
Linux trace kernel <linux-trace-kernel@...r.kernel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH] ring-buffer: Update "shortest_full" in polling
On Fri, 29 Sep 2023, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@...dmis.org>
>
> It was discovered that the ring buffer polling was incorrectly stating
> that read would not block, but that's because polling did not take into
> account that reads will block if the "buffer-percent" was set. Instead,
> the ring buffer polling would say reads would not block if there was any
> data in the ring buffer. This was incorrect behavior from a user space
> point of view. This was fixed by commit 42fb0a1e84ff by having the polling
> code check whether the ring buffer had more data than the user-specified
> "buffer percent" threshold.
>
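
(For anyone following along, the user-visible semantics described above look
roughly like the sketch below. This is not part of the patch; it assumes
tracefs is mounted at /sys/kernel/tracing and trims error handling.)

/*
 * Minimal user-space sketch: with buffer_percent set, poll() on a per-cpu
 * trace_pipe_raw file should only report readable data once the ring
 * buffer reaches that watermark, not on the first event.
 */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int bp = open("/sys/kernel/tracing/buffer_percent", O_WRONLY);

	/* Ask to be woken only when the buffer is at least 50% full. */
	if (bp < 0 || write(bp, "50", 2) != 2)
		return 1;
	close(bp);

	int fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw",
		      O_RDONLY | O_NONBLOCK);
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	/* Blocks until the 50% watermark is reached. */
	poll(&pfd, 1, -1);
	printf("cpu0 ring buffer reached the watermark\n");
	close(fd);
	return 0;
}
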
> The problem now is that the polling code did not register itself to the
> writer that it wanted to wait for a specific "full" value of the ring
> buffer. The result was that the writer would wake the polling waiter
> whenever there was a new event. The polling waiter would then wake up, see
> that there's not enough data in the ring buffer to notify user space and
> then go back to sleep. The next event would wake it up again.
>
> Before the polling fix was added, the code would wake up around 100 times
> for a hackbench 30 benchmark. After the "fix", because the writer now woke
> the polling waiter on every new event, it would wake up over 11,000 times!
> It would never leave the kernel, so the user space behavior was still
> "correct", but this is definitely not the desired effect.
>
> To fix this, have the polling code add what it's waiting for to the
> "shortest_full" variable, to tell the writer not to wake it up if the
> buffer is not as full as it expects to be.
>
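
(For readers of the archive, the effect of the new hunk can be modeled
roughly as follows. This is a simplified, self-contained sketch with made-up
names -- cpu_buffer_model, register_waiter, should_wake -- not the kernel's
actual types or wakeup path: the waiter records the smallest watermark
anyone is waiting for, and the writer skips the wakeup while the buffer is
below it.)

#include <stdbool.h>
#include <stdio.h>

struct cpu_buffer_model {
	int shortest_full;		/* smallest registered watermark, 0 = none */
	bool full_waiters_pending;	/* a poll()/read() waiter is parked */
};

/* Waiter side: what the patch adds to ring_buffer_poll_wait(). */
static void register_waiter(struct cpu_buffer_model *cb, int full)
{
	cb->full_waiters_pending = true;
	if (!cb->shortest_full || cb->shortest_full > full)
		cb->shortest_full = full;
}

/* Writer side: decide whether a new event should wake the waiter. */
static bool should_wake(const struct cpu_buffer_model *cb, int percent_filled)
{
	if (!cb->full_waiters_pending)
		return false;
	if (cb->shortest_full && percent_filled < cb->shortest_full)
		return false;	/* below the watermark: let the waiter sleep */
	return true;
}

int main(void)
{
	struct cpu_buffer_model cb = { 0 };

	register_waiter(&cb, 50);	/* poller wants the buffer 50% full */
	printf("10%% full -> wake? %d\n", should_wake(&cb, 10));	/* 0 */
	printf("60%% full -> wake? %d\n", should_wake(&cb, 60));	/* 1 */
	return 0;
}

Running this prints "wake? 0" at 10% and "wake? 1" at 60%, which is the
behavior the patch restores for poll().
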
> Note, after this fix, it appears that the waiter is now woken up around
> twice as often as it was before (~200 times). This is a tremendous
> improvement over the 11,000 times, but I will need to spend some time to
> see why polling is more aggressive in its wakeups than the read blocking
> code.
Actually, in my test it has gone from 276 wakeups in 6.0 to only 3 with
this patch.
I can do some more tests.
>
> Cc: stable@...r.kernel.org
> Fixes: 42fb0a1e84ff ("tracing/ring-buffer: Have polling block on watermark")
> Reported-by: Julia Lawall <julia.lawall@...ia.fr>
> Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
Tested-by: Julia Lawall <julia.lawall@...ia.fr>
julia
> ---
> kernel/trace/ring_buffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 28daf0ce95c5..515cafdb18d9 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1137,6 +1137,9 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
> if (full) {
> poll_wait(filp, &work->full_waiters, poll_table);
> work->full_waiters_pending = true;
> + if (!cpu_buffer->shortest_full ||
> + cpu_buffer->shortest_full > full)
> + cpu_buffer->shortest_full = full;
> } else {
> poll_wait(filp, &work->waiters, poll_table);
> work->waiters_pending = true;
> --
> 2.40.1
>
>