Message-ID: <a16b5720-ad2d-44af-a231-e7aa172ccf0e@suse.com>
Date: Sun, 23 Jun 2024 16:54:56 +0200
From: Petr Pavlu <petr.pavlu@...e.com>
To: Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] ring-buffer: Limit time with disabled interrupts in
rb_check_pages()
On 6/21/24 17:09, Petr Pavlu wrote:
> Function rb_check_pages() validates the integrity of a specified per-CPU
> tracing ring buffer. It does so by walking the underlying linked
> list and checking its next and prev links.
>
> To guarantee that the list doesn't get modified during the check,
> a caller typically needs to take cpu_buffer->reader_lock. This prevents
> the check from running concurrently with a potential reader which can
> make the list temporarily inconsistent when swapping its old reader page
> into the buffer.
>
> A problem is that the length of time interrupts are disabled is
> non-deterministic, since it depends on the ring buffer size. This
> particularly affects PREEMPT_RT because the reader_lock is a raw
> spinlock, which, unlike a normal spinlock, doesn't become sleepable
> on PREEMPT_RT kernels.
>
> Modify the check so it still tries to walk the whole list but gives up
> the reader_lock between checking individual pages.
Sorry, I'm afraid this patch is actually buggy. It considers only the
case where rb_check_pages() is called from ring_buffer_resize() and the
ring-buffer list is potentially modified by a reader swapping its page
into the buffer via rb_get_reader_page(). However, I suspect the
opposite can happen as well: rb_check_pages() invoked by a completing
reader from ring_buffer_read_finish() while rb_remove_pages()/
rb_insert_pages() concurrently modify the ring-buffer pages. This case
needs to be handled too.
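A minimal single-threaded sketch of that second race, with the
interleaving written out sequentially (hypothetical names, not the
kernel code): the checker pauses with the lock dropped while holding a
saved pointer, a concurrent resize unlinks that very page, and the
resumed walk then reports a broken link even though the list itself is
perfectly consistent:

```c
#include <stdlib.h>

struct bpage {
	struct bpage *next;
	struct bpage *prev;
};

/* Build a circular list head <-> a <-> b. */
static struct bpage *make3(void)
{
	struct bpage *h = calloc(1, sizeof(*h));
	struct bpage *a = calloc(1, sizeof(*a));
	struct bpage *b = calloc(1, sizeof(*b));

	h->next = a; a->next = b; b->next = h;
	h->prev = b; b->prev = a; a->prev = h;
	return h;
}

/* rb_remove_pages()-style unlink: the neighbors are relinked, but the
 * removed page keeps its now-stale next/prev pointers. */
static void unlink_page(struct bpage *p)
{
	p->prev->next = p->next;
	p->next->prev = p->prev;
}
```

Resuming the walk at the saved (now unlinked) page makes the
`p->next->prev == p` test fail spuriously, so the check would splat on a
healthy buffer.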
I'm at a conference next week, I'll have a closer look afterwards.
--
Petr