Message-ID: <84179edb-dba1-42fd-88a9-70d05d8915ec@suse.com>
Date: Fri, 11 Oct 2024 17:20:12 +0200
From: Petr Pavlu <petr.pavlu@...e.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Linux Trace Kernel <linux-trace-kernel@...r.kernel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [PATCH] ring-buffer: Have the buffer update counter be atomic
On 10/11/24 16:48, Steven Rostedt wrote:
> On Fri, 11 Oct 2024 10:01:32 -0400
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
>>> Sorry for not replying to your last comment on my patch; I was ill.
>>>
>>> The member ring_buffer_per_cpu.cnt is intended to be accessed under the
>>> reader_lock, same as the pages pointer which it is tied to, so this
>>> change shouldn't be strictly needed.
>>>
>>
>> Right, but there was one location where the cnt was updated outside the
>> lock. The one I commented on. But instead of adding a lock around it, I
>> decided to just make it an atomic. Then there would be no need for the
>> lock. Hmm, it still needs a memory barrier though. At least a
>> smp_mb__after_atomic().
>
> Hmm, I don't like how the update in ring_buffer_subbuf_order_set() is
> unprotected. I think this is the proper patch:
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 696d422d5b35..0672df07b599 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -6774,6 +6774,7 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
> }
>
> for_each_buffer_cpu(buffer, cpu) {
> + unsigned long flags;
>
> if (!cpumask_test_cpu(cpu, buffer->cpumask))
> continue;
> @@ -6800,11 +6801,15 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
> struct buffer_page, list);
> list_del_init(&cpu_buffer->reader_page->list);
>
> + /* Synchronize with rb_check_pages() */
> + raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
> +
> /* The cpu_buffer pages are a link list with no head */
> cpu_buffer->pages = cpu_buffer->new_pages.next;
> cpu_buffer->new_pages.next->prev = cpu_buffer->new_pages.prev;
> cpu_buffer->new_pages.prev->next = cpu_buffer->new_pages.next;
> cpu_buffer->cnt++;
> + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
>
> /* Clear the new_pages list */
> INIT_LIST_HEAD(&cpu_buffer->new_pages);
>
> Even though it's also protected by the buffer->mutex, I feel more
> comfortable with this.
Sorry, I missed the context of your comment and that it is about
ring_buffer_subbuf_order_set().

I agree, I also noticed the missing locking in this function and it
looked to me like something that should be fixed. I happen to have
a somewhat more complex patch for it from a few months ago (pasted
below, after a short sketch of the list trick it relies on). I think
I didn't send it to the list because I then noticed other potential
locking problems with the subbuf code, which I wanted to examine more
closely first.
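
To make the patch below easier to review, here is the core idea it
relies on: cpu_buffer->pages is a circular list with no dedicated head
node, and list_add()-ing a local head into that circle turns it into an
ordinary list whose entries can be walked and freed later, outside the
reader_lock. A minimal userspace sketch of just that trick (simplified
list helpers and illustrative names, not the kernel implementation):

#include <stdio.h>
#include <stdlib.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Insert @new right after @head, like the kernel's list_add(). */
static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* 'list' is the first member, so a cast stands in for container_of(). */
struct node {
	struct list_head list;
	int id;
};

int main(void)
{
	struct node *n[3];
	struct list_head old_pages, *pages, *pos, *tmp;
	int i;

	/* Build a headless circular list, like cpu_buffer->pages. */
	for (i = 0; i < 3; i++) {
		n[i] = malloc(sizeof(*n[i]));
		n[i]->id = i;
	}
	for (i = 0; i < 3; i++) {
		n[i]->list.next = &n[(i + 1) % 3]->list;
		n[i]->list.prev = &n[(i + 2) % 3]->list;
	}

	/* "pages" points at an arbitrary entry, not at a dedicated head. */
	pages = &n[0]->list;

	/* Splice in a local head; the circle is now an ordinary list. */
	list_add(&old_pages, pages);

	/* Walk and free every entry exactly once, outside any lock. */
	for (pos = old_pages.next; pos != &old_pages; pos = tmp) {
		tmp = pos->next;
		printf("freeing node %d\n", ((struct node *)pos)->id);
		free(pos);
	}
	return 0;
}

Compiled with plain gcc, this frees the three nodes exactly once; the
patch does the same with old_pages, additionally splicing the old
reader_page onto it before walking the list.
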
--
Petr
From 359f5aa523bc27ca8600b4efae471eefc0514ce0 Mon Sep 17 00:00:00 2001
From: Petr Pavlu <petr.pavlu@...e.com>
Date: Tue, 16 Jul 2024 11:35:00 +0200
Subject: [PATCH] ring-buffer: Fix reader locking when changing the sub buffer
order

The function ring_buffer_subbuf_order_set() updates each
ring_buffer_per_cpu and installs new sub buffers that match the requested
page order. This operation may be invoked concurrently with readers that
rely on some of the modified data, such as the head bit (RB_PAGE_HEAD), or
the ring_buffer_per_cpu.pages and reader_page pointers. However, no
exclusive access is acquired by ring_buffer_subbuf_order_set(). Modifying
the mentioned data while a reader also operates on them can then result in
incorrect memory access and various crashes.

Fix the problem by taking the reader_lock when updating a specific
ring_buffer_per_cpu in ring_buffer_subbuf_order_set().

Fixes: 8e7b58c27b3c ("ring-buffer: Just update the subbuffers when changing their allocation order")
Signed-off-by: Petr Pavlu <petr.pavlu@...e.com>
---
kernel/trace/ring_buffer.c | 41 ++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 4c24191fa47d..d324f4f9ab9d 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -6773,36 +6773,38 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
}

for_each_buffer_cpu(buffer, cpu) {
+ struct buffer_data_page *old_free_data_page;
+ struct list_head old_pages;
+ unsigned long flags;

if (!cpumask_test_cpu(cpu, buffer->cpumask))
continue;

cpu_buffer = buffer->buffers[cpu];

+ raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+
/* Clear the head bit to make the link list normal to read */
rb_head_page_deactivate(cpu_buffer);

- /* Now walk the list and free all the old sub buffers */
- list_for_each_entry_safe(bpage, tmp, cpu_buffer->pages, list) {
- list_del_init(&bpage->list);
- free_buffer_page(bpage);
- }
- /* The above loop stopped an the last page needing to be freed */
- bpage = list_entry(cpu_buffer->pages, struct buffer_page, list);
- free_buffer_page(bpage);
-
- /* Free the current reader page */
- free_buffer_page(cpu_buffer->reader_page);
+ /*
+ * Collect buffers from the cpu_buffer pages list and the
+ * reader_page on old_pages, so they can be freed later when not
+ * under a spinlock. The pages list is a linked list with no
+ * head, adding old_pages turns it into a regular list with
+ * old_pages being the head.
+ */
+ list_add(&old_pages, cpu_buffer->pages);
+ list_add(&cpu_buffer->reader_page->list, &old_pages);

/* One page was allocated for the reader page */
cpu_buffer->reader_page = list_entry(cpu_buffer->new_pages.next,
struct buffer_page, list);
list_del_init(&cpu_buffer->reader_page->list);

- /* The cpu_buffer pages are a link list with no head */
+ /* Install the new pages, remove the head from the list */
cpu_buffer->pages = cpu_buffer->new_pages.next;
- cpu_buffer->new_pages.next->prev = cpu_buffer->new_pages.prev;
- cpu_buffer->new_pages.prev->next = cpu_buffer->new_pages.next;
+ list_del(&cpu_buffer->new_pages);
cpu_buffer->cnt++;

/* Clear the new_pages list */
@@ -6815,11 +6817,20 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
cpu_buffer->nr_pages = cpu_buffer->nr_pages_to_update;
cpu_buffer->nr_pages_to_update = 0;

- free_pages((unsigned long)cpu_buffer->free_page, old_order);
+ old_free_data_page = cpu_buffer->free_page;
cpu_buffer->free_page = NULL;

rb_head_page_activate(cpu_buffer);

+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+ /* Free old sub buffers */
+ list_for_each_entry_safe(bpage, tmp, &old_pages, list) {
+ list_del_init(&bpage->list);
+ free_buffer_page(bpage);
+ }
+ free_pages((unsigned long)old_free_data_page, old_order);
+
rb_check_pages(cpu_buffer);
}