Message-ID: <db2123c9-4777-4cc5-a00f-3df78edf5cb7@efficios.com>
Date: Mon, 31 Mar 2025 22:23:00 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>, Jann Horn <jannh@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Masami Hiramatsu <mhiramat@...nel.org>, Mark Rutland <mark.rutland@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Vincent Donnefort <vdonnefort@...gle.com>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>, Kees Cook <kees@...nel.org>,
Tony Luck <tony.luck@...el.com>, "Guilherme G. Piccoli"
<gpiccoli@...lia.com>, linux-hardening@...r.kernel.org,
Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v2 1/2] tracing: ring-buffer: Have the ring buffer code do
the vmap of physical memory
On 2025-03-31 21:50, Steven Rostedt wrote:
> On Tue, 1 Apr 2025 03:28:20 +0200
> Jann Horn <jannh@...gle.com> wrote:
>
>> I think you probably need flushes on both sides, since you might have
>> to first flush out the dirty cacheline you wrote through the kernel
>> mapping, then discard the stale clean cacheline for the user mapping,
>> or something like that? (Unless these VIVT cache architectures provide
>> stronger guarantees on cache state than I thought.) But when you're
>> adding data to the tracing buffers, I guess maybe you only want to
>> flush the kernel mapping from the kernel, and leave flushing of the
>> user mapping to userspace? I think if you're running in some random
>> kernel context, you probably can't even reliably flush the right
>> userspace context - see how for example vivt_flush_cache_range() does
>> nothing if the MM being flushed is not running on the current CPU.
>
> I'm assuming I need to flush both the kernel (get the updates out to
> memory) and user space (so it can read those updates).
>
> The paths are all done via system calls from user space, so it should be on
> the same CPU. User space will do an ioctl() on the buffer file descriptor
> asking for an update, the kernel will populate the page with that update,
> and then user space will read the update after the ioctl() returns. All
> very synchronous. Thus, we don't need to worry about updates from one CPU
> happening on another CPU.
>
> Even when it wants to read the buffer. The ioctl() will swap out the old
> reader page with one of the write pages making it the new "reader" page,
> where no more updates will happen on that page. The flush happens after
> that and before going back to user space.
FWIW, I have the following in the LTTng kernel tracer to cover this.
LTTng writes to ring buffers through the linear mapping, and reads
the buffers from userspace either through mmap or splice.
When userspace wants to get read access to a sub-buffer (an abstraction
that generalizes your ftrace ring buffer "pages") through mmap,
it does the following through a "get subbuffer" ioctl:
- Use "cpu_dcache_is_aliasing()" to check whether explicit flushing is
needed between the kernel linear mapping and userspace mappings.
- Use flush_dcache_page() to make sure all mappings for a given page
are flushed (both the kernel linear mapping and the userspace virtual
mappings).
I suspect that if you go down the route of the explicit
"flush_cache_range()", then you'll need to issue it on all
mappings that alias your memory.
AFAIU, using flush_dcache_page() saves you the trouble of issuing
flush_cache_range() on all mapping aliases manually.
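To make the pattern concrete, here is a minimal kernel-side sketch of the
"get subbuffer" flush step described above. This is not actual LTTng code;
subbuf_flush_for_read() and the page-array parameters are hypothetical names,
but cpu_dcache_is_aliasing() and flush_dcache_page() are the real kernel APIs:

```c
#include <linux/cacheflush.h>	/* cpu_dcache_is_aliasing(), flush_dcache_page() */
#include <linux/mm.h>		/* struct page */

/*
 * Hypothetical helper, called from a "get subbuffer" ioctl path after
 * the sub-buffer has been handed off to the reader and before returning
 * to userspace.
 */
static void subbuf_flush_for_read(struct page **pages, int nr_pages)
{
	int i;

	/*
	 * On non-aliasing D-caches (e.g. PIPT), all mappings of a page
	 * hit the same cache lines, so no explicit flush is needed.
	 */
	if (!cpu_dcache_is_aliasing())
		return;

	/*
	 * flush_dcache_page() deals with every alias of the page at
	 * once: the kernel linear mapping plus any userspace virtual
	 * mappings, so there is no need to walk each user VMA and call
	 * flush_cache_range() on it manually.
	 */
	for (i = 0; i < nr_pages; i++)
		flush_dcache_page(pages[i]);
}
```

Since this runs in the ioctl path, it executes in the context of the task
that owns the userspace mapping, which sidesteps the cross-CPU flushing
concern Jann raised about vivt_flush_cache_range().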
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com