Message-ID: <20250327220106.37c921ea@batman.local.home>
Date: Thu, 27 Mar 2025 22:01:06 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Masami Hiramatsu
 <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Feng Yang <yangfeng@...inos.cn>, Jiapeng Chong
 <jiapeng.chong@...ux.alibaba.com>
Subject: Re: [GIT PULL] ring-buffer: Updates for v6.15

On Thu, 27 Mar 2025 18:31:55 -0700
Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Thu, 27 Mar 2025 at 18:24, Steven Rostedt <rostedt@...dmis.org> wrote:
> >
> > The pages are never vmalloc'd, it's only ever vmap()'d on top of
> > contiguous physical memory or allocated via alloc_page() in the order
> > given. Thus, we do not support non-consecutive physical memory.
> 
> Christ, that just makes it EVEN WORSE.

Let me explain this better. Yes, it uses alloc_page(), but that's just
an intermediate step. What is saved is page_address(page), as that is
what the code uses. That is, the allocation uses alloc_page(), but the
result is immediately converted to page_address() for use. The struct
page itself is just a temporary variable to get contiguous memory, as
the sub-buffers must be contiguous.
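
To illustrate (just a rough sketch, the function name and gfp flags
here are made up, not the actual ring buffer code):

static void *rb_alloc_subbuf_sketch(unsigned int order)
{
	struct page *page;

	/* alloc_pages() is only the intermediate step. */
	page = alloc_pages(GFP_KERNEL, order);
	if (!page)
		return NULL;

	/*
	 * What gets saved and used everywhere is page_address(page);
	 * the struct page is just a temporary to get contiguous memory.
	 */
	return page_address(page);
}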

> 
> Just keep track of the actual original physical allocation, then!
> 
> By all means vmap it too for whoever wants the virtual allocation, but
> remember the *real* allocation, and keep it as a 'struct page'
> together with the order that you already have.

The virtual address needs to be created because that's all the code
cares about. For the case where physical memory is passed in, that
requires a vmap(). The struct page is only needed when the ring buffer
is going to be mmapped to user space.
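
Something like this for the passed-in physical memory case (again just
a sketch; phys_start and nr_pages are placeholders):

static void *rb_vmap_phys_sketch(unsigned long phys_start, unsigned int nr_pages)
{
	struct page **pages;
	void *vaddr;
	unsigned int i;

	pages = kcalloc(nr_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return NULL;

	/* The physical range is contiguous, so just walk its pfns. */
	for (i = 0; i < nr_pages; i++)
		pages[i] = pfn_to_page((phys_start >> PAGE_SHIFT) + i);

	/* The rest of the code only ever sees this virtual address. */
	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
	kfree(pages);

	return vaddr;
}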

> 
> And then you never use vmalloc_to_page() - or even virt_to_page() - at
> all, because you actually know your base allocation, and keep it in
> that form that so much of this code wants in the first place.
> 
> Having a nice reliable 'struct page *' (together with that size
> order) and keeping it in that form would be *so* much cleaner.
> 
> Instead of randomly translating it to (two different!) kinds of
> virtual kernel addresses and then translating it back when you wanted
> the original proper format.

By moving the physical-memory mapping code into the ring buffer code, I
can set a flag on the buffer to mark it as being mapped directly, and
we can save the struct pages for this location (easily found by the
address). Then getting the page could be a simple helper function:

struct page *rb_get_page(struct trace_buffer *buffer, unsigned long addr)
{
	if (buffer->flags & RB_FL_PHYSICAL) {
		/* vmap()'d over contiguous physical memory: convert the
		 * vmap address back to physical and return its page. */
		addr -= buffer->vmap_start;
		addr += buffer->phys_start;
		return pfn_to_page(addr >> PAGE_SHIFT);
	}
	/* Otherwise the address came from page_address(), which is in
	 * the kernel direct map, so virt_to_page() just works. */
	return virt_to_page(addr);
}


