Message-ID: <ZCVk26InuXhy+Lmg@google.com>
Date:   Thu, 30 Mar 2023 11:30:51 +0100
From:   Vincent Donnefort <vdonnefort@...gle.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     mhiramat@...nel.org, linux-kernel@...r.kernel.org,
        linux-trace-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 1/2] ring-buffer: Introducing ring-buffer mapping
 functions

On Wed, Mar 29, 2023 at 11:32:34AM -0400, Steven Rostedt wrote:
> On Wed, 29 Mar 2023 14:55:41 +0100
> Vincent Donnefort <vdonnefort@...gle.com> wrote:
> 
> > > Yes, in fact it shouldn't need to call the ioctl until after it read it.
> > > 
> > > Maybe, we should have the ioctl take a parameter of how much was read?
> > > To prevent races?  
> > 
> > Races would only be with other consuming readers. In that case we'd probably
> > have many other problems anyway, as I suppose nothing would prevent another one
> > from swapping the page while our userspace reader is still processing it?
> 
> I'm not worried about user space readers. I'm worried about writers, as
> the ioctl will set reader_page->read = reader_page->commit. Between the time
> the reader last read and stopped and the time it calls the ioctl, a writer
> could fill the page, and the ioctl may then even swap the page. By passing in
> the read amount, the ioctl will know if it needs to keep the same page or
> not.

How about?

userspace:

  prev_read = meta->read;
  ioctl(TRACE_MMAP_IOCTL_GET_READER_PAGE);

kernel:
    ring_buffer_get_reader_page()
      reader = rb_get_reader_page(cpu_buffer);
      cpu_buffer->reader_page->read = rb_page_size(reader);
      meta->read = cpu_buffer->reader_page->read;

userspace:
   /* if new page prev_read = 0 */
   /* read between prev_read and meta->read */

If the writer does anything in-between, wouldn't rb_get_reader_page() handle it
nicely by returning the same reader page, since there would be more to read on it?

It is similar to rb_advance_reader() except we'd be moving several events at
once?
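
To make the userspace side of that flow concrete, here is a minimal sketch of
how the reader could work out which bytes are new after the ioctl returns. It
is only a model: struct meta_model, its reader_page_id field and unread_span()
are hypothetical names for illustration, and using a page id to detect "new
page" is an assumption, not something the patch defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the meta-page fields the reader looks at. */
struct meta_model {
	size_t reader_page_id;	/* assumed way to tell a new reader page */
	size_t read;		/* reader_page->read as published by the kernel */
};

/* Byte range [start, end) of the reader page that has not been consumed. */
struct span {
	size_t start;
	size_t end;
};

/*
 * prev_id and prev_read are the values userspace sampled before the ioctl.
 * If the reader page changed, nothing from the new page was consumed yet,
 * so the unread range starts at 0; otherwise it starts at prev_read.
 */
static struct span unread_span(size_t prev_id, size_t prev_read,
			       const struct meta_model *meta)
{
	struct span s;

	s.start = (meta->reader_page_id != prev_id) ? 0 : prev_read;
	s.end = meta->read;
	assert(s.start <= s.end);	/* read never moves backwards on a page */
	return s;
}
```

With the same page and prev_read = 40, meta->read = 120, the reader would
process bytes 40..119. With a new page, it would start again from 0.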

> 
> > 
> > I don't know if it is worth splitting the ABI between the meta-page and the
> > ioctl parameters for this?
> > 
> > Or maybe we should say the meta-page contains things modified by the writer and
> > parameters modified by the reader are passed by the get_reader_page ioctl i.e.
> > the reader page ID and cpu_buffer->reader_page->read? (for the hyp tracing, we
> > have up to 4 registers for the HVC which would replace in our case the ioctl)
> 
> I don't think we need the reader_page id, as that should never move without
> reader involvement. If there's more than one reader, that's up to the
> readers to keep track of each other, not the kernel.
> 
> BTW, the more I look at doing this without ioctls, the more I think we may
> need to update things slightly differently.
> 
> I would keep the current approach, but for clarification of terminology, we
> have:
> 
> meta_data - the data that holds information that is shared between user and
> 	kernel space.
> 
> data_pages - this is a separate mapping that holds the mapped ring buffer
> 	pages. In user space, this is one contiguous array and also holds
> 	the reader page.
> 
> data_index - This is an array of what the writer sees. It maps the index
> 	into data_pages[] of where to find the mapped pages. It does not
> 	contain the reader page. We currently map this with the meta_data,
> 	but that's not a requirement (although we may continue to do so).
> 
> I'm thinking that we make the data_index[] elements into a structure:
> 
> struct trace_map_data_index {
> 	int		idx;	/* index into data_pages[] */
> 	int		cnt;	/* counter updated by writer */
> };
> 
> The cnt is initialized to zero when initially mapped.
> 
> Instead of having the bpage->id = index into data_pages[], have it equal
> the index into data_index[].
> 
> The cpu_buffer->reader_page->id = -1;
> 
> meta_data->reader_page = index into data_pages[] of reader page
> 
> The swapping of the head page would look something like this:
> 
> static inline void
> rb_meta_page_head_swap(struct ring_buffer_per_cpu *cpu_buffer)
> {
> 	struct ring_buffer_meta_page *meta = cpu_buffer->meta_page;
> 	int head_page;
> 
> 	if (!READ_ONCE(cpu_buffer->mapped))
> 		return;
> 
> 	head_page = meta->data_pages[meta->hdr.data_page_head];
> 	meta->data_pages[meta->hdr.data_page_head] = meta->hdr.reader_page;
> 	meta->hdr.reader_page = head_page;
> 	meta->data_pages[head_page]->id = -1;
> }
> 
> As hdr.data_page_head would be an index into data_index[] and not
> data_pages[].
> 
> The fact that bpage->id points to the data_index[] and not the data_pages[]
> means that the writer can easily get to that index, and modify the count.
> That way, in rb_tail_page_update() (between cmpxchgs) we can do something
> like:
> 
> 	if (cpu_buffer->mapped) {
> 		meta = cpu_buffer->meta_page;
> 		meta->data_index[next_page->id].cnt++;
> 	}
> 
> And this will allow the reader to know if the current page it is on just
> got overwritten by the writer, by doing:
> 
> 	prev_id = meta->data_index[this_page].cnt;
> 	smp_rmb();
> 	read event (copy it, whatever)
> 	smp_rmb();
> 	if (prev_id != meta->data_index[this_page].cnt)
> 		/* read data may be corrupted, abort it */

Couldn't the reader just check for the page commit field? rb_iter_head_event()
does something like this to check if the writer is on its page.
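
For reference, the counter check quoted above can be sketched in userspace C
with C11 atomics standing in for smp_rmb(). This is an illustration only:
struct data_index_model and the helper names are hypothetical, and the plain
memcpy() of page data is a simplification that a real reader would need to
treat more carefully.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace view of one data_index[] element. */
struct data_index_model {
	int		idx;	/* index into data_pages[] */
	atomic_int	cnt;	/* counter bumped by the writer */
};

/* Sample the counter before touching the page contents. */
static int page_read_begin(struct data_index_model *di)
{
	return atomic_load_explicit(&di->cnt, memory_order_acquire);
}

/* Returns true if the writer moved onto the page since page_read_begin(). */
static bool page_read_retry(struct data_index_model *di, int start)
{
	/* Order the preceding data reads before the counter re-read. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&di->cnt, memory_order_relaxed) != start;
}

/* Stand-in for the cnt++ done by the writer in rb_tail_page_update(). */
static void writer_enters_page(struct data_index_model *di)
{
	atomic_fetch_add_explicit(&di->cnt, 1, memory_order_release);
}

/* Copy an event and report whether the copy can be trusted. */
static bool copy_event_checked(struct data_index_model *di,
			       const unsigned char *src,
			       unsigned char *dst, size_t len)
{
	int start = page_read_begin(di);

	memcpy(dst, src, len);
	return !page_read_retry(di, start);
}
```

Note this only flags a writer that moved onto the page between the two counter
loads. A writer that was already on the page before the first load is not
detected by the counter alone.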

> 
> 
> Does this make sense?
> 
> -- Steve
