lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 29 Mar 2023 11:32:34 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Vincent Donnefort <vdonnefort@...gle.com>
Cc:     mhiramat@...nel.org, linux-kernel@...r.kernel.org,
        linux-trace-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 1/2] ring-buffer: Introducing ring-buffer mapping
 functions

On Wed, 29 Mar 2023 14:55:41 +0100
Vincent Donnefort <vdonnefort@...gle.com> wrote:

> > Yes, in fact it shouldn't need to call the ioctl until after it read it.
> > 
> > Maybe, we should have the ioctl take a parameter of how much was read?
> > To prevent races?  
> 
> Races would only be with other consuming readers. In that case we'd probably
> have many other problems anyway as I suppose nothing would prevent another one
> of swapping the page while our userspace reader is still processing it?

I'm not worried about user space readers. I'm worried about writers, as
the ioctl will update the reader_page->read = reader_page->commit. The time
that the reader last read and stopped and then called the ioctl, a writer
could fill the page, then the ioctl may even swap the page. By passing in
the read amount, the ioctl will know if it needs to keep the same page or
not.

> 
> I don't know if this is worth splitting the ABI between the meta-page and the
> ioctl parameters for this?
> 
> Or maybe we should say the meta-page contains things modified by the writer and
> parameters modified by the reader are passed by the get_reader_page ioctl i.e.
> the reader page ID and cpu_buffer->reader_page->read? (for the hyp tracing, we
> have up to 4 registers for the HVC which would replace in our case the ioctl)

I don't think we need the reader_page id, as that should never move without
reader involvement. If there's more than one reader, that's up to the
readers to keep track of each other, not the kernel.

Which BTW, the more I look at doing this without ioctls, I think we may
need to update things slightly different.

I would keep the current approach, but for clarification of terminology, we
have:

meta_data - the data that holds information that is shared between user and
	kernel space.

data_pages - this is a separate mapping that holds the mapped ring buffer
	pages. In user space, this is one contiguous array and also holds
	the reader page.

data_index - This is an array of what the writer sees. It maps the index
	into data_pages[] of where to find the mapped pages. It does not
	contain the reader page. We currently map this with the meta_data,
	but that's not a requirement (although we may continue to do so).

I'm thinking that we make the data_index[] elements into a structure:

struct trace_map_data_index {
	int		idx;	/* index into data_pages[] */
	int		cnt;	/* counter updated by writer */
};

The cnt is initialized to zero when initially mapped.

Instead of having the bpage->id = index into data_pages[], have it equal
the index into data_index[].

The cpu_buffer->reader_page->id = -1;

meta_data->reader_page = index into data_pages[] of reader page

The swapping of the header page would look something like this:

static inline void
rb_meta_page_head_swap(struct ring_buffer_per_cpu *cpu_buffer)
{
	struct ring_buffer_meta_page *meta = cpu_buffer->meta_page;
	int head_page;

	if (!READ_ONCE(cpu_buffer->mapped))
		return;

	head_page = meta->data_pages[meta->hdr.data_page_head];
	meta->data_pages[meta->hdr.data_page_head] = meta->hdr.reader_page;
	meta->hdr.reader_page = head_page;
	meta->data_pages[head_page]->id = -1;
}

As hdr.data_page_head would be an index into data_index[] and not
data_pages[].

The fact that bpage->id points to the data_index[] and not the data_pages[]
means that the writer can easily get to that index, and modify the count.
That way, in rb_tail_page_update() (between cmpxchgs) we can do something
like:

	if (cpu_buffer->mapped) {
		meta = cpu_buffer->meta_page;
		meta->data_index[next_page->id].cnt++;
	}

And this will allow the reader to know if the current page it is on just
got overwritten by the writer, by doing:

	prev_id = meta->data_index[this_page].cnt;
	smp_rmb();
	read event (copy it, whatever)
	smp_rmb();
	if (prev_id != meta->data_index[this_page].cnt)
		/* read data may be corrupted, abort it */


Does this make sense?

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ