lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAEf4Bzaq5drHWChXoRBnrmkb6reAsSVj8r=uByFSup31FMA7hw@mail.gmail.com>
Date: Fri, 22 Aug 2025 14:23:03 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Xu Kuohai <xukuohai@...weicloud.com>
Cc: bpf@...r.kernel.org, linux-kselftest@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Yonghong Song <yhs@...com>, 
	Song Liu <song@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, 
	Jiri Olsa <jolsa@...nel.org>, Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>, 
	Stanislav Fomichev <sdf@...ichev.me>, Willem de Bruijn <willemb@...gle.com>, 
	Jason Xing <kerneljasonxing@...il.com>, Paul Chaignon <paul.chaignon@...il.com>, 
	Tao Chen <chen.dylane@...ux.dev>, Kumar Kartikeya Dwivedi <memxor@...il.com>, 
	Martin Kelly <martin.kelly@...wdstrike.com>
Subject: Re: [PATCH bpf-next 2/4] libbpf: ringbuf: Add overwrite ring buffer process

On Sun, Aug 3, 2025 at 7:27 PM Xu Kuohai <xukuohai@...weicloud.com> wrote:
>
> From: Xu Kuohai <xukuohai@...wei.com>
>
> In overwrite mode, the producer does not wait for the consumer, so the
> consumer is responsible for handling conflicts. An optimistic method
> is used to resolve the conflicts: the consumer first reads consumer_pos,
> producer_pos and overwrite_pos, then calculates a read window and copies
> data in the window from the ring buffer. After copying, it checks the
> positions to decide if the data in the copy window have been overwritten
> by be the producer. If so, it discards the copy and tries again. Once
> success, the consumer processes the events in the copy.
>

I don't mind adding BPF_F_OVERWRITE mode to BPF ringbuf (it seems like
this will work fine) itself, but I don't think retrofitting it to this
callback-based libbpf-side API is a good fit.

For one, I don't like that extra memory copy and potentially a huge
allocation that you do. I think for some use cases user logic might be
totally fine with using ringbuf memory directly, even if it can be
overwritten at any point. So it would be unfair to penalize
sophisticated users for such cases. Even if not, I'd say allocating
just enough to hold the record would be a better approach.

Another downside is that the user doesn't really have much visibility
right now into whether any samples were overwritten.

I've been mulling over the idea of adding an iterator-like API for BPF
ringbuf on the libbpf side for a while now. I'm still debating some
API nuances with Eduard, but I think we'll end up adding something
pretty soon. Iterator-based API seems like a much better fit for
overwritable mode here.

But all that is not really overwrite-specific and is broader, so I
think we can proceed with finalizing kernel-side details of overwrite
and not block on libbpf side of things for now, though.

> Signed-off-by: Xu Kuohai <xukuohai@...wei.com>
> ---
>  tools/lib/bpf/ringbuf.c | 103 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 102 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index 9702b70da444..9c072af675ff 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -27,10 +27,13 @@ struct ring {
>         ring_buffer_sample_fn sample_cb;
>         void *ctx;
>         void *data;
> +       void *read_buffer;
>         unsigned long *consumer_pos;
>         unsigned long *producer_pos;
> +       unsigned long *overwrite_pos;
>         unsigned long mask;
>         int map_fd;
> +       bool overwrite_mode;
>  };

[...]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ