Date: Wed, 2 Feb 2022 10:36:16 +0800
From: Hou Tao <hotforest@...il.com>
To: andrii.nakryiko@...il.com
Cc: andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
daniel@...earbox.net, davem@...emloft.net, hotforest@...il.com,
houtao1@...wei.com, kafai@...com, kuba@...nel.org,
netdev@...r.kernel.org, yhs@...com
Subject: Re: [PATCH bpf-next] selftests/bpf: use getpagesize() to initialize ring buffer size
Hi,
> >
> > Hi Andrii,
> >
> > > >
> > > > 4096 is OK for x86-64, but for other archs with greater than 4KB
> > > > page size (e.g. 64KB under arm64), test_verifier for test case
> > > > "check valid spill/fill, ptr to mem" will fail, so just use
> > > > getpagesize() to initialize the ring buffer size. Do this for
> > > > test_progs as well.
> > > >
> > [...]
> >
> > > > diff --git a/tools/testing/selftests/bpf/progs/ima.c b/tools/testing/selftests/bpf/progs/ima.c
> > > > index 96060ff4ffc6..e192a9f16aea 100644
> > > > --- a/tools/testing/selftests/bpf/progs/ima.c
> > > > +++ b/tools/testing/selftests/bpf/progs/ima.c
> > > > @@ -13,7 +13,6 @@ u32 monitored_pid = 0;
> > > >
> > > > struct {
> > > > __uint(type, BPF_MAP_TYPE_RINGBUF);
> > > > - __uint(max_entries, 1 << 12);
> > >
> > > Should we just bump it to 64/128/256KB instead? It's quite annoying to
> > > do a split open and then load just due to this...
> > >
> > Agreed.
> >
> > > I'm also wondering if we should either teach kernel to round up to
> > > closest power-of-2 of page_size internally, or teach libbpf to do this
> > > for RINGBUF maps. Thoughts?
> > >
> > It seems that max_entries doesn't need to be page-aligned. For example
> > if max_entries is 4096 and page size is 65536, we can allocate a
> > 65536-sized page and set rb->mask to 4095 and it will work. The only
> > downside is that 60KB of memory is wasted, but that is an implementation
> > detail and can be improved if subpage mapping can be supported.
> >
> > So how about removing the page-alignment constraint in the kernel?
> >
>
> No, if you read BPF ringbuf code carefully you'll see that we map the
> entire ringbuf data twice in the memory (see [0] for lame ASCII
> diagram), so that records that are wrapped at the end of the ringbuf
> and go back to the start are still accessible as a linear array. It's
> a very important guarantee, so it has to be page size multiple. But
> auto-increasing it to the closest power-of-2 of page size seems like a
> pretty low-impact change. Hard to imagine breaking anything except
> some carefully crafted tests for ENOSPC behavior.
>
Yes, I know the double map trick. What I tried to say is:
(1) remove the page-alignment constraint on max_entries
(2) still allocate page-aligned memory for the ringbuf
instead of rounding max_entries up to the closest power-of-2 of page size
directly, so that max_entries from userspace is unchanged and the double map
trick still works.
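
To make the size arithmetic of the two options concrete, here is a minimal
sketch. The struct and function names are hypothetical and are not taken from
kernel/bpf/ringbuf.c. It only computes sizes and masks; it does not show
whether wrapped records stay linear under either option.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout description, for illustration only. */
struct rb_layout {
	size_t max_entries;	/* ring size seen by producer and consumer */
	size_t mask;		/* max_entries - 1, used to wrap positions */
	size_t alloc_bytes;	/* page-aligned memory backing the data area */
};

static int is_pow2(size_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Option A: round max_entries up to a power-of-2 multiple of page size. */
static struct rb_layout layout_round_up(size_t max_entries, size_t page_size)
{
	size_t n = page_size;

	assert(is_pow2(page_size));
	while (n < max_entries)
		n <<= 1;
	return (struct rb_layout){ n, n - 1, n };
}

/* Option B: keep max_entries as given, page-align only the allocation. */
static struct rb_layout layout_keep_entries(size_t max_entries, size_t page_size)
{
	size_t alloc;

	assert(is_pow2(max_entries) && is_pow2(page_size));
	alloc = (max_entries + page_size - 1) & ~(page_size - 1);
	return (struct rb_layout){ max_entries, max_entries - 1, alloc };
}
```

With max_entries = 4096 and a 65536-byte page, option A gives max_entries
65536 and mask 65535, while option B keeps max_entries 4096 and mask 4095,
allocates 65536 bytes and leaves 60KB unused.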
> [0] https://github.com/torvalds/linux/blob/master/kernel/bpf/ringbuf.c#L73-L89
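
For readers who have not seen the double map trick that [0] diagrams, below is
a small userspace sketch of the idea. It is an illustration under my own
assumptions, not the kernel code: double_map() is a hypothetical helper, and
it uses tmpfile() plus mmap() with MAP_FIXED where the kernel sets up its own
page mapping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the same `size` bytes twice, back to back,
 * so that base[i] and base[i + size] alias the same memory.
 * `size` must be a multiple of the page size. Returns NULL on failure.
 */
static char *double_map(size_t size)
{
	FILE *f = tmpfile();	/* unnamed temporary file as backing store */
	char *base, *second;
	int fd;

	if (!f)
		return NULL;
	fd = fileno(f);
	if (ftruncate(fd, (off_t)size) != 0) {
		fclose(f);
		return NULL;
	}

	/* Reserve 2 * size of address space; the first half maps the file. */
	base = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED) {
		fclose(f);
		return NULL;
	}

	/* Map the same file range again, directly after the first copy. */
	second = mmap(base + size, size, PROT_READ | PROT_WRITE,
		      MAP_SHARED | MAP_FIXED, fd, 0);
	fclose(f);		/* the mappings stay valid after close */
	if (second == MAP_FAILED) {
		munmap(base, 2 * size);
		return NULL;
	}
	return base;
}
```

Writing a record that starts 3 bytes before the end of the first copy, e.g.
memcpy(rb + size - 3, "ringbuf", 8), runs into the second copy, so its tail
also shows up at rb[0], and the record can still be read as one linear array
from rb + size - 3.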
> > Regards,
> > Tao