Message-ID: <a3d437ce-c91d-47c6-9590-88b716fb6690@linux.dev>
Date: Thu, 21 Aug 2025 14:48:53 -0700
From: Yonghong Song <yonghong.song@...ux.dev>
To: Maciej Żenczykowski <maze@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Linux Network Development Mailing List <netdev@...r.kernel.org>,
"David S . Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
BPF Mailing List <bpf@...r.kernel.org>, Stanislav Fomichev <sdf@...ichev.me>
Subject: Re: [PATCH bpf-next] bpf: hashtab - allow
BPF_MAP_LOOKUP{,_AND_DELETE}_BATCH with NULL keys/values.
On 8/20/25 7:23 PM, Maciej Żenczykowski wrote:
> On Mon, Aug 18, 2025 at 1:58 PM Yonghong Song
> <yonghong.song@...ux.dev> wrote:
> > On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> > > BPF_MAP_LOOKUP_AND_DELETE_BATCH keys & values == NULL
> > > seems like a nice way to simply quickly clear a map.
> >
> > This will change existing API as users will expect
> > some error (e.g., -EFAULT) return when keys or values is NULL.
>
> No reasonable user will call the current api with NULLs.
I do agree it is really unlikely users will have NULL keys or values.
>
> This is a similar API change to adding a new system call
> (where previously it returned -ENOSYS) - which *is* also a UAPI
> change, but obviously allowed.
>
> Or adding support for a new address family / protocol (where
> previously it returned -EAFNOSUPPORT)
> Or adding support for a new flag (where previously it returned -EINVAL)
>
> Consider why userspace would ever pass in NULL, two possibilities:
> (a) explicit NULL - you'd never do this since it would (till now)
> always -EFAULT,
> so this would only possibly show up in a very thorough test suite
> (b) you're using dynamically allocated memory and it failed allocation.
> that's already a program bug, you should catch that before you call
> bpf().
Okay. What you describe makes sense.
Could you add a selftest for this?
Could you also add some comments to the uapi bpf.h header below to document
the new functionality?
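
For the selftest, something along these lines might be a starting point.
It is only a sketch, assuming the patched behavior described in this thread
(NULL keys/values are skipped rather than returning -EFAULT).
fill_batch_attr() and clear_map_batch() are made-up helper names, and the
chunk size is arbitrary. On an unpatched kernel a non-empty map would still
hit the -EFAULT path.

```c
#include <assert.h>
#include <errno.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: build a batch attr that passes NULL keys/values. */
void fill_batch_attr(union bpf_attr *attr, int map_fd,
		     const __u32 *in_batch, __u32 *out_batch, __u32 count)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->batch.in_batch = (__u64)(uintptr_t)in_batch;   /* NULL: start from beginning */
	attr->batch.out_batch = (__u64)(uintptr_t)out_batch; /* next start batch */
	attr->batch.keys = 0;    /* NULL: do not copy keys out */
	attr->batch.values = 0;  /* NULL: do not copy values out */
	attr->batch.count = count;
	attr->batch.map_fd = (__u32)map_fd;
}

/*
 * Hypothetical helper: delete every element of a hash map in chunks.
 * Returns 0 once the kernel reports -ENOENT (end of map reached),
 * or a negative errno on any other failure.
 */
int clear_map_batch(int map_fd, __u32 chunk)
{
	union bpf_attr attr;
	__u32 cursor = 0;
	int first = 1;

	for (;;) {
		/* The same variable serves as in_batch and out_batch. */
		fill_batch_attr(&attr, map_fd, first ? NULL : &cursor,
				&cursor, chunk);
		if (syscall(__NR_bpf, BPF_MAP_LOOKUP_AND_DELETE_BATCH,
			    &attr, sizeof(attr)) == 0) {
			first = 0;
			continue;
		}
		return errno == ENOENT ? 0 : -errno;
	}
}
```

A test could populate a BPF_MAP_TYPE_HASH map, call clear_map_batch(), and
then check that a plain lookup of any inserted key fails with -ENOENT.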
>
> > We have a 'flags' field in uapi header in
> >
> > struct { /* struct used by BPF_MAP_*_BATCH commands */
> >         __aligned_u64   in_batch;       /* start batch,
> >                                          * NULL to start from beginning
> >                                          */
> >         __aligned_u64   out_batch;      /* output: next start batch */
> >         __aligned_u64   keys;
> >         __aligned_u64   values;
> >         __u32           count;          /* input/output:
> >                                          * input: # of key/value
> >                                          * elements
> >                                          * output: # of filled elements
> >                                          */
> >         __u32           map_fd;
> >         __u64           elem_flags;
> >         __u64           flags;
> > } batch;
> >
> > we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
> > that if keys or values is NULL, the batched elements will be cleared.
>
> I just don't see what value this provides.
>
> > > BPF_MAP_LOOKUP keys/values == NULL might be useful if we just want
> > > the values/keys and don't want to bother copying the keys/values...
> > >
> > > BPF_MAP_LOOKUP keys & values == NULL might be useful to count
> > > the number of populated entries.
> >
> > bpf_map_lookup_elem() does not have a flags field, so we probably should
> > not change existing semantics.
>
> This is unrelated to this patch, since this only touches 'batch'
> operation.
> (unless I'm missing something)
>
> > > Cc: Alexei Starovoitov <ast@...nel.org>
> > > Cc: Daniel Borkmann <daniel@...earbox.net>
> > > Cc: Stanislav Fomichev <sdf@...ichev.me>
> > > Signed-off-by: Maciej Żenczykowski <maze@...gle.com>
> > > ---
> > > kernel/bpf/hashtab.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > > index 5001131598e5..8fbdd000d9e0 100644
> > > --- a/kernel/bpf/hashtab.c
> > > +++ b/kernel/bpf/hashtab.c
> > > @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> > >
> > >       rcu_read_unlock();
> > >       bpf_enable_instrumentation();
> > > -     if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> > > +     if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> > >           key_size * bucket_cnt) ||
> > > -         copy_to_user(uvalues + total * value_size, values,
> > > +         uvalues && copy_to_user(uvalues + total * value_size, values,
> > >           value_size * bucket_cnt))) {
> > >               ret = -EFAULT;
> > >               goto after_loop;
> >
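
To make the new condition easier to review, here is a small userspace
sketch of the same short-circuit logic. It is not the kernel code:
fake_copy() is a stand-in for copy_to_user() built on memcpy(), and
copy_bucket() and copies_done are names invented for illustration. Since &&
binds tighter than ||, the patched expression groups as
(ukeys && copy) || (uvalues && copy); the sketch only spells those
parentheses out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int copies_done; /* counts how many copies actually ran */

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static unsigned long fake_copy(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	copies_done++;
	return 0;
}

/*
 * Mirrors the patched condition: a NULL destination skips its copy
 * instead of faulting. Returns 0 on success, or -1 where the kernel
 * code would set ret = -EFAULT.
 */
int copy_bucket(char *ukeys, char *uvalues,
		const char *keys, const char *values,
		size_t total, size_t key_size, size_t value_size,
		size_t bucket_cnt)
{
	assert(keys != NULL && values != NULL);

	if (bucket_cnt &&
	    ((ukeys && fake_copy(ukeys + total * key_size, keys,
				 key_size * bucket_cnt)) ||
	     (uvalues && fake_copy(uvalues + total * value_size, values,
				   value_size * bucket_cnt))))
		return -1;
	return 0;
}
```

With both destinations NULL no copy runs at all, which is the "just clear
the map" case; with only one of them NULL, only the other buffer is filled.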
>
>
> --
> Maciej Żenczykowski, Kernel Networking Developer @ Google