Message-ID: <7ed28273-a716-4638-912d-f86f965e54bb@linux.dev>
Date: Fri, 11 Apr 2025 15:10:51 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Jordan Rife <jordan@...fe.io>
Cc: Aditi Ghag <aditi.ghag@...valent.com>,
Daniel Borkmann <daniel@...earbox.net>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH v2 bpf-next 2/5] bpf: udp: Propagate ENOMEM up from
bpf_iter_udp_batch
On 4/11/25 10:35 AM, Jordan Rife wrote:
> Stop iteration if the current bucket can't be contained in a single
> batch to avoid choosing between skipping or repeating sockets. In cases
> where none of the saved cookies can be found in the current bucket and
Since the cookie change is in the next patch, it would be useful to mention that
this is prep work for the next patch.
> the batch isn't big enough to contain all the sockets in the bucket,
> there are really only two choices, neither of which is desirable:
>
> 1. Start from the beginning, assuming we haven't seen any sockets in the
> current bucket, in which case we might repeat a socket we've already
> seen.
> 2. Go to the next bucket to avoid repeating a socket we may have already
> seen, in which case we may accidentally skip a socket that we didn't
> yet visit.
>
> To avoid this tradeoff, enforce the invariant that the batch always
> contains a full snapshot of the bucket from last time by returning
> -ENOMEM if bpf_iter_udp_realloc_batch() can't grab enough memory to fit
> all sockets in the current bucket.
>
> To test this code path, I forced bpf_iter_udp_realloc_batch() to return
> -ENOMEM when called from within bpf_iter_udp_batch() and observed that
> read() fails in userspace with errno set to ENOMEM. Otherwise, it's a
> bit hard to test this scenario.
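The forced-failure test described above can be reproduced from userspace with a
small read loop over the iterator fd. This is a hedged sketch, not code from the
patch: the helper name drain_iter and the error-return convention are
illustrative assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Drain a bpf iterator fd (e.g. one obtained from a pinned iterator).
 * Returns 0 on clean EOF, -errno on failure. With the forced
 * bpf_iter_udp_realloc_batch() failure described above, this is where
 * -ENOMEM would surface to userspace. */
static int drain_iter(int fd)
{
	char buf[4096];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;
	return n < 0 ? -errno : 0;
}
```

A caller would check for drain_iter() returning -ENOMEM to confirm the error is
propagated rather than silently truncating the bucket.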
>
> Link: https://lore.kernel.org/bpf/CABi4-ogUtMrH8-NVB6W8Xg_F_KDLq=yy-yu-tKr2udXE2Mu1Lg@mail.gmail.com/
>
> Signed-off-by: Jordan Rife <jordan@...fe.io>
> Reviewed-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> ---
> net/ipv4/udp.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 59c3281962b9..1e8ae08d24db 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -3410,6 +3410,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
> unsigned int batch_sks = 0;
> bool resized = false;
> struct sock *sk;
> + int err = 0;
>
> resume_bucket = state->bucket;
> resume_offset = iter->offset;
> @@ -3475,7 +3476,11 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
> iter->st_bucket_done = true;
> goto done;
> }
> - if (!resized && !bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2)) {
> + if (!resized) {
The resized == true case will have a similar issue, meaning the next
bpf_iter_udp_batch() will end up skipping the remaining sk in that bucket:
the partial-bucket batch has been consumed, so cur_sk == end_sk but
st_bucket_done == false, and bpf_iter_udp_resume() returns NULL. It is sort of
a regression from the current "offset" implementation for this case. Any
thoughts on how to make it better?
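The skip described above can be illustrated with a toy userspace model of the
iterator state. This is a simplified sketch, not kernel code: the struct and
helper names (toy_iter, toy_fill_batch, toy_skipped) are invented for
illustration, though the fields mirror the cur_sk/end_sk/st_bucket_done state
discussed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the iterator's batch state. */
struct toy_iter {
	int cur_sk;          /* next entry to emit from the batch */
	int end_sk;          /* number of entries in the batch */
	bool st_bucket_done; /* does the batch hold the whole bucket? */
};

/* Fill the batch from a bucket of bucket_sks sockets, capped at the
 * batch capacity (i.e. realloc has already failed, so the capacity
 * cannot grow). Returns how many sockets were batched. */
static int toy_fill_batch(struct toy_iter *it, int bucket_sks, int capacity)
{
	it->end_sk = bucket_sks < capacity ? bucket_sks : capacity;
	it->cur_sk = 0;
	it->st_bucket_done = (it->end_sk == bucket_sks);
	return it->end_sk;
}

/* Once the batch is consumed (cur_sk == end_sk), how many sockets of
 * the bucket are silently skipped if iteration moves to the next
 * bucket instead of resuming within this one? */
static int toy_skipped(const struct toy_iter *it, int bucket_sks)
{
	return it->st_bucket_done ? 0 : bucket_sks - it->end_sk;
}
```

For example, a bucket of 8 sockets against a stuck capacity of 6 leaves 2
sockets unvisited; returning -ENOMEM instead of continuing makes that loss
visible to the reader rather than silent.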
> + err = bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2);
> + if (err)
> + return ERR_PTR(err);
> +
> resized = true;
> /* After allocating a larger batch, retry one more time to grab
> * the whole bucket.