[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <B865DECA-2350-471F-B75B-A9D2F33672CA@isovalent.com>
Date: Fri, 12 Jan 2024 15:57:55 -0800
From: Aditi Ghag <aditi.ghag@...valent.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: bpf@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
netdev@...r.kernel.org,
kernel-team@...a.com,
Yonghong Song <yonghong.song@...ux.dev>
Subject: Re: [PATCH v3 bpf 2/3] bpf: Avoid iter->offset making backward
progress in bpf_iter_udp
> On Jan 12, 2024, at 11:05 AM, Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> From: Martin KaFai Lau <martin.lau@...nel.org>
>
> There is a bug in the bpf_iter_udp_batch() function that stops
> the userspace from making forward progress.
>
> The case that triggers the bug is the userspace passed in
> a very small read buffer. When the bpf prog does bpf_seq_printf,
> the userspace read buffer is not enough to capture the whole bucket.
>
> When the read buffer is not large enough, the kernel will remember
> the offset of the bucket in iter->offset such that the next userspace
> read() can continue from where it left off.
>
> The kernel will skip the number (== "iter->offset") of sockets in
> the next read(). However, the code directly decrements the
> "--iter->offset". This is incorrect because the next read() may
> not consume the whole bucket either and then the next-next read()
> will start from offset 0. The net effect is the userspace will
> keep reading from the beginning of a bucket and the process will
> never finish. "iter->offset" must always go forward until the
> whole bucket is consumed.
>
> This patch fixes it by using a local variable "resume_offset"
> and "resume_bucket". "iter->offset" is always reset to 0 before
> it may be used. "iter->offset" will be advanced to the
> "resume_offset" when it continues from the "resume_bucket" (i.e.
> "state->bucket == resume_bucket"). This brings it closer to
> the bpf_iter_tcp's offset handling which does not suffer
> the same bug.
>
> Cc: Aditi Ghag <aditi.ghag@...valent.com>
> Fixes: c96dac8d369f ("bpf: udp: Implement batching for sockets iterator")
> Acked-by: Yonghong Song <yonghong.song@...ux.dev>
> Signed-off-by: Martin KaFai Lau <martin.lau@...nel.org>
Reviewed-by: Aditi Ghag <aditi.ghag@...valent.com>
Thanks!
> ---
> net/ipv4/udp.c | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 978b83d3c094..04c534a9ef89 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -3137,16 +3137,18 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
> struct bpf_udp_iter_state *iter = seq->private;
> struct udp_iter_state *state = &iter->state;
> struct net *net = seq_file_net(seq);
> + int resume_bucket, resume_offset;
> struct udp_table *udptable;
> unsigned int batch_sks = 0;
> bool resized = false;
> struct sock *sk;
>
> + resume_bucket = state->bucket;
> + resume_offset = iter->offset;
> +
> /* The current batch is done, so advance the bucket. */
> - if (iter->st_bucket_done) {
> + if (iter->st_bucket_done)
> state->bucket++;
> - iter->offset = 0;
> - }
>
> udptable = udp_get_table_seq(seq, net);
>
> @@ -3166,19 +3168,19 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
> for (; state->bucket <= udptable->mask; state->bucket++) {
> struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
>
> - if (hlist_empty(&hslot2->head)) {
> - iter->offset = 0;
> + if (hlist_empty(&hslot2->head))
> continue;
> - }
>
> + iter->offset = 0;
> spin_lock_bh(&hslot2->lock);
> udp_portaddr_for_each_entry(sk, &hslot2->head) {
> if (seq_sk_match(seq, sk)) {
> /* Resume from the last iterated socket at the
> * offset in the bucket before iterator was stopped.
> */
> - if (iter->offset) {
> - --iter->offset;
> + if (state->bucket == resume_bucket &&
> + iter->offset < resume_offset) {
I like this invariant of ensuring that the batching and resume operations are performed for the same bucket under consideration.
> + ++iter->offset;
> continue;
> }
> if (iter->end_sk < iter->max_sk) {
> @@ -3192,9 +3194,6 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>
> if (iter->end_sk)
> break;
> -
> - /* Reset the current bucket's offset before moving to the next bucket. */
> - iter->offset = 0;
> }
>
> /* All done: no batch made. */
> --
> 2.34.1
>
Powered by blists - more mailing lists