Message-ID: <8d15f3a7-b7bc-1a45-0bdf-a0ccc311f576@iogearbox.net>
Date: Wed, 20 Dec 2023 15:24:37 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Martin KaFai Lau <martin.lau@...ux.dev>, bpf@...r.kernel.org
Cc: Alexei Starovoitov <ast@...nel.org>,
 Andrii Nakryiko <andrii@...nel.org>, netdev@...r.kernel.org,
 kernel-team@...a.com, Aditi Ghag <aditi.ghag@...valent.com>
Subject: Re: [PATCH bpf 1/2] bpf: Avoid iter->offset making backward progress
 in bpf_iter_udp

On 12/19/23 8:32 PM, Martin KaFai Lau wrote:
> From: Martin KaFai Lau <martin.lau@...nel.org>
> 
> The bpf_iter_udp iterates over all udp_sk sockets by walking the
> udp_table. While iterating the udp_table, it stores all udp_sk of a
> bucket; the term used in the kernel code is to "batch" the whole
> bucket. The reason for batching is to allow lock_sock() on each
> socket before calling the bpf prog, such that the bpf prog can
> safely call a helper/kfunc that changes the sk's state,
> e.g. bpf_setsockopt.
> 
> There is a bug in the bpf_iter_udp_batch() function that stops
> userspace from making forward progress.
> 
> The bug is triggered when userspace passes in a very small read
> buffer. When the bpf prog does bpf_seq_printf, the userspace read
> buffer is not big enough to capture the whole "batch".
> 
> When the read buffer cannot hold the whole "batch", the kernel
> remembers the offset into the batch in iter->offset so that the
> next userspace read() can continue from where it left off.
> 
> The kernel will skip that number (== "iter->offset") of sockets in
> the next read(). However, the code directly decrements
> "--iter->offset". This is incorrect because the next read() may
> not consume the whole "batch" either, and the read() after that
> will then start from offset 0 again.
> 
> Doing "--iter->offset" is essentially making backward progress.
> The net effect is that userspace keeps re-reading from the
> beginning of a bucket and the process never finishes. "iter->offset"
> must always move forward until the whole "batch" (or bucket) has
> been consumed by userspace.
> 
> This patch fixes it by doing the decrement in a local stack
> variable.
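
(Side note: a minimal iter/udp program along the lines assumed above,
one seq_printf line per socket, is enough to overflow a small read
buffer. A sketch, loosely following the selftests; vmlinux.h and the
BPF_SEQ_PRINTF macro from libbpf's bpf_helpers.h are assumed, and the
prog name is illustrative:)

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char _license[] SEC("license") = "GPL";

SEC("iter/udp")
int dump_udp(struct bpf_iter__udp *ctx)
{
	struct udp_sock *udp_sk = ctx->udp_sk;

	/* A NULL udp_sk signals the end of the iteration. */
	if (!udp_sk)
		return 0;

	/* The kernel holds lock_sock() on this socket while the prog
	 * runs (this is what batching the bucket enables), so
	 * state-changing calls like bpf_setsockopt() are safe here.
	 */
	BPF_SEQ_PRINTF(ctx->meta->seq, "bucket=%d uid=%u\n",
		       ctx->bucket, ctx->uid);
	return 0;
}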

nit: It probably makes sense to also state here that bpf_iter_tcp does
not have this issue, so it is clear from the commit message that you
also audited the TCP one.
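
Also, to make the stall easy to reproduce, something like the below
could be mentioned: pin the iter link in bpffs and read it back with a
deliberately tiny buffer. (A sketch; the pin path and buffer size are
made up.) With "--iter->offset", this loop keeps re-reading the same
bucket and never terminates; with the fix it ends once every bucket
has been dumped:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Hypothetical bpffs pin path of the iter/udp link. */
	int fd = open("/sys/fs/bpf/udp_iter", O_RDONLY);
	/* Deliberately smaller than one bucket's printed "batch". */
	char buf[16];
	ssize_t n;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, n, stdout);

	close(fd);
	return 0;
}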

> Cc: Aditi Ghag <aditi.ghag@...valent.com>
> Fixes: c96dac8d369f ("bpf: udp: Implement batching for sockets iterator")
> Signed-off-by: Martin KaFai Lau <martin.lau@...nel.org>
> ---
>   net/ipv4/udp.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 89e5a806b82e..6cf4151c2eb4 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -3141,6 +3141,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>   	unsigned int batch_sks = 0;
>   	bool resized = false;
>   	struct sock *sk;
> +	int offset;
>   
>   	/* The current batch is done, so advance the bucket. */
>   	if (iter->st_bucket_done) {
> @@ -3162,6 +3163,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>   	iter->end_sk = 0;
>   	iter->st_bucket_done = false;
>   	batch_sks = 0;
> +	offset = iter->offset;
>   
>   	for (; state->bucket <= udptable->mask; state->bucket++) {
>   		struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
> @@ -3177,8 +3179,8 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>   				/* Resume from the last iterated socket at the
>   				 * offset in the bucket before iterator was stopped.
>   				 */
> -				if (iter->offset) {
> -					--iter->offset;
> +				if (offset) {
> +					--offset;
>   					continue;
>   				}
>   				if (iter->end_sk < iter->max_sk) {
> 

Do we also need something like:

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6cf4151c2eb4..ef4fc436253d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -3169,7 +3169,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
                 struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];

                 if (hlist_empty(&hslot2->head)) {
-                       iter->offset = 0;
+                       iter->offset = offset = 0;
                         continue;
                 }

@@ -3196,7 +3196,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
                         break;

                 /* Reset the current bucket's offset before moving to the next bucket. */
-               iter->offset = 0;
+               iter->offset = offset = 0;
         }

         /* All done: no batch made. */

For the case where, upon retry, the current batch is done earlier than expected?
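
E.g. (hypothetical numbers): if iter->offset is 2 but the bucket has
shrunk to a single socket by the time of the retry, the skip loop eats
that one socket, the for loop then moves on with the local offset still
at 1, and the first socket of the next non-empty bucket would be
wrongly skipped.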

Thanks,
Daniel
