lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bfey2fu3e74d52wjnoimu5ra7wqox2idnc2syzlrvsyjzezdli@lhywkrucesbf>
Date: Sat, 24 May 2025 14:09:41 -0700
From: Jordan Rife <jordan@...fe.io>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Kuniyuki Iwashima <kuniyu@...zon.com>, alexei.starovoitov@...il.com, 
	bpf@...r.kernel.org, daniel@...earbox.net, netdev@...r.kernel.org, 
	willemdebruijn.kernel@...il.com
Subject: Re: [PATCH v1 bpf-next 03/10] bpf: tcp: Get rid of st_bucket_done

On Fri, May 23, 2025 at 03:07:32PM -0700, Martin KaFai Lau wrote:
> On 5/22/25 1:42 PM, Kuniyuki Iwashima wrote:
> > From: Jordan Rife <jordan@...fe.io>
> > Date: Thu, 22 May 2025 11:16:13 -0700
> > > > > >   static void bpf_iter_tcp_put_batch(struct bpf_tcp_iter_state *iter)
> > > > > >   {
> > > > > > -	while (iter->cur_sk < iter->end_sk)
> > > > > > -		sock_gen_put(iter->batch[iter->cur_sk++]);
> > > > > > +	unsigned int cur_sk = iter->cur_sk;
> > > > > > +
> > > > > > +	while (cur_sk < iter->end_sk)
> > > > > > +		sock_gen_put(iter->batch[cur_sk++]);
> > > > > 
> > > > > Why is this chunk included in this patch ?
> > > > 
> > > > This should be in patch 5 to keep cur_sk for find_cookie
> > > 
> > > Without this, iter->cur_sk is mutated when iteration stops, and we lose
> > > our place. When iteration resumes and we call bpf_iter_tcp_batch the
> > > iter->cur_sk == iter->end_sk condition will always be true, so we will
> > > skip to the next bucket without seeking to the offset.
> > > 
> > > Before, we relied on st_bucket_done to tell us if we had remaining items
> > > in the current bucket to process but now need to preserve iter->cur_sk
> > > through iterations to make the behavior equivalent to what we had before.
> > 
> > Thanks for explanation, I was confused by calling tcp_seek_last_pos()
> > multiple times, and I think we need to preserve/restore st->offset too
> > in patch 2 and need this change.
> > 
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index ac00015d5e7a..0816f20bfdff 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -2791,6 +2791,7 @@ static void *tcp_seek_last_pos(struct seq_file *seq)
> >   			break;
> >   		st->bucket = 0;
> >   		st->state = TCP_SEQ_STATE_ESTABLISHED;
> > +		offset = 0;
> 
> This seems like an existing bug not necessarily related to this set.

Agree that this is more of an existing bug.

> The patch 5 has also removed the tcp_seek_last_pos() dependency, so I think
> it can be a standalone fix on its own.

With the tcp_seq_* ops there are also other corner cases that can lead
to skips, since they rely on st->offset to seek to the last position.

In the scenario described above, sockets disappearing from the last lhash
bucket leads to skipped sockets in the first ehash bucket, but you could
also have a scenario where, for example, the current lhash bucket has 6
sockets, iter->offset is currently 3, 3 sockets disappear from the start
of the current lhash bucket then tcp_seek_last_pos skips the remaining 3
sockets and goes to the next bucket.

I'm not sure it's worth fixing just this one case without also
overhauling the tcp_seq_* logic to prevent these other cases. Otherwise,
it seems more like a Band-aid fix. Perhaps a later series could explore
a more comprehensive solution there.

Jordan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ