netdev - Re: [PATCH net-next] tcp: correct the skip logic in tcp_sacktag

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CADVnQym_hhAoM2nvYyz2vR1zJTcAP0FzOZ-st0SfXE=6g68eVA@mail.gmail.com>
Date: Thu, 17 Jul 2025 10:14:17 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Xin Guo <guoxin0309@...il.com>
Cc: Paolo Abeni <pabeni@...hat.com>, Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org, 
	edumazet@...gle.com, davem@...emloft.net, dsahern@...nel.org, kuba@...nel.org
Subject: Re: [PATCH net-next] tcp: correct the skip logic in tcp_sacktag_skip()

On Thu, Jul 17, 2025 at 9:36 AM Xin Guo <guoxin0309@...il.com> wrote:
>
> Hi Paolo,
> Thanks for your review, let me explain in the thread first.
> 1)let me start from tcp_sacktag_skip, the definition as below:
> static struct sk_buff *tcp_sacktag_skip(struct sk_buff *skb, struct sock *sk,
> u32 skip_to_seq)
> {
> if (skb && after(TCP_SKB_CB(skb)->seq, skip_to_seq))
> return skb;
>
> return tcp_sacktag_bsearch(sk, skip_to_seq);
> }
> the input skb is a hint to avoid search the RTX tree, and the condition is:
> skb->seq > skip_to_seq(so skip_to_seq cannot be included in skb),
> as below:
> 0----------------------------------------------|------------------>+ skb->seq
> 0--------------------|-------------------------------------------->+ skip_to_seq
>
> 2)let me check the code snippet in tcp_sacktag_write_queue()
> the code try to speed up the search by using tcp_highest_sack(),
> the code is from the rtx queue is a list, but now the rtx queue is a tree.
> the mean is that if the start_seq >=tcp_highest_sack_seq(), the we use
> skb=tcp_highest_sack() as a hint to speed up the lookup(avoid to
> lookup the tree).
> so we can see that the skb->seq <=start_seq.
> then if we use the skb and start_seq to call tcp_sacktag_skip(),
> the tcp_sacktag_skip will go for rtx tree lookup, so
> code snippet does not take effect.
>
> static int
> tcp_sacktag_write_queue(struct sock *sk, const struct sk_buff *ack_skb,
> u32 prior_snd_una, struct tcp_sacktag_state *state)
> {
> ...
> while (i < used_sacks) {
>
> if (!before(start_seq, tcp_highest_sack_seq(tp))) {
> skb = tcp_highest_sack(sk);
> if (!skb)
> break;
> }
> skb = tcp_sacktag_skip(skb, sk, start_seq);
>
> walk:
> skb = tcp_sacktag_walk(skb, sk, next_dup, state,
> start_seq, end_seq, dup_sack);
>
> advance_sp:
> i++;
> }
>
> 3) on the other side, let me show the logic in tcp_sacktag_bsearch, the logic is
>    the skb->seq should be met:
>    seq=<skb->seq and seq<skb->end_seq
> so the seq should be included in skb, the log is not consist with
> tcp_sacktag_skip().
>
> static struct sk_buff *tcp_sacktag_bsearch(struct sock *sk, u32 seq)
> {
> struct rb_node *parent, **p = &sk->tcp_rtx_queue.rb_node;
> struct sk_buff *skb;
>
> while (*p) {
> parent = *p;
> skb = rb_to_skb(parent);
> if (before(seq, TCP_SKB_CB(skb)->seq)) {
> p = &parent->rb_left;
> continue;
> }
> //[Xin Guo] at here seq=<skb->seq
> if (!before(seq, TCP_SKB_CB(skb)->end_seq)) {
> p = &parent->rb_right;
> continue;
> }
> //[Xin Guo]at here seq<skb->end_seq
> return skb;
> }
> return NULL;
> }
>
> i hope that it is more clear now, thanks.
>
> Regards
> Guo Xin.
>
> On Thu, Jul 17, 2025 at 4:24 PM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > On 7/13/25 5:22 PM, Xin Guo wrote:
> > > tcp_sacktag_skip() directly return the input skb only
> > > if TCP_SKB_CB(skb)->seq>skip_to_seq,
> > > this is not right, and  the logic should be
> > > TCP_SKB_CB(skb)->seq>=skip_to_seq,
> >
> > Adding Kuniyuki
> >
> > I'm not sure this statement is actually true. A more clear (and slightly
> > more descriptive) commit message could help better understanding the
> > issue. What is the bad behaviour you are observing?
> >
> > Ideally a packetdrill test case to demonstrate it would help
> sorry that a packetdrill script cannot show the wrong behavior.

I agree with Paolo that having a packetdrill test case for this kind
of issue would be best.

Can you please explain why you are saying that a packetdrill script
cannot show the incorrect behavior for this issue?

Usually packetdrill is a good fit for these kinds of protocol
processing issues that do not involve performance or race conditions.

Here are examples of test cases that stress SACK processing behavior,
if that helps:
  https://github.com/google/packetdrill/tree/master/gtests/net/tcp/sack

Thanks,
neal