lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLCrjP6wd30mt2Ptjr4=bPT3jzZSxsK+DmMysakvAvv1A@mail.gmail.com>
Date: Tue, 9 Apr 2024 12:11:12 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Neal Cardwell <ncardwell@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 2/2] tcp: replace TCP_SKB_CB(skb)->tcp_tw_isn
 with a per-cpu field

On Tue, Apr 9, 2024 at 11:58 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Sun, 2024-04-07 at 09:33 +0000, Eric Dumazet wrote:
> > TCP can transform a TIMEWAIT socket into a SYN_RECV one from
> > a SYN packet, and the ISN of the SYNACK packet is normally
> > generated using TIMEWAIT tw_snd_nxt :
> >
> > tcp_timewait_state_process()
> > ...
> >     u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
> >     if (isn == 0)
> >         isn++;
> >     TCP_SKB_CB(skb)->tcp_tw_isn = isn;
> >     return TCP_TW_SYN;
> >
> > This SYN packet also bypasses normal checks against listen queue
> > being full or not.
> >
> > tcp_conn_request()
> > ...
> >        __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
> > ...
> >         /* TW buckets are converted to open requests without
> >          * limitations, they conserve resources and peer is
> >          * evidently real one.
> >          */
> >         if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) {
> >                 want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
> >                 if (!want_cookie)
> >                         goto drop;
> >         }
> >
> > This was using TCP_SKB_CB(skb)->tcp_tw_isn field in skb.
> >
> > Unfortunately this field has been accidentally cleared
> > after the call to tcp_timewait_state_process() returning
> > TCP_TW_SYN.
> >
> > Using a field in TCP_SKB_CB(skb) for a temporary state
> > is overkill.
> >
> > Switch instead to a per-cpu variable.
>
> I guess that pushing the info via a local variable all the way down to
> tcp_conn_request would be cumbersome, and will prevent the fast path
> optimization, right?

Right, I tried two other variants of the fix before coming to this.

One of the issues I had is that packets eventually going through the
socket backlog
can trigger this path, so I would have to carry a local variable in a
lot of paths.

References :

commit 449809a66c1d0b1563dee84493e14bf3104d2d7e    tcp/dccp: block BH
for SYN processing
commit 1ad98e9d1bdf4724c0a8532fabd84bf3c457c2bc    tcp/dccp: fix
lockdep issue when SYN is backlogged

I chose to not care about this tricky case (vs tcp_tw_syn), by making sure
to always leave per-cpu variable cleared after each tcp_conn_request()

This was also a nice way of not having to clear a field/variable for
each incoming
TCP packet :)

>
> > As a bonus, we do not have to clear tcp_tw_isn in TCP receive
> > fast path.
> > It is temporarily set then cleared only in the TCP_TW_SYN dance.
> >
> > Fixes: 4ad19de8774e ("net: tcp6: fix double call of tcp_v6_fill_cb()")
> > Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()")
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>
> [...]
>
> > @@ -2397,6 +2397,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >                       sk = sk2;
> >                       tcp_v4_restore_cb(skb);
> >                       refcounted = false;
> > +                     __this_cpu_write(tcp_tw_isn, isn);
> >                       goto process;
>
> Unrelated from this patch, but I think the 'process' label could be
> moved down skipping a couple of conditionals. 'sk' is a listener
> socket, checking for TW or SYN_RECV should not be needed, right?

Absolutely !

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ