[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLCrjP6wd30mt2Ptjr4=bPT3jzZSxsK+DmMysakvAvv1A@mail.gmail.com>
Date: Tue, 9 Apr 2024 12:11:12 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Neal Cardwell <ncardwell@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 2/2] tcp: replace TCP_SKB_CB(skb)->tcp_tw_isn
with a per-cpu field
On Tue, Apr 9, 2024 at 11:58 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Sun, 2024-04-07 at 09:33 +0000, Eric Dumazet wrote:
> > TCP can transform a TIMEWAIT socket into a SYN_RECV one from
> > a SYN packet, and the ISN of the SYNACK packet is normally
> > generated using TIMEWAIT tw_snd_nxt :
> >
> > tcp_timewait_state_process()
> > ...
> > u32 isn = tcptw->tw_snd_nxt + 65535 + 2;
> > if (isn == 0)
> > isn++;
> > TCP_SKB_CB(skb)->tcp_tw_isn = isn;
> > return TCP_TW_SYN;
> >
> > This SYN packet also bypasses normal checks against listen queue
> > being full or not.
> >
> > tcp_conn_request()
> > ...
> > __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
> > ...
> > /* TW buckets are converted to open requests without
> > * limitations, they conserve resources and peer is
> > * evidently real one.
> > */
> > if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) {
> > want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
> > if (!want_cookie)
> > goto drop;
> > }
> >
> > This was using TCP_SKB_CB(skb)->tcp_tw_isn field in skb.
> >
> > Unfortunately this field has been accidentally cleared
> > after the call to tcp_timewait_state_process() returning
> > TCP_TW_SYN.
> >
> > Using a field in TCP_SKB_CB(skb) for a temporary state
> > is overkill.
> >
> > Switch instead to a per-cpu variable.
>
> I guess that pushing the info via a local variable all the way down to
> tcp_conn_request would be cumbersome, and will prevent the fast path
> optimization, right?
Right, I tried two other variants of the fix before coming to this.
One of the issues I had is that packets eventually going through the
socket backlog
can trigger this path, so I would have to carry a local variable in a
lot of paths.
References :
commit 449809a66c1d0b1563dee84493e14bf3104d2d7e tcp/dccp: block BH
for SYN processing
commit 1ad98e9d1bdf4724c0a8532fabd84bf3c457c2bc tcp/dccp: fix
lockdep issue when SYN is backlogged
I chose to not care about this tricky case (vs tcp_tw_syn), by making sure
to always leave per-cpu variable cleared after each tcp_conn_request()
This was also a nice way of not having to clear a field/variable for
each incoming
TCP packet :)
>
> > As a bonus, we do not have to clear tcp_tw_isn in TCP receive
> > fast path.
> > It is temporarily set then cleared only in the TCP_TW_SYN dance.
> >
> > Fixes: 4ad19de8774e ("net: tcp6: fix double call of tcp_v6_fill_cb()")
> > Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()")
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>
> [...]
>
> > @@ -2397,6 +2397,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > sk = sk2;
> > tcp_v4_restore_cb(skb);
> > refcounted = false;
> > + __this_cpu_write(tcp_tw_isn, isn);
> > goto process;
>
> Unrelated from this patch, but I think the 'process' label could be
> moved down skipping a couple of conditionals. 'sk' is a listener
> socket, checking for TW or SYN_RECV should not be needed, right?
Absolutely !
Powered by blists - more mailing lists