Message-ID: <CANn89iLEX4Np+i-QWh_amkDR7jcO2XWyLVY_edXnA4y+Y7JX4w@mail.gmail.com>
Date: Thu, 15 Jun 2023 17:25:42 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Duan Muquan <duanmuquan@...du.com>
Cc: davem@...emloft.net, dsahern@...nel.org, kuba@...nel.org,
    pabeni@...hat.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] tcp: fix connection reset due to tw hashdance race.

On Thu, Jun 15, 2023 at 2:13 PM Duan Muquan <duanmuquan@...du.com> wrote:
>
> If the FIN from the passive closer and the ACK for the active closer's
> FIN are processed on different CPUs concurrently, a tw hashdance race
> may occur. On the loopback interface, the transmit function queues an
> skb to the current CPU's softnet input queue by default. Suppose the
> active closer runs on CPU 0 and the passive closer runs on CPU 1. If
> the ACK for the active closer's FIN is sent with no delay, it will be
> processed, and the tw hashdance will be done, on CPU 0. The passive
> closer's FIN will be sent in another segment and processed on CPU 1,
> where it may fail to find the tw sock in the ehash table because of the
> tw hashdance on CPU 0, and then get a RESET.
> If the application reconnects immediately with the same source port, it
> will get reset because the tw sock's tw_substate is still TCP_FIN_WAIT2.
>
> The dmesg to trace down this issue:
>
> .333516] tcp_send_fin: sk 0000000092105ad2 cookie 9 cpu 3
> .333524] rcv_state_process:FIN_WAIT2 sk 0000000092105ad2 cookie 9 cpu 3
> .333534] tcp_close: tcp_time_wait: sk 0000000092105ad2 cookie 9 cpu 3
> .333538] hashdance: tw 00000000690fdb7a added to ehash cookie 9 cpu 3
> .333541] hashdance: sk 0000000092105ad2 removed cookie 9 cpu 3
> .333544] __inet_lookup_established: Failed the refcount check: !refcount_inc_not_zero 00000000690fdb7a ref 0 cookie 9 cpu 0
> .333549] hashdance: tw 00000000690fdb7a before add ref 0 cookie 9 cpu 3
> .333552] rcv_state: RST for FIN listen 000000003c50afa6 cookie 0 cpu 0
> .333574] tcp_send_fin: sk 0000000066757bf8 ref 2 cookie 0 cpu 0
> .333611] timewait_state: TCP_TW_RST tw 00000000690fdb7a cookie 9 cpu 0
> .333626] tcp_connect: sk 0000000066757bf8 cpu 0 cookie 0
>
> Here is the call trace map:
>
NACK