[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <35f497afd90fe16ba1408f25ea1ff62af6a73a90.camel@redhat.com>
Date: Tue, 25 Jun 2024 11:49:51 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: luoxuanqiang <luoxuanqiang@...inos.cn>, kuniyu@...zon.com,
edumazet@...gle.com, kuba@...nel.org, davem@...emloft.net
Cc: dccp@...r.kernel.org, dsahern@...nel.org, fw@...len.de,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
alexandre.ferrieux@...nge.com
Subject: Re: [PATCH net v4] Fix race for duplicate reqsk on identical SYN
On Fri, 2024-06-21 at 09:39 +0800, luoxuanqiang wrote:
> When bonding is configured in BOND_MODE_BROADCAST mode, if two identical
> SYN packets are received at the same time and processed on different CPUs,
> it can potentially create the same sk (sock) but two different reqsk
> (request_sock) in tcp_conn_request().
>
> These two different reqsk will respond with two SYNACK packets, and since
> the generation of the seq (ISN) incorporates a timestamp, the final two
> SYNACK packets will have different seq values.
>
> The consequence is that when the Client receives and replies with an ACK
> to the earlier SYNACK packet, we will reset(RST) it.
>
> ========================================================================
>
> This behavior is consistently reproducible in my local setup,
> which comprises:
>
> | NETA1 ------ NETB1 |
> PC_A --- bond --- | | --- bond --- PC_B
> | NETA2 ------ NETB2 |
>
> - PC_A is the Server and has two network cards, NETA1 and NETA2. I have
> bonded these two cards using BOND_MODE_BROADCAST mode and configured
> them to be handled by different CPU.
>
> - PC_B is the Client, also equipped with two network cards, NETB1 and
> NETB2, which are also bonded and configured in BOND_MODE_BROADCAST mode.
>
> If the client attempts a TCP connection to the server, it might encounter
> a failure. Capturing packets from the server side reveals:
>
> 10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
> 10.10.10.10.45182 > localhost: Flags [S], seq 320236027,
> localhost > 10.10.10.10.45182: Flags [S.], seq 2967855116,
> localhost > 10.10.10.10.45182: Flags [S.], seq 2967855123, <==
> 10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
> 10.10.10.10.45182 > localhost: Flags [.], ack 4294967290,
> localhost > 10.10.10.10.45182: Flags [R], seq 2967855117, <==
> localhost > 10.10.10.10.45182: Flags [R], seq 2967855117,
>
> Two SYNACKs with different seq numbers are sent by localhost,
> resulting in an anomaly.
>
> ========================================================================
>
> The attempted solution is as follows:
> Add a return value to inet_csk_reqsk_queue_hash_add() to confirm if the
> ehash insertion is successful (Up to now, the reason for unsuccessful
> insertion is that a reqsk for the same connection has already been
> inserted). If the insertion fails, release the reqsk.
>
> Due to the refcnt, Kuniyuki suggests also adding a return value check
> for the DCCP module; if ehash insertion fails, indicating a successful
> insertion of the same connection, simply release the reqsk as well.
>
> Simultaneously, In the reqsk_queue_hash_req(), the start of the
> req->rsk_timer is adjusted to be after successful insertion.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Just after applying the patch I wondered if the issue addressed here
should be observable only after commit e994b2f0fb92 ("tcp: do not lock
listener to process SYN packets")?
In practice it should not matter as the latter commit it's older than
the currently older LST, but I'm wondering if I read the things
correctly?
Thanks!
Paolo
Powered by blists - more mailing lists