Message-ID: <7b700f6e-3ad7-a358-8dd3-c5120a115344@kylinos.cn>
Date: Mon, 17 Jun 2024 10:53:02 +0800
From: luoxuanqiang <luoxuanqiang@...inos.cn>
To: alexandre.ferrieux@...nge.com, edumazet@...gle.com
Cc: davem@...emloft.net, dsahern@...nel.org, fw@...len.de, kuba@...nel.org,
netdev@...r.kernel.org, pabeni@...hat.com, kuniyu@...zon.com
Subject: Re: [PATCH net v2] Fix race for duplicate reqsk on identical SYN
On 2024/6/17 07:45, alexandre.ferrieux@...nge.com wrote:
> On 14/06/2024 12:26, luoxuanqiang wrote:
>> When bonding is configured in BOND_MODE_BROADCAST mode, if two identical
>> SYN packets are received at the same time and processed on different CPUs,
>> it can potentially create the same sk (sock) but two different reqsk
>> (request_sock) in tcp_conn_request().
>>
>> These two different reqsk will respond with two SYNACK packets, and since
>> the generation of the seq (ISN) incorporates a timestamp, the final two
>> SYNACK packets will have different seq values.
>>
>> The consequence is that when the Client receives and replies with an ACK
>> to the earlier SYNACK packet, we will reset (RST) it.
>>
>> ========================================================================
> This is close, but not identical, to a race we observed on a *single* CPU
> with the TPROXY iptables target, in the following situation:
>
> - two identical SYNs, sent one second apart from the same client socket,
> arrive back-to-back on the interface (due to network jitter)
>
> - they happen to be handled in the same batch of packets from one softirq
> name_your_nic_poll()
>
> - there, two loops run sequentially: one for netfilter (doing TPROXY),
> one for the network stack (doing TCP processing)
>
> - the first generates two distinct contexts for the two SYNs
>
> - the second respects these contexts and never gets a chance to merge them
>
> The result is exactly as you describe, but in this case there is no need
> for bonding, and everything happens in one single CPU, which is pretty
> ironic for a race.
>
> My uneducated feeling is that the two loops are the cause of a simulated
> parallelism, yielding the race. If each packet of the batch was handled
> "to completion" (full netfilter handling followed immediately by full
> network stack ingestion), the problem would not exist.
Based on your explanation, I believe a similar issue can occur on a single
CPU if two SYN packets are processed closely enough. However, apart from
using bonding mode 3 (BOND_MODE_BROADCAST) and having the two SYNs
processed on different CPUs, which makes the issue easy to trigger, I
haven't found a good way to reproduce it.
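
To check that the two-loop scenario is understood correctly, it can be
restated as a toy model. This is only a sketch under assumptions, not
kernel code: the `pending` dict stands in for the table of request socks,
the counter stands in for a timestamp-based ISN, and the function names
(lookup, handle_syn, two_loops, to_completion) are hypothetical. It
compares "one loop per stage over the whole batch" with "each packet
handled to completion".

```python
from itertools import count

# Toy model only. `pending` stands in for the table of request socks and the
# counter stands in for a timestamp-based ISN. None of these are kernel APIs.

def lookup(pending, four_tuple):
    """Stage 1 (netfilter-like): is a request already pending for this tuple?"""
    return four_tuple in pending

def handle_syn(pending, four_tuple, found, clock):
    """Stage 2 (TCP-like): trust the stage-1 decision and create a request
    with a fresh ISN when stage 1 found nothing."""
    if not found:
        pending.setdefault(four_tuple, []).append(next(clock))

def two_loops(batch):
    """Run stage 1 over the whole batch, then stage 2 over the whole batch."""
    pending, clock = {}, count(1000)
    decisions = [lookup(pending, ft) for ft in batch]  # nothing inserted yet
    for ft, found in zip(batch, decisions):
        handle_syn(pending, ft, found, clock)
    return pending

def to_completion(batch):
    """Run stage 1 and stage 2 back to back for each packet."""
    pending, clock = {}, count(1000)
    for ft in batch:
        handle_syn(pending, ft, lookup(pending, ft), clock)
    return pending

# Two identical SYNs (same 4-tuple) arriving in the same batch.
syn = ("192.0.2.1", 40000, "192.0.2.2", 80)
batch = [syn, syn]
print(two_loops(batch))
print(to_completion(batch))
```

In this model two_loops() ends up with two pending requests carrying
different ISNs for the same 4-tuple, while to_completion() ends up with
one. Whether the real TPROXY path behaves like stage 1 here is exactly
the part that needs confirming.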
Could you please provide a more practical example or detailed test
steps to help me understand the reproduction scenario you mentioned?
Thank you very much!