[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0f970f70-abea-4b79-bf62-852853b0137e@orange.com>
Date: Sun, 25 Aug 2024 14:24:51 +0200
From: alexandre.ferrieux@...nge.com
To: luoxuanqiang <luoxuanqiang@...inos.cn>, <edumazet@...gle.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <fw@...len.de>,
<kuba@...nel.org>, <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<kuniyu@...zon.com>
Subject: Re: [PATCH net v2] Fix race for duplicate reqsk on identical SYN
On 17/06/2024 16:44, Alexandre Ferrieux wrote:
> On 17/06/2024 04:53, luoxuanqiang wrote:
>>
>> 在 2024/6/17 07:45, alexandre.ferrieux@...nge.com 写道:
>>> On 14/06/2024 12:26, luoxuanqiang wrote:
>>>> When bonding is configured in BOND_MODE_BROADCAST mode, if two identical
>>>> SYN packets are received at the same time and processed on different CPUs,
>>>> it can potentially create the same sk (sock) but two different reqsk
>>>> (request_sock) in tcp_conn_request().
>>>>
>>>> These two different reqsk will respond with two SYNACK packets, and since
>>>> the generation of the seq (ISN) incorporates a timestamp, the final two
>>>> SYNACK packets will have different seq values.
>>>>
>>>> The consequence is that when the Client receives and replies with an ACK
>>>> to the earlier SYNACK packet, we will reset(RST) it.
>>>>
>>>> ========================================================================
>>> This is close, but not identical, to a race we observed on a *single* CPU with
>>> the TPROXY iptables target, in the following situation:
>>>
>>> - two identical SYNs, sent one second apart from the same client socket,
>>> arrive back-to-back on the interface (due to network jitter)
>>>
>>> - they happen to be handled in the same batch of packet from one softirq
>>> name_your_nic_poll()
>>>
>>> - there, two loops run sequentially: one for netfilter (doing TPROXY), one
>>> for the network stack (doing TCP processing)
>>>
>>> - the first generates two distinct contexts for the two SYNs
>>>
>>> - the second respects these contexts and never gets a chance to merge them
>>>
>>> The result is exactly as you describe, but in this case there is no need for
>>> bonding,
>>> and everything happens in one single CPU, which is pretty ironic for a race.
>>> My uneducated feeling is that the two loops are the cause of a simulated
>>> parallelism, yielding the race. If each packet of the batch was handled
>>> "to completion" (full netfilter handling followed immediately by full network
>>> stack ingestion), the problem would not exist.
>>
>> Based on your explanation, I believe a
>> similar issue can occur on a single CPU if two SYN packets are processed
>> closely enough. However, apart from using bond3 mode and having them
>> processed on different CPUs to facilitate reproducibility, I haven't
>> found a good way to replicate it.
>>
>> Could you please provide a more practical example or detailed test
>> steps to help me understand the reproduction scenario you mentioned?
>> Thank you very much!
>
> To reproduce in my case, I just need the two SYNs to arrive back-to-back in the
> ingress buffer and get in the same softirq run. To reach this goal easily, you
> can set the interrupt coalescence to a large value (like several milliseconds),
> and on the emitter side, send them in rapid sequence from userland. If that's
> not enough, you can just send one and duplicate it with TEE.
Good news: as I suspected, your fix (ff46e3b44219 shipped in 6.10) DOES solve my
problem too !
As a consequence, this means the single-CPU scenario was exposed too, through
netfilter's peculiar "breadth-first" iteration approach. This gives extra weight
to the importance of your work.
TL;DR: thanks, kudos, congrats, and thanks !!!
____________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.
Powered by blists - more mailing lists