[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xhsmhil74m10c.mognet@vschneid.remote.csb>
Date: Wed, 18 Oct 2023 16:57:39 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: dccp@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
David Ahern <dsahern@...nel.org>,
Juri Lelli <juri.lelli@...hat.com>,
Tomas Glozar <tglozar@...hat.com>
Subject: Re: [RFC PATCH] tcp/dcpp: Un-pin tw_timer
On 16/10/23 17:40, Eric Dumazet wrote:
> On Mon, Oct 16, 2023 at 3:00 PM Valentin Schneider <vschneid@...hat.com> wrote:
>>
>> The TCP timewait timer is proving to be problematic for setups where scheduler
>> CPU isolation is achieved at runtime via cpusets (as opposed to statically via
>> isolcpus=domains).
>>
>> What happens there is a CPU goes through tcp_time_wait(), arming the time_wait
>> timer, then gets isolated. TCP_TIMEWAIT_LEN later, the timer fires, causing
>> interference for the now-isolated CPU. This is conceptually similar to the issue
>> described in
>> e02b93124855 ("workqueue: Unbind kworkers before sending them to exit()")
>>
>> Making the timer un-pinned would resolve this, as it would be queued onto
>> HK_FLAG_TIMER CPUs. It would Unfortunately go against
>> ed2e92394589 ("tcp/dccp: fix timewait races in timer handling")
>> as we'd need to arm the timer after the *hashdance() to not have it fire before
>> we've finished setting up the timewait_socket.
>>
>> However, looking into this, I cannot grok what race is fixed by having the timer
>> *armed* before the hashdance.
>
> That was because :
>
> 1) the timer could expire before we had a chance to set refcnt to
> a non zero value. I guess this is fine if we use an extra atomic decrement.
>
> OR
>
> 2) another cpu could find the TW and delete it (trying to cancel the
> tw_timer) before
> we could arm the timer. ( inet_twsk_deschedule_put() is using
> del_timer_sync() followed by inet_twsk_kill())
>
> Thus the tw timer would be armed for 60 seconds, then we would have to
> wait for the timer to really
> get rid of the tw structure.
>
> I think you also need to change inet_twsk_deschedule_put() logic ?
>
Gotcha, thank you for pointing it out.
>> Keep softirqs disabled, but make the timer un-pinned and arm it after the
>> hashdance. Remote CPUs may start using the timewait socket before the timer is
>> armed, but their execution of __inet_lookup_established() won't prevent the
>> arming of the timer.
>
> OK, I guess we can live with the following race :
>
> CPU0
>
> allocates a tw, insert it in hash table
>
> CPU1: finds the TW and removes it (timer
> cancel does nothing)
>
> CPU0
> arms a TW timer, lasting
>
Looks reasonable to me, I'll go write v2.
Thanks for the help!
Powered by blists - more mailing lists