[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xhsmh8r64bvjs.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Fri, 08 Dec 2023 15:47:35 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: dccp@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org, "David S.
Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Paolo
Abeni <pabeni@...hat.com>, David Ahern <dsahern@...nel.org>, Juri Lelli
<juri.lelli@...hat.com>, Tomas Glozar <tglozar@...hat.com>, Sebastian
Andrzej Siewior <bigeasy@...utronix.de>, Thomas Gleixner
<tglx@...utronix.de>
Subject: Re: [PATCH v2 1/2] tcp/dcpp: Un-pin tw_timer
On 23/11/23 17:32, Eric Dumazet wrote:
> Again, I think you missed some details.
>
> I am OOO for a few days, I do not have time to elaborate.
>
> You will need to properly track active timer by elevating
> tw->tw_refcnt, or I guarantee something wrong will happen.
I apologize if I'm being thick skulled, I've been staring at the code and
tracing on live systems and I still can't see the issue with refcounts.
The patch has the hashdance set the refcount to 4:
* - one reference for bhash chain.
* - one reference for ehash chain.
* - one reference for timer.
* - one reference for ourself (our caller will release it).
AFAICT, finding & using the socket via the ehash table comes with a refcount_inc
(e.g. __inet_lookup_established()).
Worst case, the lookup happens after programming the timer, and we get a
inet_twsk_deschedule_put(). This reduces the refcount by:
3 via inet_twsk_kill():
1 (sk_nulls_del_node_init_rcu())
1 (inet_twsk_bind_unhash())
1 (inet_twsk_put())
1 via inet_twsk_put()
IOW 4 total. So we can have:
tcp_time_wait()
inet_twsk_hashdance(); // refcount = 4
inet_twsk_schedule(); // timer armed
tcp_v4_rcv()
sk = __inet_lookup_skb(); // refcount = 5 (+1)
inet_twsk_deschedule_put(inet_twsk(sk));
inet_twsk_kill(tw) // refcount = 2 (-3)
inet_twsk_put(tw) // refcount = 1 (-1)
inet_twsk_put(tw) // refcount = 0 (-1)
__inet_hash_connect() can invoke inet_twsk_bind_unhash() by itself before
calling inet_twsk_deschedule_put(), but that just means it won't be done by
the latter, so the total refcount delta remains the same.
Thinking about it differently, this would mean that currently (without the
patch) another CPU can bring the refcount to 0 without disarming the timer,
because the patch is making the initial value one higher.
What am I missing?
Powered by blists - more mailing lists