Message-ID: <CANn89iJ5kHmksR=nGSMVjacuV0uqu5Hs0g1s343gvAM9Yf=+Bg@mail.gmail.com>
Date: Thu, 8 Jun 2023 13:54:09 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: "Duan,Muquan" <duanmuquan@...du.com>
Cc: "davem@...emloft.net" <davem@...emloft.net>, "dsahern@...nel.org" <dsahern@...nel.org>, 
	"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>, 
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tcp: fix connection reset due to tw hashdance race.

On Thu, Jun 8, 2023 at 1:24 PM Duan,Muquan <duanmuquan@...du.com> wrote:
>
> Besides trying to find the right tw sock, another idea is this: if a FIN segment finds the listener sock, just discard the segment, because this is obviously a bad case and the peer will retransmit it. Alternatively, for a FIN segment we could look up only the established hash table and discard the segment if no match is found.
>

Sure, please give the RFC number and section number that discusses
this point, and then we might consider this.

Just another reminder about TW : timewait sockets are "best effort".

Their allocation can fail, and /proc/sys/net/ipv4/tcp_max_tw_buckets
can limit their number, even down to 0.

Applications must be able to recover gracefully if a 4-tuple is reused too fast.
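
As a rough illustration of what recovering gracefully could look like on the application side, here is a minimal user-space sketch. Everything in it is an assumption made for illustration: the names attempt_fn, retry_on_reset, struct flaky and flaky_attempt are hypothetical, and treating ECONNRESET and ECONNREFUSED as retryable is a policy choice, not something this thread prescribes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback: one full connect-and-request attempt.
 * Returns 0 on success, or -1 with errno set on failure. */
typedef int (*attempt_fn)(void *ctx);

/* Assumption: a reset or refusal may only mean that the 4-tuple was
 * reused before the peer had usable TIME_WAIT state, so it is retryable. */
static int is_transient(int err)
{
    return err == ECONNRESET || err == ECONNREFUSED;
}

/* Retry fn up to max_attempts times, doubling the delay after each
 * transient failure. Returns 0 on success, -1 with errno preserved. */
static int retry_on_reset(attempt_fn fn, void *ctx,
                          int max_attempts, long base_delay_ms)
{
    long delay_ms = base_delay_ms;

    assert(max_attempts > 0);
    for (int i = 0; i < max_attempts; i++) {
        if (fn(ctx) == 0)
            return 0;

        int saved = errno;
        if (!is_transient(saved) || i == max_attempts - 1) {
            errno = saved;
            return -1;
        }

        struct timespec ts = { delay_ms / 1000,
                               (delay_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
        delay_ms *= 2;
    }
    return -1; /* not reached when max_attempts > 0 */
}

/* Stand-in for a real network attempt: fails failures_left times
 * with the configured errno value, then succeeds. */
struct flaky {
    int failures_left;
    int err;
    int calls;
};

static int flaky_attempt(void *ctx)
{
    struct flaky *f = ctx;

    f->calls++;
    if (f->failures_left > 0) {
        f->failures_left--;
        errno = f->err;
        return -1;
    }
    return 0;
}
```

A real attempt_fn would open a fresh socket on every call, so that a retry does not depend on any state left over from the failed connection.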

>
> On June 8, 2023, at 12:13 PM, Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Thu, Jun 8, 2023 at 5:59 AM Duan,Muquan <duanmuquan@...du.com> wrote:
>
>
> Hi, Eric,
>
> Thanks a lot for your explanation!
>
> Even if we add a reader lock, if the refcnt is set outside spin_lock()/spin_unlock(), then during the interval between spin_unlock() and refcnt_set(), other CPUs will see the tw sock with refcnt 0, and validation of the refcnt will fail.
>
> A suggestion: before the tw sock is added to the ehash table, it is already used by the tw timer and the bhash chain, so we can first set the refcnt to 2 before adding tw to the ehash table, or increment the refcnt one by one for the timer, bhash, and ehash. This avoids the refcnt validation failure on other CPUs.
>
> This reduces the frequency of the connection reset issue from once every 20 min to once every 180 min for our product. We may wait quite a long time before the best solution is ready; if this obvious defect is fixed now, userland applications can benefit from it.
>
> Looking forward to your opinions!
>
>
> Again, my opinion is that we need a proper fix, not workarounds.
>
> I will work on this a bit later.
>
> In the meantime you can apply locally your patch if you feel this is
> what you want.
>
>
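
The refcount window described in the quoted message can be modeled in user space with C11 atomics. This is only a toy sketch, not kernel code: struct obj, table_slot, lookup, publish_counted and publish_uncounted are hypothetical names, a single pointer stands in for a hash bucket, and the initial reference count is an arbitrary parameter. It shows why a lockless reader that refuses to take a reference on a zero count misses an object published before its count is set; it does not address whether setting the count early is a proper fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object: a reference count plus a lookup key. */
struct obj {
    atomic_int refcnt;
    int key;
};

/* Single-slot stand-in for a hash bucket (illustration only). */
static _Atomic(struct obj *) table_slot;

/* Take a reference only if the count is currently non-zero. */
static bool get_unless_zero(struct obj *o)
{
    int old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);

    while (old != 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak_explicit(&o->refcnt, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* Lockless lookup: a matching entry with refcnt 0 is treated as absent. */
static struct obj *lookup(int key)
{
    struct obj *o = atomic_load_explicit(&table_slot, memory_order_acquire);

    if (o && o->key == key && get_unless_zero(o))
        return o;
    return NULL;
}

/* Set the count first, then make the object visible to readers. */
static void publish_counted(struct obj *o, int initial_refs)
{
    assert(initial_refs > 0);
    atomic_store_explicit(&o->refcnt, initial_refs, memory_order_relaxed);
    atomic_store_explicit(&table_slot, o, memory_order_release);
}

/* Models the window: the object is visible while its count is still 0. */
static void publish_uncounted(struct obj *o)
{
    atomic_store_explicit(&o->refcnt, 0, memory_order_relaxed);
    atomic_store_explicit(&table_slot, o, memory_order_release);
}
```

In this model, a reader that runs after publish_uncounted() gets NULL from lookup() even though the entry is in the slot, which corresponds to the failed refcnt validation described in the quoted message.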
