Message-ID: <fa753eac-3dd4-40d0-861e-3768d2ec2ddd@redhat.com>
Date: Tue, 23 Sep 2025 09:45:11 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Jakub Sitnicki <jakub@...udflare.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Kuniyuki Iwashima <kuniyu@...gle.com>, Neal Cardwell <ncardwell@...gle.com>,
 kernel-team@...udflare.com, Lee Valentine <lvalentine@...udflare.com>
Subject: Re: [PATCH net-next v4 1/2] tcp: Update bind bucket state on port
 release

Hi,

Sorry for the latency; I got lost in pending threads.

On 9/16/25 3:14 PM, Jakub Sitnicki wrote:
> On Tue, Sep 16, 2025 at 12:14 PM +02, Paolo Abeni wrote:
>> On 9/13/25 12:09 PM, Jakub Sitnicki wrote:
>>> Today, once an inet_bind_bucket enters a state where fastreuse >= 0 or
>>> fastreuseport >= 0 after a socket is explicitly bound to a port, it remains
>>> in that state until all sockets are removed and the bucket is destroyed.
>>>
>>> In this state, the bucket is skipped during ephemeral port selection in
>>> connect(). For applications using a reduced ephemeral port
>>> range (IP_LOCAL_PORT_RANGE socket option), this can cause faster port
>>> exhaustion since blocked buckets are excluded from reuse.
>>>
>>> The reason the bucket state isn't updated on port release is unclear.
>>> Possibly a performance trade-off to avoid scanning bucket owners, or just
>>> an oversight.
>>>
>>> Fix it by recalculating the bucket state when a socket releases a port. To
>>> limit overhead, each inet_bind2_bucket stores its own (fastreuse,
>>> fastreuseport) state. On port release, only the relevant port-addr bucket
>>> is scanned, and the overall state is derived from these.
>>
>> I'm possibly lost, but I think that the bucket state could change
>> even after inet_bhash2_update_saddr(), but AFAICS it's not updated there.
> 
> Let me double check if I understand what you have in mind because now I
> also feel a bit lost :-)
> 
> We already update the bucket state in inet_bhash2_update_saddr(). I
> assume we are talking about the main body, not the early bailout path
> when the socket is not bound yet [1].
> 
> This code gets called only in the obscure (?) case when ip_dynaddr [2]
> sysctl is set, and we have a routing failure during connection setup
> phase (SYN-SENT).
> 
> In such a case, on source address update, the call to
> inet_bind2_bucket_destroy() will recalculate the port-addr bucket state,
> potentially "downgrading" it to (fastreuse=-1, fastreuseport=-1).
> 
> But if the "downgrade" happens, it changes nothing for the port bucket
> state, as we are about to re-add the socket into another port-addr
> bucket.

This was indeed the path I was looking for. I lost track of the fact
that the port bucket affected by the removal and the re-add is the same,
so its state does not change.
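
To spell the argument out, here is a toy user-space model, not the kernel
code. All toy_* names and the explicit_bind flag are hypothetical, and the
derivation rule is an assumption: a port-addr bucket is at (-1, -1) only
when none of its owners was explicitly bound, and the port bucket is at
(-1, -1) only when every non-empty port-addr bucket is. The value 0 stands
in for any state >= 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SOCKS 8
#define TOY_MAX_ADDRS 4

/* Hypothetical socket: records only how it got its port. */
struct toy_sock {
	bool explicit_bind;	/* true: bind(); false: auto-bound in connect() */
};

/* Hypothetical stand-in for a port-addr bucket (inet_bind2_bucket). */
struct toy_bind2_bucket {
	const struct toy_sock *owners[TOY_MAX_SOCKS];
	size_t nr_owners;
	int fastreuse;
	int fastreuseport;
};

/* Hypothetical stand-in for a port bucket (inet_bind_bucket). */
struct toy_bind_bucket {
	struct toy_bind2_bucket addr[TOY_MAX_ADDRS];
	int fastreuse;
	int fastreuseport;
};

/* Assumed rule: any explicitly bound owner keeps the state >= 0. */
void toy_bind2_recalc(struct toy_bind2_bucket *tb2)
{
	int state = -1;
	size_t i;

	for (i = 0; i < tb2->nr_owners; i++)
		if (tb2->owners[i]->explicit_bind)
			state = 0;
	tb2->fastreuse = state;
	tb2->fastreuseport = state;
}

/* Assumed rule: derive the port bucket state from its port-addr buckets. */
void toy_bind_recalc(struct toy_bind_bucket *tb)
{
	int state = -1;
	size_t i;

	for (i = 0; i < TOY_MAX_ADDRS; i++) {
		const struct toy_bind2_bucket *tb2 = &tb->addr[i];

		if (tb2->nr_owners &&
		    (tb2->fastreuse != -1 || tb2->fastreuseport != -1))
			state = 0;
	}
	tb->fastreuse = state;
	tb->fastreuseport = state;
}

void toy_bind2_add(struct toy_bind2_bucket *tb2, const struct toy_sock *sk)
{
	assert(tb2->nr_owners < TOY_MAX_SOCKS);
	tb2->owners[tb2->nr_owners++] = sk;
	toy_bind2_recalc(tb2);
}

void toy_bind2_del(struct toy_bind2_bucket *tb2, const struct toy_sock *sk)
{
	size_t i;

	for (i = 0; i < tb2->nr_owners; i++) {
		if (tb2->owners[i] == sk) {
			/* Fill the hole with the last owner. */
			tb2->nr_owners--;
			tb2->owners[i] = tb2->owners[tb2->nr_owners];
			break;
		}
	}
	toy_bind2_recalc(tb2);
}

/* Take a port: update the port-addr bucket, then the port bucket. */
void toy_bind(struct toy_bind_bucket *tb, size_t idx, const struct toy_sock *sk)
{
	toy_bind2_add(&tb->addr[idx], sk);
	toy_bind_recalc(tb);
}

/* Release a port: update the port-addr bucket, then the port bucket. */
void toy_release_port(struct toy_bind_bucket *tb, size_t idx,
		      const struct toy_sock *sk)
{
	toy_bind2_del(&tb->addr[idx], sk);
	toy_bind_recalc(tb);
}

/*
 * Source address change: the socket moves between two port-addr buckets
 * of the same port bucket, so the port bucket itself is left untouched.
 */
void toy_update_saddr(struct toy_bind_bucket *tb, size_t from, size_t to,
		      const struct toy_sock *sk)
{
	toy_bind2_del(&tb->addr[from], sk);
	toy_bind2_add(&tb->addr[to], sk);
}
```

In this model, moving an explicitly bound socket with toy_update_saddr()
may downgrade the old port-addr bucket, but a full toy_bind_recalc() of the
port bucket afterwards gives the same result as before the move, because
the same set of sockets still hangs off that port.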

It's clear now that you pointed that out, thanks!

Paolo

