[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <94d400d8-cb71-9f3a-32ad-a2492c1a5bd8@163.com>
Date: Fri, 22 Mar 2024 11:16:09 +0800
From: Jianguo Wu <wujianguo106@....com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: davem@...emloft.net, edumazet@...gle.com, netdev@...r.kernel.org
Subject: Re: [PATCH] tcp: Fix inet_bind2_bucket_match_addr_any() regression
Hi Kuniyuki,
Thanks for your reply!
On 2024/3/21 12:55, Kuniyuki Iwashima wrote:
> Hi,
>
> Thanks for the patch.
>
> From: Jianguo Wu <wujianguo106@....com>
> Date: Thu, 21 Mar 2024 11:02:36 +0800
>> From: Jianguo Wu <wujianguo@...natelecom.cn>
>>
>> If we bind() a TCPv4 socket to 0.0.0.0:8090, then bind() a TCPv6(ipv6only) socket
>
> Please wrap each line at <75 characters except for logs/output like below.
>
OK.
>
>> to :::8090, both without SO_REUSEPORT, then bind() 127.0.0.1:8090, it should fail
>
> [::]:8090 is easier to read and the recommended way.
> https://datatracker.ietf.org/doc/html/rfc5952#section-6
>
> But please keep the netstat output as is.
>
>> but now succeeds. like this:
>> tcp 0 0 127.0.0.1:8090 0.0.0.0:* LISTEN
>> tcp 0 0 0.0.0.0:8090 0.0.0.0:* LISTEN
>> tcp6 0 0 :::8090 :::* LISTEN
>>
>> bind() 0.0.0.0:8090, :::8090 and ::1:8090 are all fail.
>
> What do you mean by all fail ?
> At least, [::1]:8090 would fail with the current code in this case.
In my test, 127.0.0.1:8090 succeeds, 0.0.0.0:8090, [::]:8090 and [::1]:8090 are all fail
>
>
>> But if we bind() a TCPv6(ipv6only) socket to :::8090 first, then bind() a TCPv4
>> socket to 0.0.0.0:8090, then bind() 127.0.0.1:8090, 0.0.0.0:8090, :::8090 and ::1:8090 are all fail.
>>
>> When bind() 127.0.0.1:8090, inet_bind2_bucket_match_addr_any() will return true as tb->addr_type == IPV6_ADDR_ANY,
>
> Let's use tb2 here for inet_bind2_bucket.. yes it's not consistent
> in some functions like inet_bind2_bucket_match_addr_any() though.
yes, inet_bind2_bucket_match_addr_any() use tb, so I use tb here.
>
>
>> and tb is refer to the TCPv6 socket(:::8090), then inet_bhash2_conflict() return false, That is, there is no conflict,
>
> Also make it clear that the TCPv6 socket is ipv6only one.
>
>
>> so bind() succeeds.
>>
>> inet_bhash2_addr_any_conflict()
>> {
>> inet_bind_bucket_for_each(tb2, &head2->chain)
>> // tb2 is IPv6
>> if (inet_bind2_bucket_match_addr_any(tb2, net, port, l3mdev, sk))
>> break;
>>
>> // inet_bhash2_conflict() return false
>> if (tb2 && inet_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok,
>> reuseport_ok)) {
>> spin_unlock(&head2->lock);
>> return true;
>> }
>>
>> }
>>
>> Fixes: 5a22bba13d01 ("tcp: Save address type in inet_bind2_bucket.")
>
> This is not the commit that introduced the regression.
I will remove this.
>
> Also, you need Signed-off-by tag here.
OK.
>
>
>> ---
>> net/ipv4/inet_hashtables.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
>> index 7498af320164..3eeaca8a113f 100644
>> --- a/net/ipv4/inet_hashtables.c
>> +++ b/net/ipv4/inet_hashtables.c
>> @@ -830,8 +830,8 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
>> return false;
>>
>> #if IS_ENABLED(CONFIG_IPV6)
>> - if (tb->addr_type == IPV6_ADDR_ANY)
>> - return true;
>> + if (sk->sk_family == AF_INET6)
>> + return tb->addr_type == IPV6_ADDR_ANY;
>
> This fix is not correct and will break v4-mapped-v6 address cases.
> You can run bind_wildcard under the selftest directory.
>> Probably we need v6_only bit in tb2 and should add some test cases
> in the selftest.
How about this?
I add a new field ipv6_only to struct inet_bind2_bucket{}
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 7f1b38458743..fb7c250a663b 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -96,6 +96,7 @@ struct inet_bind2_bucket {
int l3mdev;
unsigned short port;
#if IS_ENABLED(CONFIG_IPV6)
+ bool ipv6_only;
unsigned short addr_type;
struct in6_addr v6_rcv_saddr;
#define rcv_saddr v6_rcv_saddr.s6_addr32[3]
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index cf88eca5f1b4..5fc749f8f2b1 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -110,6 +110,7 @@ static void inet_bind2_bucket_init(struct inet_bind2_bucket *tb2,
tb2->port = tb->port;
#if IS_ENABLED(CONFIG_IPV6)
BUILD_BUG_ON(USHRT_MAX < (IPV6_ADDR_ANY | IPV6_ADDR_MAPPED));
+ tb2->ipv6_only = ipv6_only_sock(sk);
if (sk->sk_family == AF_INET6) {
tb2->addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
tb2->v6_rcv_saddr = sk->sk_v6_rcv_saddr;
@@ -831,7 +832,8 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
#if IS_ENABLED(CONFIG_IPV6)
if (tb->addr_type == IPV6_ADDR_ANY)
- return true;
+ if (sk->sk_family == AF_INET6 || !tb->ipv6_only)
+ return true;
if (tb->addr_type != IPV6_ADDR_MAPPED)
return false;
>
>
>>
>> if (tb->addr_type != IPV6_ADDR_MAPPED)
>> return false;
Powered by blists - more mailing lists