[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4bd122d2-d606-b71e-dbe7-63fa293f0a73@huawei.com>
Date: Tue, 1 Nov 2022 15:08:15 +0800
From: "Ziyang Xuan (William)" <william.xuanziyang@...wei.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Joanne Koong <joannelkoong@...il.com>,
Martin KaFai Lau <martin.lau@...nel.org>,
Mat Martineau <mathew.j.martineau@...ux.intel.com>,
Paolo Abeni <pabeni@...hat.com>
CC: Kuniyuki Iwashima <kuni1840@...il.com>, <netdev@...r.kernel.org>
Subject: Re: [RFC] bhash2 and WARN_ON() for inconsistent sk saddr.
Hello Kuniyuki Iwashima,
> Hi,
>
> I want to discuss bhash2 and WARN_ON() being fired every day this month
> on my syzkaller instance without repro.
>
> WARNING: CPU: 0 PID: 209 at net/ipv4/inet_connection_sock.c:548 inet_csk_get_port (net/ipv4/inet_connection_sock.c:548 (discriminator 1))
> ...
> inet_csk_listen_start (net/ipv4/inet_connection_sock.c:1205)
> inet_listen (net/ipv4/af_inet.c:228)
> __sys_listen (net/socket.c:1810)
> __x64_sys_listen (net/socket.c:1819 net/socket.c:1817 net/socket.c:1817)
> do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
>
> For the very first implementation of bhash2, there was a similar report
> hitting the same WARN_ON(). The fix was to update the bhash2 bucket when
> the kernel changes sk->sk_rcv_saddr from INADDR_ANY. Then, syzbot has a
> repro, so we can indeed confirm that it no longer triggers the warning on
> the latest kernel.
>
> https://lore.kernel.org/netdev/0000000000003f33bc05dfaf44fe@google.com/
>
> However, Mat reported at that time that there were at least two variants,
> the latter being the same as mine.
>
> https://lore.kernel.org/netdev/4bae9df4-42c1-85c3-d350-119a151d29@linux.intel.com/
> https://lore.kernel.org/netdev/23d8e9f4-016-7de1-9737-12c3146872ca@linux.intel.com/
>
> This week I started looking into this issue and finally figured out
> why we could not catch all cases with a single repro.
>
Provide another C repro for analysis. See the attachment.
>
> Now, I'm thinking bhash2 bucket needs a refcnt not to be freed while
> refcnt is greater than 1. And we need to change the conflict logic
> so that the kernel ignores empty bhash2 bucket. Such changes could
> be big for the net tree, but the next LTS will likely be v6.1 which
> has bhash2.
>
> What do you think is the best way to fix the issue?
>
> Thank you.
>
>
> ---8<---
> diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> index 713b7b8dad7e..40640c26680e 100644
> --- a/net/dccp/ipv4.c
> +++ b/net/dccp/ipv4.c
> @@ -157,6 +157,8 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> * This unhashes the socket and releases the local port, if necessary.
> */
> dccp_set_state(sk, DCCP_CLOSED);
> + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> + inet_reset_saddr(sk);
> ip_rt_put(rt);
> sk->sk_route_caps = 0;
> inet->inet_dport = 0;
> diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> index e57b43006074..626166cb6d7e 100644
> --- a/net/dccp/ipv6.c
> +++ b/net/dccp/ipv6.c
> @@ -985,6 +985,8 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
>
> late_failure:
> dccp_set_state(sk, DCCP_CLOSED);
> + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> + inet_reset_saddr(sk);
> __sk_dst_reset(sk);
> failure:
> inet->inet_dport = 0;
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 7a250ef9d1b7..834245da1e95 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -343,6 +343,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
> * if necessary.
> */
> tcp_set_state(sk, TCP_CLOSE);
> + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> + inet_reset_saddr(sk);
> ip_rt_put(rt);
> sk->sk_route_caps = 0;
> inet->inet_dport = 0;
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 2a3f9296df1e..81b396e5cf79 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -359,6 +359,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
>
> late_failure:
> tcp_set_state(sk, TCP_CLOSE);
> + if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
> + inet_reset_saddr(sk);
> failure:
> inet->inet_dport = 0;
> sk->sk_route_caps = 0;
> ---8<---
> .
>
View attachment "warning_on_for_bhash2.c" of type "text/plain" (1039 bytes)
Powered by blists - more mailing lists