[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250402072010.GA110656@j66a10360.sqa.eu95>
Date: Wed, 2 Apr 2025 15:20:10 +0800
From: "D. Wythe" <alibuda@...ux.alibaba.com >
To: Wang Liang <wangliang74@...wei.com>
Cc: Paolo Abeni <pabeni@...hat.com>, wenjia@...ux.ibm.com,
jaka@...ux.ibm.com, alibuda@...ux.alibaba.com,
tonylu@...ux.alibaba.com, guwen@...ux.alibaba.com,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
horms@...nel.org, ubraun@...ux.vnet.ibm.com, yanjun.zhu@...ux.dev,
yuehaibing@...wei.com, zhangchangzhong@...wei.com,
linux-rdma@...r.kernel.org, linux-s390@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net/smc: fix general protection fault in
__smc_diag_dump
On Wed, Apr 02, 2025 at 10:37:24AM +0800, Wang Liang wrote:
>
> 在 2025/4/1 19:01, Paolo Abeni 写道:
> >On 3/31/25 10:10 AM, Wang Liang wrote:
> >>diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> >>index 3e6cb35baf25..454801188514 100644
> >>--- a/net/smc/af_smc.c
> >>+++ b/net/smc/af_smc.c
> >>@@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
> >> sk->sk_protocol = protocol;
> >> WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
> >> WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
> >>+ smc->clcsock = NULL;
> >> INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
> >> INIT_WORK(&smc->connect_work, smc_connect_work);
> >> INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
> >The syzkaller report has a few reproducers, have you tested this? AFAICS
> >the smc socket is already zeroed on allocation by sk_alloc().
>
>
> Yes, I test it by the C repro:
> https://syzkaller.appspot.com/text?tag=ReproC&x=13d2dc98580000
>
> The C repro is provided by the 2025/02/27 15:16 crash from
> https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364
>
> After apply my patch, the crash no longer happens when running the C repro.
>
> "the smc socket is already zeroed on allocation by sk_alloc()", That
> is right.
> However, smc->clcsock may be modified indirectly in inet6_create().
> The process like this:
>
> __sys_socket
> __sys_socket_create
> sock_create
> __sock_create
> # pf->create
> inet6_create
> // init smc->clcsock = 0
> sk = sk_alloc()
>
> // set smc->clcsock to invalid address
> inet = inet_sk(sk);
> inet_assign_bit(IS_ICSK, sk, INET_PROTOSW_ICSK & answer_flags);
> inet6_set_bit(MC6_LOOP, sk);
> inet6_set_bit(MC6_ALL, sk);
>
> smc_inet_init_sock
> smc_sk_init
> // add sk to smc_hash
> smc_hash_sk
> sk_add_node(sk, head);
> smc_create_clcsk
> // set smc->clcsock
> sock_create_kern(..., &smc->clcsock);)
>
> So initialize smc->clcsock to NULL explicitly in smc_sk_init() can fix
> this crash scene. If the problem can be reproduced after this patch, I
> guess it is not the same reason, and fix it by another patch is more
> appropriate.
>
This is actually because the current smc_sock is not an inet_sock,
leading to two modules simultaneously modifying the same offset in
memory but interpreting its structure differently. I previously proposed
embedding an inet(6)_sock at the beginning of smc_sock, but the
community had some objections...
I'm not sure on the community's current stance on this matter, but if a
fix is absolutely necessary, my recommendation would still be to embed
an inet(6)_sock within the smc_sock structure
D.
> >
> >/P
> >
> >
Powered by blists - more mailing lists