[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fd31d24b-8a34-4da5-b7ed-3ebf9aa30d1d@huawei.com>
Date: Thu, 3 Apr 2025 15:09:13 +0800
From: Wang Liang <wangliang74@...wei.com>
To: "D. Wythe" <alibuda@...ux.alibaba.com>
CC: Paolo Abeni <pabeni@...hat.com>, <wenjia@...ux.ibm.com>,
<jaka@...ux.ibm.com>, <tonylu@...ux.alibaba.com>, <guwen@...ux.alibaba.com>,
<davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<horms@...nel.org>, <ubraun@...ux.vnet.ibm.com>, <yanjun.zhu@...ux.dev>,
<yuehaibing@...wei.com>, <zhangchangzhong@...wei.com>,
<linux-rdma@...r.kernel.org>, <linux-s390@...r.kernel.org>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] net/smc: fix general protection fault in
__smc_diag_dump
在 2025/4/2 15:20, D. Wythe 写道:
> On Wed, Apr 02, 2025 at 10:37:24AM +0800, Wang Liang wrote:
>> 在 2025/4/1 19:01, Paolo Abeni 写道:
>>> On 3/31/25 10:10 AM, Wang Liang wrote:
>>>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>>>> index 3e6cb35baf25..454801188514 100644
>>>> --- a/net/smc/af_smc.c
>>>> +++ b/net/smc/af_smc.c
>>>> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>>>> sk->sk_protocol = protocol;
>>>> WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>>>> WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
>>>> + smc->clcsock = NULL;
>>>> INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>>>> INIT_WORK(&smc->connect_work, smc_connect_work);
>>>> INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
>>> The syzkaller report has a few reproducers, have you tested this? AFAICS
>>> the smc socket is already zeroed on allocation by sk_alloc().
>>
>> Yes, I test it by the C repro:
>> https://syzkaller.appspot.com/text?tag=ReproC&x=13d2dc98580000
>>
>> The C repro is provided by the 2025/02/27 15:16 crash from
>> https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364
>>
>> After apply my patch, the crash no longer happens when running the C repro.
>>
>> "the smc socket is already zeroed on allocation by sk_alloc()", That
>> is right.
>> However, smc->clcsock may be modified indirectly in inet6_create().
>> The process like this:
>>
>> __sys_socket
>> __sys_socket_create
>> sock_create
>> __sock_create
>> # pf->create
>> inet6_create
>> // init smc->clcsock = 0
>> sk = sk_alloc()
>>
>> // set smc->clcsock to invalid address
>> inet = inet_sk(sk);
>> inet_assign_bit(IS_ICSK, sk, INET_PROTOSW_ICSK & answer_flags);
>> inet6_set_bit(MC6_LOOP, sk);
>> inet6_set_bit(MC6_ALL, sk);
>>
>> smc_inet_init_sock
>> smc_sk_init
>> // add sk to smc_hash
>> smc_hash_sk
>> sk_add_node(sk, head);
>> smc_create_clcsk
>> // set smc->clcsock
>> sock_create_kern(..., &smc->clcsock);)
>>
>> So initialize smc->clcsock to NULL explicitly in smc_sk_init() can fix
>> this crash scene. If the problem can be reproduced after this patch, I
>> guess it is not the same reason, and fix it by another patch is more
>> appropriate.
>>
> This is actually because the current smc_sock is not an inet_sock,
> leading to two modules simultaneously modifying the same offset in
> memory but interpreting its structure differently. I previously proposed
> embedding an inet(6)_sock at the beginning of smc_sock, but the
> community had some objections...
>
> I'm not sure on the community's current stance on this matter, but if a
> fix is absolutely necessary, my recommendation would still be to embed
> an inet(6)_sock within the smc_sock structure
>
> D.
At present, I think initializing the smc in smc_sk_init() may be the
most simple and effective method. :P
>
>>> /P
>>>
>>>
Powered by blists - more mailing lists