Message-ID: <ecc8b532-4e80-b7bd-3621-78cd55fd48fa@huawei.com>
Date: Tue, 22 Nov 2022 11:28:28 +0800
From: wangyufen <wangyufen@...wei.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Dmitry Vyukov <dvyukov@...gle.com>
CC: syzbot <syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com>,
Leon Romanovsky <leon@...nel.org>, <chenzhongjin@...wei.com>,
RDMA mailing list <linux-rdma@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
<syzkaller-bugs@...glegroups.com>,
Zhu Yanjun <zyjzyj2000@...il.com>,
Bob Pearson <rpearsonhpe@...il.com>
Subject: Re: [syzbot] unregister_netdevice: waiting for DEV to become free (7)
On 2022/11/22 10:13, Jason Gunthorpe wrote:
> On Fri, Nov 18, 2022 at 02:28:53PM +0100, Dmitry Vyukov wrote:
>> On Fri, 18 Nov 2022 at 12:39, syzbot
>> <syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com> wrote:
>>>
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit: 9c8774e629a1 net: eql: Use kzalloc instead of kmalloc/memset
>>> git tree: net-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000
>>>
>>> Bisection is inconclusive: the issue happens on the oldest tested release.
>>>
>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
>>> final oops: https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com
>>>
>>> iwpm_register_pid: Unable to send a nlmsg (client = 2)
>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>> unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
>>
>> +RDMA maintainers
>>
>> There are 4 reproducers and all contain:
>>
>> r0 = socket$nl_rdma(0x10, 0x3, 0x14)
>> sendmsg$RDMA_NLDEV_CMD_NEWLINK(...)
>>
>> Also the preceding print looks related (a bug in the error handling
>> path there?):
>>
>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>
> I'm pretty sure it is an rxe bug
>
> ib_device_set_netdev() will hold the netdev until the caller destroys
> the ib_device
>
> rxe calls it during rxe_register_device() because the user asked for a
> stacked ib_device on top of the netdev
>
> Presumably rxe needs to have a notifier to also self destroy the rxe
> device if the underlying net device is to be destroyed?
>
> Can someone from rxe check into this?
The following patch may fix the issue:
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4049,6 +4049,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 	return 0;
 err:
 	id_priv->backlog = 0;
+	if (id_priv->cma_dev)
+		cma_release_dev(id_priv);
+
 	/*
 	 * All the failure paths that lead here will not allow the req_handler's
 	 * to have run.
The cause is as follows:
rdma_listen()
rdma_bind_addr()
cma_acquire_dev_by_src_ip()
cma_attach_to_dev()
_cma_attach_to_dev()
cma_dev_get()
cma_check_port()
<-- The return value is nonzero, so goto err
err:
<-- The error handling here is missing a call to cma_release_dev().
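To make the suspected leak easier to see, here is a minimal userspace sketch of the same pattern. It is not the kernel code: toy_dev, toy_id, toy_attach(), toy_release() and toy_listen() are hypothetical stand-ins for the cma device, id_priv, cma_attach_to_dev(), cma_release_dev() and rdma_listen(), and the port_busy flag only simulates the failing port check (error -98, which is EADDRINUSE on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real cma.c types. */
struct toy_dev {
    int refcount;            /* models the reference that keeps the device busy */
};

struct toy_id {
    struct toy_dev *dev;     /* models id_priv->cma_dev */
    int backlog;             /* models id_priv->backlog */
};

/* Models cma_attach_to_dev(): take a reference and remember the device. */
static void toy_attach(struct toy_id *id, struct toy_dev *dev)
{
    dev->refcount++;
    id->dev = dev;
}

/* Models cma_release_dev(): drop the reference and forget the device. */
static void toy_release(struct toy_id *id)
{
    assert(id->dev != NULL && id->dev->refcount > 0);
    id->dev->refcount--;
    id->dev = NULL;
}

/*
 * Models rdma_listen(): the bind step attaches to a device, then a port
 * check may fail. port_busy simulates that failure; with_fix selects
 * whether the err path releases the device, as the proposed patch does.
 */
static int toy_listen(struct toy_id *id, struct toy_dev *dev, int backlog,
                      int port_busy, int with_fix)
{
    int ret;

    toy_attach(id, dev);
    id->backlog = backlog;

    if (port_busy) {
        ret = -EADDRINUSE;
        goto err;
    }
    return 0;
err:
    id->backlog = 0;
    if (with_fix && id->dev)
        toy_release(id);
    return ret;
}
```

In this model, calling toy_listen() with port_busy set and with_fix clear returns -EADDRINUSE but leaves the device reference count at 1, which corresponds to the "Usage count" that never drops. With with_fix set, the count goes back to 0 on the same failure.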
>
> Jason