Date:   Tue, 22 Nov 2022 11:28:28 +0800
From:   wangyufen <wangyufen@...wei.com>
To:     Jason Gunthorpe <jgg@...pe.ca>, Dmitry Vyukov <dvyukov@...gle.com>
CC:     syzbot <syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com>,
        Leon Romanovsky <leon@...nel.org>, <chenzhongjin@...wei.com>,
        RDMA mailing list <linux-rdma@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
        <syzkaller-bugs@...glegroups.com>,
        Zhu Yanjun <zyjzyj2000@...il.com>,
        Bob Pearson <rpearsonhpe@...il.com>
Subject: Re: [syzbot] unregister_netdevice: waiting for DEV to become free (7)



On 2022/11/22 10:13, Jason Gunthorpe wrote:
> On Fri, Nov 18, 2022 at 02:28:53PM +0100, Dmitry Vyukov wrote:
>> On Fri, 18 Nov 2022 at 12:39, syzbot
>> <syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com> wrote:
>>>
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    9c8774e629a1 net: eql: Use kzalloc instead of kmalloc/memset
>>> git tree:       net-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17bf6cc8f00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=9eb259db6b1893cf
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
>>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1136d592f00000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1193ae64f00000
>>>
>>> Bisection is inconclusive: the issue happens on the oldest tested release.
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=167c33a2f00000
>>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=157c33a2f00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=117c33a2f00000
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+5e70d01ee8985ae62a3b@...kaller.appspotmail.com
>>>
>>> iwpm_register_pid: Unable to send a nlmsg (client = 2)
>>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
>>> unregister_netdevice: waiting for vlan0 to become free. Usage count = 2
>>
>> +RDMA maintainers
>>
>> There are 4 reproducers and all contain:
>>
>> r0 = socket$nl_rdma(0x10, 0x3, 0x14)
>> sendmsg$RDMA_NLDEV_CMD_NEWLINK(...)
>>
>> Also the preceding print looks related (a bug in the error handling
>> path there?):
>>
>> infiniband syj1: RDMA CMA: cma_listen_on_dev, error -98
> 
> I'm pretty sure it is an rxe bug
> 
> ib_device_set_netdev() will hold the netdev until the caller destroys
> the ib_device
> 
> rxe calls it during rxe_register_device() because the user asked for a
> stacked ib_device on top of the netdev
> 
> Presumably rxe needs to have a notifier to also self destroy the rxe
> device if the underlying net device is to be destroyed?
> 
> Can someone from rxe check into this?

The following patch may fix the issue:

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4049,6 +4049,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
         return 0;
  err:
         id_priv->backlog = 0;
+       if (id_priv->cma_dev)
+               cma_release_dev(id_priv);
+
         /*
          * All the failure paths that lead here will not allow the req_handler's
          * to have run.



The cause is as follows:

rdma_listen()
   rdma_bind_addr()
     cma_acquire_dev_by_src_ip()
       cma_attach_to_dev()
         _cma_attach_to_dev()
           cma_dev_get()    <-- takes a reference on the cma device

   cma_check_port()
   <-- returns non-zero, goto err

err:
<-- the error path here never calls cma_release_dev(), so the
    reference taken above is never dropped.
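
To illustrate the pattern outside the kernel, here is a minimal user-space sketch of the leak. It is a toy model, not the real cma.c code: every toy_* name is hypothetical, the bind step is done unconditionally, and -EADDRINUSE is chosen only because it matches the -98 in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real structures. */
struct toy_dev {
	int refcount;		/* models the device usage count */
};

struct toy_id {
	struct toy_dev *dev;	/* models id_priv->cma_dev */
	int backlog;
};

static void toy_dev_get(struct toy_dev *dev)
{
	dev->refcount++;
}

static void toy_dev_put(struct toy_dev *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Models the bind step: attach the id to a device and take a reference. */
static void toy_bind(struct toy_id *id, struct toy_dev *dev)
{
	toy_dev_get(dev);
	id->dev = dev;
}

/* Models cma_release_dev(): detach the id and drop the reference. */
static void toy_release_dev(struct toy_id *id)
{
	toy_dev_put(id->dev);
	id->dev = NULL;
}

/*
 * Models the listen path. port_busy forces the port-check failure;
 * release_on_error selects the old (leaky) or the patched error path.
 */
static int toy_listen(struct toy_id *id, struct toy_dev *dev, int backlog,
		      int port_busy, int release_on_error)
{
	int ret;

	toy_bind(id, dev);
	id->backlog = backlog;

	if (port_busy) {
		ret = -EADDRINUSE;
		goto err;
	}
	return 0;

err:
	id->backlog = 0;
	if (release_on_error && id->dev)
		toy_release_dev(id);
	return ret;
}

/* Returns the device refcount left behind after one failed listen. */
static int toy_refs_after_failed_listen(int release_on_error)
{
	struct toy_dev dev = { .refcount = 1 };	/* one pre-existing holder */
	struct toy_id id = { NULL, 0 };

	(void)toy_listen(&id, &dev, 8, 1, release_on_error);
	return dev.refcount;
}
```

In this model, toy_refs_after_failed_listen(0) leaves the extra reference in place, while toy_refs_after_failed_listen(1) returns the count to its starting value, which is the effect the patch is meant to have on the real error path.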

> 
> Jason
