Message-ID: <1f8802d1-ea15-49b6-b9d9-1e53fb76a264@linux.dev>
Date: Wed, 5 Nov 2025 12:10:42 -0800
From: "Yanjun.Zhu" <yanjun.zhu@...ux.dev>
To: Leon Romanovsky <leon@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>,
 syzbot <syzbot+b0da83a6c0e2e2bddbd4@...kaller.appspotmail.com>
Cc: linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [rdma?] WARNING in gid_table_release_one (3)


On 11/5/25 10:50 AM, Leon Romanovsky wrote:
>
> On Wed, Nov 5, 2025, at 19:14, Jason Gunthorpe wrote:
>> On Wed, Nov 05, 2025 at 09:06:04AM -0800, syzbot wrote:
>>> Hello,
>>>
>>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>>> WARNING in gid_table_release_one
>>>
>>> ------------[ cut here ]------------
>>> GID entry ref leak for dev syz1 index 2 ref=363, state: 3
>>> WARNING: CPU: 1 PID: 50 at drivers/infiniband/core/cache.c:827 release_gid_table drivers/infiniband/core/cache.c:824 [inline]
>>> WARNING: CPU: 1 PID: 50 at drivers/infiniband/core/cache.c:827 gid_table_release_one+0x5ae/0x6c0 drivers/infiniband/core/cache.c:904
>>> Modules linked in:
>>> CPU: 1 UID: 0 PID: 50 Comm: kworker/u8:3 Not tainted syzkaller #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
>>> Workqueue: ib-unreg-wq ib_unregister_work
>>> RIP: 0010:release_gid_table drivers/infiniband/core/cache.c:824 [inline]
>>> RIP: 0010:gid_table_release_one+0x5ae/0x6c0 drivers/infiniband/core/cache.c:904
>>> Code: e8 03 0f b6 04 28 84 c0 0f 85 cc 00 00 00 44 8b 03 48 c7 c7 60 7c 2b 8c 48 8b 74 24 28 44 89 fa 8b 4c 24 50 e8 73 e7 35 f9 90 <0f> 0b 90 90 44 8b 74 24 04 4c 8b 7c 24 20 4c 8b 64 24 48 e9 15 fe
>>> RSP: 0018:ffffc90000bb78f8 EFLAGS: 00010246
>>> RAX: 124fa0acf3bf2700 RBX: ffff8880268c1990 RCX: ffff888020289e40
>>> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000002
>>> RBP: dffffc0000000000 R08: 0000000000000003 R09: 0000000000000004
>>> R10: dffffc0000000000 R11: fffffbfff1b7a678 R12: ffff88802ed4e2d8
>>> R13: 00000000000001a8 R14: ffff88806a158010 R15: 0000000000000002
>>> FS:  0000000000000000(0000) GS:ffff88812646a000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00005555712ce808 CR3: 000000006b6c8000 CR4: 00000000003526f0
>>> Call Trace:
>>>   <TASK>
>>>   ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
>>>   device_release+0x9c/0x1c0 drivers/base/core.c:-1
>>>   kobject_cleanup lib/kobject.c:689 [inline]
>>>   kobject_release lib/kobject.c:720 [inline]
>>>   kref_put include/linux/kref.h:65 [inline]
>>>   kobject_put+0x22b/0x480 lib/kobject.c:737
>>>   process_one_work kernel/workqueue.c:3263 [inline]
>>>   process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
>>>   worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
>>>   kthread+0x711/0x8a0 kernel/kthread.c:463
>>>   ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
>>>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>>   </TASK>
>>>
>>>
>>> Tested on:
>>>
>>> commit:         ad2cc78b RDMA/core: Fix WARNING in gid_table_release_one
>>> git tree:       https://github.com/zhuyj/linux.git v6.17_fix_gid_table_release_one
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11dfa17c580000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=2c614fa9e6f5bdc1
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
>>> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>> I think this disproves the theory that the gid is sitting in a
>> work queue waiting to be cleaned up..
> Yes, this makes more sense to me when there are multiple ib_wq flushes.
>> So we still need to find out what is holding on to the reference...

It’s still unclear what is holding the reference. From my tests, if we 
wait here for a short time, all the references are eventually released. 
It’s quite strange.
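
As a rough userspace illustration of that observation (not kernel code, and not the patch that was tested): a bounded poll can separate a reference that is merely released late from one that is never released. Every name below (entry_model, wait_for_refs, drop_one_ref) and the `expected` threshold are hypothetical; the real check lives in release_gid_table() and is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for one GID table entry's reference count. */
struct entry_model {
	atomic_int ref;
};

/* Hypothetical hook run between polls; models other contexts
 * dropping their references while the releasing side waits. */
typedef void (*tick_fn)(struct entry_model *e);

/*
 * Poll until the count falls to `expected` or below, giving up after
 * max_polls extra attempts.  Returns the number of polls that were
 * needed, or -1 if references are still held (the point at which a
 * leak warning would be justified in this model).
 */
static int wait_for_refs(struct entry_model *e, int expected,
			 int max_polls, tick_fn tick)
{
	assert(max_polls >= 0);

	for (int i = 0; i <= max_polls; i++) {
		if (atomic_load(&e->ref) <= expected)
			return i;
		/* No tick after the final check: nothing would observe it. */
		if (i < max_polls && tick)
			tick(e);
	}
	return -1;
}

/* Example tick: one outstanding holder drops its reference per poll. */
static void drop_one_ref(struct entry_model *e)
{
	atomic_fetch_sub(&e->ref, 1);
}
```

In this model, a small count that drains within the poll budget returns a non-negative poll count, while a count that stays high returns -1. That only helps classify the symptom; it does not identify who holds the references.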

Yanjun.Zhu

>>
>> Jason
