linux-kernel - Re: [syzbot] [rdma?] WARNING in rxe_skb_tx

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <33c75c28-878a-4427-a5c6-ff6b49cc4fe1@linux.dev>
Date: Tue, 13 May 2025 16:57:31 +0200
From: Zhu Yanjun <yanjun.zhu@...ux.dev>
To: syzbot <syzbot+8425ccfb599521edb153@...kaller.appspotmail.com>,
 jgg@...pe.ca, leon@...nel.org, linux-kernel@...r.kernel.org,
 linux-rdma@...r.kernel.org, syzkaller-bugs@...glegroups.com,
 zyjzyj2000@...il.com
Subject: Re: [syzbot] [rdma?] WARNING in rxe_skb_tx_dtor



On 02.05.25 11:54, Zhu Yanjun wrote:
> On 01.05.25 18:45, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    8bac8898fe39 Merge tag 'mmc-v6.15-rc1' of 
>> git://git.kernel..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=16b6d774580000
>> kernel config:  
>> https://syzkaller.appspot.com/x/.config?x=a9a25b7a36123454
>> dashboard link: 
>> https://syzkaller.appspot.com/bug?extid=8425ccfb599521edb153
>> compiler:       Debian clang version 20.1.2 
>> (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 
>> 20.1.2
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image (non-bootable): 
>> https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-8bac8898.raw.xz
>> vmlinux: 
>> https://storage.googleapis.com/syzbot-assets/2a76d594c0f5/vmlinux-8bac8898.xz
>> kernel image: 
>> https://storage.googleapis.com/syzbot-assets/dae09c25780d/bzImage-8bac8898.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the 
>> commit:
>> Reported-by: syzbot+8425ccfb599521edb153@...kaller.appspotmail.com
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1046 at drivers/infiniband/sw/rxe/rxe_net.c:357 
>> rxe_skb_tx_dtor+0x8b/0x2a0 drivers/infiniband/sw/rxe/rxe_net.c:357
> 
> This is a known problem. It seems to be related with the following commit.
> 
> commit 1a633bdc8fd9e9e4a9f9a668ae122edfc5aacc86
> Author: Bob Pearson <rpearsonhpe@...il.com>
> Date:   Fri Mar 29 09:55:15 2024 -0500
> 
>      RDMA/rxe: Let destroy qp succeed with stuck packet
> 
>      In some situations a sent packet may get queued in the NIC longer than
>      than timeout of a ULP. Currently if this happens the ULP may try to 
> reset
>      the link by destroying the qp and setting up an alternate 
> connection but
>      will fail because the rxe driver is waiting for the packet to finish
>      getting sent and be returned to the skb destructor function where 
> the qp
>      reference holding things up will be dropped. This patch modifies 
> the way
>      that the qp is passed to the destructor to pass the qp index and 
> not a qp
>      pointer.  Then the destructor will attempt to lookup the qp from 
> its index
>      and if it fails exit early. This requires taking a reference on the 
> struct
>      sock rather than the qp allowing the qp to be destroyed while the 
> sk is
>      still around waiting for the packet to finish.
> 
>      Link: 
> https://lore.kernel.org/r/20240329145513.35381-15-rpearsonhpe@gmail.com
>      Signed-off-by: Bob Pearson <rpearsonhpe@...il.com>
>      Signed-off-by: Jason Gunthorpe <jgg@...dia.com>


I can not reproduce this problem in my local host. It seems that this 
problem will occur when rdma SKBs are still in flight after the related 
qp was destroyed.

In this case, a solution is to make qp wait for 200 milliseconds if some 
rdma SKBs are still in flight before a qp is destroyed.

Please verify this problem still occur or not with the following diff.


diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c 
b/drivers/infiniband/sw/rxe/rxe_qp.c
index 7975fb0e2782..8646ebe1c85d 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -862,10 +862,20 @@ static void rxe_qp_do_cleanup(struct work_struct 
*work)
         }
  }

+#define        RXE_CLEANUP_SKB_IN_FLIGHT_TIMEOUT       200
+
  /* called when the last reference to the qp is dropped */
  void rxe_qp_cleanup(struct rxe_pool_elem *elem)
  {
         struct rxe_qp *qp = container_of(elem, typeof(*qp), elem);
+       int cnt = RXE_CLEANUP_SKB_IN_FLIGHT_TIMEOUT;
+
+       /* Before qp is cleanup, check skb in flight */
+       while (cnt && atomic_read(&qp->skb_out) > 0) {
+               msleep(1);
+               cnt--;
+               cond_resched();
+       }

         execute_in_process_context(rxe_qp_do_cleanup, &qp->cleanup_work);
  }

Best Regards,
Zhu Yanjun

> 
> Zhu Yanjun
> 
>> Modules linked in:
>> CPU: 0 UID: 0 PID: 1046 Comm: kworker/u4:9 Not tainted 
>> 6.15.0-rc4-syzkaller-00040-g8bac8898fe39 #0 PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
>> 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>> Workqueue: rxe_wq do_work
>> RIP: 0010:rxe_skb_tx_dtor+0x8b/0x2a0 
>> drivers/infiniband/sw/rxe/rxe_net.c:357
>> Code: 80 3c 20 00 74 08 4c 89 ff e8 41 ee 8c f9 4d 8b 37 44 89 f6 83 
>> e6 01 31 ff e8 11 fe 2a f9 41 f6 c6 01 75 0e e8 26 f9 2a f9 90 <0f> 0b 
>> 90 e9 b4 01 00 00 4c 89 ff e8 35 c4 fa 01 48 89 c7 be 0e 00
>> RSP: 0018:ffffc90000007a08 EFLAGS: 00010246
>> RAX: ffffffff8894c5aa RBX: ffff88803ec8d280 RCX: ffff888035088000
>> RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000
>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: ffffffff886e3f04 R12: dffffc0000000000
>> R13: 1ffff11007d91a5b R14: 0000000000025820 R15: ffff888034060000
>> FS:  0000000000000000(0000) GS:ffff88808d6cc000(0000) 
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f7c6d874fc8 CR3: 00000000428c8000 CR4: 0000000000352ef0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>   <IRQ>
>>   skb_release_head_state+0xfe/0x250 net/core/skbuff.c:1149
>>   napi_consume_skb+0xd2/0x1e0 net/core/skbuff.c:-1
>>   e1000_unmap_and_free_tx_resource 
>> drivers/net/ethernet/intel/e1000/e1000_main.c:1972 [inline]
>>   e1000_clean_tx_irq 
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3864 [inline]
>>   e1000_clean+0x49d/0x2b00 
>> drivers/net/ethernet/intel/e1000/e1000_main.c:3805
>>   __napi_poll+0xc4/0x480 net/core/dev.c:7324
>>   napi_poll net/core/dev.c:7388 [inline]
>>   net_rx_action+0x6ea/0xdf0 net/core/dev.c:7510
>>   handle_softirqs+0x283/0x870 kernel/softirq.c:579
>>   do_softirq+0xec/0x180 kernel/softirq.c:480
>>   </IRQ>
>>   <TASK>
>>   __local_bh_enable_ip+0x17d/0x1c0 kernel/softirq.c:407
>>   local_bh_enable include/linux/bottom_half.h:33 [inline]
>>   rcu_read_unlock_bh include/linux/rcupdate.h:910 [inline]
>>   __dev_queue_xmit+0x1cd7/0x3a70 net/core/dev.c:4656
>>   neigh_output include/net/neighbour.h:539 [inline]
>>   ip6_finish_output2+0x11fb/0x16a0 net/ipv6/ip6_output.c:141
>>   __ip6_finish_output net/ipv6/ip6_output.c:-1 [inline]
>>   ip6_finish_output+0x234/0x7d0 net/ipv6/ip6_output.c:226
>>   rxe_send drivers/infiniband/sw/rxe/rxe_net.c:391 [inline]
>>   rxe_xmit_packet+0x79e/0xa30 drivers/infiniband/sw/rxe/rxe_net.c:450
>>   rxe_requester+0x1fea/0x3d20 drivers/infiniband/sw/rxe/rxe_req.c:805
>>   rxe_sender+0x16/0x50 drivers/infiniband/sw/rxe/rxe_req.c:839
>>   do_task+0x1ad/0x6b0 drivers/infiniband/sw/rxe/rxe_task.c:127
>>   process_one_work kernel/workqueue.c:3238 [inline]
>>   process_scheduled_works+0xadb/0x17a0 kernel/workqueue.c:3319
>>   worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
>>   kthread+0x70e/0x8a0 kernel/kthread.c:464
>>   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
>>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>   </TASK>
>>
>>
>> ---
>> This report is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at syzkaller@...glegroups.com.
>>
>> syzbot will keep track of this issue. See:
>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>
>> If the report is already addressed, let syzbot know by replying with:
>> #syz fix: exact-commit-title
>>
>> If you want to overwrite report's subsystems, reply with:
>> #syz set subsystems: new-subsystem
>> (See the list of subsystem names on the web dashboard)
>>
>> If the report is a duplicate of another one, reply with:
>> #syz dup: exact-subject-of-another-report
>>
>> If you want to undo deduplication, reply with:
>> #syz undup
> 

-- 
Best Regards,
Yanjun.Zhu