Message-ID: <1554248352.118779.171.camel@acm.org>
Date: Tue, 02 Apr 2019 16:39:12 -0700
From: Bart Van Assche <bvanassche@....org>
To: Leon Romanovsky <leon@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>
Cc: syzbot <syzbot+2e3e485d5697ea610460@...kaller.appspotmail.com>,
Matthew Wilcox <willy@...radead.org>, danielj@...lanox.com,
danitg@...lanox.com, dledford@...hat.com,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
parav@...lanox.com, swise@...ngridcomputing.com,
syzkaller-bugs@...glegroups.com
Subject: Re: WARNING in cma_exit_net
On Mon, 2019-04-01 at 21:29 +0300, Leon Romanovsky wrote:
> On Mon, Apr 01, 2019 at 02:45:54PM -0300, Jason Gunthorpe wrote:
> > On Mon, Apr 01, 2019 at 10:36:05AM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: e3ecb83e Add linux-next specific files for 20190401
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=13bc36cd200000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=db6c9f2bfeb91a99
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=2e3e485d5697ea610460
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+2e3e485d5697ea610460@...kaller.appspotmail.com
> > >
> > > WARNING: CPU: 1 PID: 7 at drivers/infiniband/core/cma.c:4674
> > > cma_exit_net+0x327/0x390 drivers/infiniband/core/cma.c:4674
> > > Kernel panic - not syncing: panic_on_warn set ...
> >
> > Matt: This is why the WARN_ON(!xa_empty()) is so valuable. Magically,
> > syzkaller can find that something in this code is buggy.
> >
> > Mellanox is also showing a different testing failure over the weekend
> > (use after free or something) from your 'cma: Convert portspace IDRs
> > to XArray'
>
> This is what I see in my environment.
>
> [ 72.725596] ==================================================================
> [ 72.726017] BUG: KASAN: use-after-free in cma_check_port+0x86a/0xa20 [rdma_cm]
> [ 72.726263] Read of size 8 at addr ffff888069fde998 by task ucmatose/387
> [ 72.726460]
> [ 72.726550] CPU: 3 PID: 387 Comm: ucmatose Not tainted 5.1.0-rc2+ #253
> [ 72.726751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> [ 72.727119] Call Trace:
> [ 72.727210] dump_stack+0x7c/0xc0
> [ 72.727342] print_address_description+0x6c/0x23c
> [ 72.727505] ? cma_check_port+0x86a/0xa20 [rdma_cm]
> [ 72.727666] kasan_report.cold.3+0x1c/0x35
> [ 72.727805] ? cma_check_port+0x86a/0xa20 [rdma_cm]
> [ 72.727977] ? cma_check_port+0x86a/0xa20 [rdma_cm]
> [ 72.728138] cma_check_port+0x86a/0xa20 [rdma_cm]
> [ 72.728306] rdma_bind_addr+0x11bc/0x1b00 [rdma_cm]
> [ 72.728465] ? find_held_lock+0x33/0x1c0
> [ 72.728597] ? cma_ndev_work_handler+0x180/0x180 [rdma_cm]
> [ 72.728756] ? wait_for_completion+0x3d0/0x3d0
> [ 72.728928] ucma_bind+0x120/0x160 [rdma_ucm]
> [ 72.729089] ? ucma_resolve_addr+0x1a0/0x1a0 [rdma_ucm]
> [ 72.729256] ucma_write+0x1f8/0x2b0 [rdma_ucm]
> [ 72.729409] ? ucma_open+0x260/0x260 [rdma_ucm]
> [ 72.729571] vfs_write+0x157/0x460
> [ 72.729688] ksys_write+0xb8/0x170
> [ 72.729828] ? __ia32_sys_read+0xb0/0xb0
> [ 72.729954] ? trace_hardirqs_off_caller+0x5b/0x160
> [ 72.730107] ? do_syscall_64+0x18/0x3c0
> [ 72.730243] do_syscall_64+0x95/0x3c0
> [ 72.730363] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 72.730508] RIP: 0033:0x7f6f1758fff8
> [ 72.730624] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 77 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
> [ 72.731146] RSP: 002b:00007fff99f99088 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 72.731365] RAX: ffffffffffffffda RBX: 00007fff99f99090 RCX: 00007f6f1758fff8
> [ 72.731579] RDX: 0000000000000090 RSI: 00007fff99f99090 RDI: 0000000000000003
> [ 72.731814] RBP: 0000564942bd8ec0 R08: 0000564942bd9180 R09: 0000000000000000
> [ 72.732043] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [ 72.732262] R13: 0000000000000001 R14: 0000000000000000 R15: 00005649413cc470
> [ 72.732494]
> [ 72.732572] Allocated by task 381:
> [ 72.732692] __kasan_kmalloc.constprop.5+0xc1/0xd0
> [ 72.732857] cma_alloc_port+0x4d/0x160 [rdma_cm]
> [ 72.733006] rdma_bind_addr+0x14e7/0x1b00 [rdma_cm]
> [ 72.733153] ucma_bind+0x120/0x160 [rdma_ucm]
> [ 72.733299] ucma_write+0x1f8/0x2b0 [rdma_ucm]
> [ 72.733452] vfs_write+0x157/0x460
> [ 72.733569] ksys_write+0xb8/0x170
> [ 72.733675] do_syscall_64+0x95/0x3c0
> [ 72.733800] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 72.733956]
> [ 72.734029] Freed by task 381:
> [ 72.734133] __kasan_slab_free+0x12e/0x180
> [ 72.734284] kfree+0xed/0x290
> [ 72.734399] rdma_destroy_id+0x6b6/0x9e0 [rdma_cm]
> [ 72.734559] ucma_close+0x110/0x300 [rdma_ucm]
> [ 72.734701] __fput+0x25a/0x740
> [ 72.734832] task_work_run+0x10e/0x190
> [ 72.734959] do_exit+0x85e/0x29e0
> [ 72.735071] do_group_exit+0xf0/0x2e0
> [ 72.735182] get_signal+0x2e0/0x17e0
> [ 72.735304] do_signal+0x94/0x1570
> [ 72.735424] exit_to_usermode_loop+0xfa/0x130
> [ 72.735612] do_syscall_64+0x327/0x3c0
> [ 72.735756] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 72.735941]
> [ 72.736033] The buggy address belongs to the object at ffff888069fde990
> [ 72.736033] which belongs to the cache kmalloc-32 of size 32
> [ 72.736414] The buggy address is located 8 bytes inside of
> [ 72.736414] 32-byte region [ffff888069fde990, ffff888069fde9b0)
> [ 72.736777] The buggy address belongs to the page:
> [ 72.736940] page:ffffea0001a7f780 count:1 mapcount:0 mapping:ffff88806bc03980 index:0x0
> [ 72.737171] flags: 0x4000000000000200(slab)
> [ 72.737295] raw: 4000000000000200 dead000000000100 dead000000000200 ffff88806bc03980
> [ 72.737525] raw: 0000000000000000 0000000000550055 00000001ffffffff 0000000000000000
> [ 72.737786] page dumped because: kasan: bad access detected
> [ 72.737948]
> [ 72.738019] Memory state around the buggy address:
> [ 72.738164] ffff888069fde880: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
> [ 72.738396] ffff888069fde900: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
> [ 72.738627] >ffff888069fde980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
> [ 72.738869] ^
> [ 72.738999] ffff888069fdea00: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
> [ 72.739213] ffff888069fdea80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
> [ 72.739431] ==================================================================
> [ 72.739667] Disabling lock debugging due to kernel taint
This is what I encountered while running blktests:
nvmet: adding nsid 1 to subsystem nvme-test
==================================================================
BUG: KASAN: use-after-free in cma_check_port+0x28/0x400 [rdma_cm]
Read of size 8 at addr ffff8880ba96f818 by task ln/10510
CPU: 5 PID: 10510 Comm: ln Not tainted 5.1.0-rc3-dbg+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x86/0xca
print_address_description+0x71/0x239
? cma_check_port+0x28/0x400 [rdma_cm]
kasan_report.cold.3+0x1b/0x3e
? cma_check_port+0x28/0x400 [rdma_cm]
__asan_load8+0x54/0x90
cma_check_port+0x28/0x400 [rdma_cm]
rdma_bind_addr+0xc13/0xe80 [rdma_cm]
? cma_ndev_work_handler+0xf0/0xf0 [rdma_cm]
? lockdep_hardirqs_on+0x185/0x260
? _raw_spin_unlock_irqrestore+0x57/0x70
? trace_hardirqs_on+0x24/0x130
? preempt_count_sub+0x18/0xd0
? _raw_spin_unlock_irqrestore+0x42/0x70
nvmet_rdma_add_port+0x143/0x1a0 [nvmet_rdma]
? nvmet_rdma_remove_port+0x40/0x40 [nvmet_rdma]
nvmet_enable_port+0x85/0x180 [nvmet]
nvmet_port_subsys_allow_link+0x1bc/0x1e0 [nvmet]
? do_raw_spin_unlock+0xa8/0x140
configfs_symlink+0x2b6/0x650
? configfs_get_link+0x3e0/0x3e0
? inode_permission+0x69/0x200
vfs_symlink+0x163/0x230
do_symlinkat+0xeb/0x160
? __ia32_sys_unlink+0x40/0x40
? do_syscall_64+0x19/0x210
? entry_SYSCALL_64_after_hwframe+0x49/0xbe
__x64_sys_symlinkat+0x43/0x50
do_syscall_64+0x71/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f35b984d9e7
Code: 73 01 c3 48 8b 0d a9 84 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 0a 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 84 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffcc3a88af8 EFLAGS: 00000246 ORIG_RAX: 000000000000010a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f35b984d9e7
RDX: 000055c1855f82b0 RSI: 00000000ffffff9c RDI: 00007ffcc3a8a7a6
RBP: 00000000ffffff9c R08: 000055c1855f8010 R09: 0000000000000000
R10: fffffffffffff000 R11: 0000000000000246 R12: 000055c1855f82b0
R13: 00007ffcc3a8a7a6 R14: 0000000000000000 R15: 0000000000000000
Allocated by task 10270:
save_stack+0x43/0xd0
__kasan_kmalloc.constprop.9+0xc7/0xd0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x143/0x350
cma_alloc_port+0x3d/0xf0 [rdma_cm]
rdma_bind_addr+0xdf9/0xe80 [rdma_cm]
nvmet_clear_ctrl+0x43/0x70 [nvmet]
rxe_opcode+0x15f5/0xfffffffffffef380 [rdma_rxe]
rxe_wr_opcode_info+0xa4c/0xfffffffffffeb360 [rdma_rxe]
configfs_symlink+0x2b6/0x650
vfs_symlink+0x163/0x230
do_symlinkat+0xeb/0x160
__x64_sys_symlinkat+0x43/0x50
do_syscall_64+0x71/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 10340:
save_stack+0x43/0xd0
__kasan_slab_free+0x139/0x190
kasan_slab_free+0xe/0x10
kfree+0x103/0x320
rdma_destroy_id+0x42c/0x460 [rdma_cm]
nvmet_ctrl_fatal_error+0x31/0x80 [nvmet]
rxe_opcode+0x175d/0xfffffffffffef380 [rdma_rxe]
rxe_wr_opcode_info+0x644/0xfffffffffffeb360 [rdma_rxe]
configfs_unlink+0x216/0x350
vfs_unlink+0x171/0x260
do_unlinkat+0x347/0x490
__x64_sys_unlinkat+0x60/0x90
do_syscall_64+0x71/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8880ba96f810
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 8 bytes inside of
32-byte region [ffff8880ba96f810, ffff8880ba96f830)
The buggy address belongs to the page:
page:ffffea0002ea5bc0 count:1 mapcount:0 mapping:ffff88811b003800 index:0x0
flags: 0x1fff000000000200(slab)
raw: 1fff000000000200 ffffea0004319200 0000000900000009 ffff88811b003800
raw: 0000000000000000 0000000000550055 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880ba96f700: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
ffff8880ba96f780: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
>ffff8880ba96f800: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
^
ffff8880ba96f880: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
ffff8880ba96f900: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint
nvmet_rdma: enabling port 1 (192.168.122.49:7777)
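
As a reading aid for the two KASAN reports above: each one prints the faulting address, the start of the freed object, and the object size, and states that the access is "located 8 bytes inside of" a 32-byte region. The sketch below only redoes that arithmetic from the values printed in the logs. The helper name offset_in_object is hypothetical and not part of any kernel tool, and the snippet says nothing about which structure field sits at that offset.

```python
def offset_in_object(access_addr: int, obj_start: int, obj_size: int) -> int:
    """Return the byte offset of access_addr inside [obj_start, obj_start + obj_size)."""
    offset = access_addr - obj_start
    if not 0 <= offset < obj_size:
        raise ValueError("access address lies outside the reported object")
    return offset


# (faulting address, object start, object size), copied from the two reports.
# The dictionary keys are the "by task" names printed by KASAN.
reports = {
    "ucmatose/387": (0xFFFF888069FDE998, 0xFFFF888069FDE990, 32),
    "ln/10510": (0xFFFF8880BA96F818, 0xFFFF8880BA96F810, 32),
}

for task, (addr, start, size) in reports.items():
    # Print the offset and the exclusive end of the region for comparison
    # with the "32-byte region [start, end)" line in each report.
    print(task, offset_in_object(addr, start, size), hex(start + size))
```

Both reports give the same offset into a freed kmalloc-32 object, reached through cma_check_port() after a kfree() in rdma_destroy_id(), which is consistent with the two traces showing the same problem.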