Message-ID: <CADUfDZqtiT8B_LvTRuzT9QB+7z+7pNqYJd_n2gQYK1d8cKkxqA@mail.gmail.com>
Date: Mon, 23 Dec 2024 12:52:58 -0800
From: Caleb Sander <csander@...estorage.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: syzbot <syzbot+3dcac84cc1d50f43ed31@...kaller.appspotmail.com>,
asml.silence@...il.com, io-uring@...r.kernel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>, Hannes Reinecke <hare@...e.de>,
Sagi Grimberg <sagi@...mberg.me>
Subject: Re: [syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer
dereference in percpu_ref_put_many
This is probably the same bug that is being addressed by
https://lore.kernel.org/lkml/20241218185000.17920-2-leocstone@gmail.com/T/
On Mon, Dec 23, 2024 at 12:35 PM Jens Axboe <axboe@...nel.dk> wrote:
>
> On 12/23/24 12:52 PM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
> > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=135f7730580000
>
> I ran this one but hit this instead:
>
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in nvmet_root_discovery_nqn_store+0x110/0x180
> Write of size 256 at addr ffff000009e71180 by task refcrash/775
>
> CPU: 0 UID: 0 PID: 775 Comm: refcrash Not tainted 6.13.0-rc4 #2
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> show_stack+0x1c/0x30 (C)
> __dump_stack+0x24/0x30
> dump_stack_lvl+0x60/0x80
> print_address_description+0x88/0x220
> print_report+0x4c/0x60
> kasan_report+0x94/0xf0
> kasan_check_range+0x248/0x288
> __asan_memset+0x30/0x60
> nvmet_root_discovery_nqn_store+0x110/0x180
> configfs_write_iter+0x220/0x2e8
> do_iter_readv_writev+0x2e0/0x458
> vfs_writev+0x220/0x728
> do_writev+0xf8/0x1a8
> __arm64_sys_writev+0x80/0x98
> invoke_syscall+0x7c/0x258
> el0_svc_common+0x108/0x1d0
> do_el0_svc+0x4c/0x60
> el0_svc+0x4c/0xa0
> el0t_64_sync_handler+0x70/0x100
> el0t_64_sync+0x170/0x178
>
> Allocated by task 1:
> kasan_save_track+0x2c/0x60
> kasan_save_alloc_info+0x3c/0x48
> __kasan_kmalloc+0x80/0x98
> __kmalloc_node_track_caller_noprof+0x2f0/0x590
> kstrndup+0x4c/0xb8
> nvmet_subsys_alloc+0x1c4/0x498
> nvmet_init_discovery+0x20/0x48
> nvmet_init+0x18c/0x1c0
> do_one_initcall+0x1a4/0x718
> do_initcall_level+0x178/0x348
> do_initcalls+0x58/0xa0
> do_basic_setup+0x7c/0x98
> kernel_init_freeable+0x268/0x380
> kernel_init+0x24/0x148
> ret_from_fork+0x10/0x20
>
> The buggy address belongs to the object at ffff000009e71180
> which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 0 bytes inside of
> allocated 37-byte region [ffff000009e71180, ffff000009e711a5)
>
> The buggy address belongs to the physical page:
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x49e71
> anon flags: 0x3ffe00000000000(node=0|zone=0|lastcpupid=0x1fff)
> page_type: f5(slab)
> raw: 03ffe00000000000 ffff0000070028c0 fffffdffc0523d80 dead000000000005
> raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff000009e71080: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> ffff000009e71100: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
> >ffff000009e71180: 00 00 00 00 05 fc fc fc fc fc fc fc fc fc fc fc
> ^
> ffff000009e71200: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> ffff000009e71280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> ==================================================================
> Disabling lock debugging due to kernel taint
>
> which makes me think something else is the culprit here. The test case
> doesn't do much outside of creating two rings, it doesn't actually use
> them.
>
> CC'ing likely suspects on the nvme front. This is on 6.13-rc4 fwiw.
>
> --
> Jens Axboe
>
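For reference, the quoted KASAN report shows a 256-byte memset landing on a
37-byte buffer that came from kstrndup(). Below is a minimal userspace sketch
of one way to avoid that pattern: replace the buffer with a right-sized copy
of the input instead of clearing a fixed length in place. The names nqn_store
and NQN_FIELD_LEN are hypothetical, and the 256-byte limit is assumed from the
write size in the report. This is an illustration only, not the patch in the
linked thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumed field limit, taken from the 256-byte write in the report. */
#define NQN_FIELD_LEN 256

/*
 * Hypothetical userspace stand-in for a store handler: replace *slot
 * with a right-sized copy of the input instead of clearing a fixed
 * NQN_FIELD_LEN bytes of a buffer that may be much smaller.
 * Returns the number of bytes consumed, or a negative errno value.
 */
static long nqn_store(char **slot, const char *page)
{
	size_t len;
	char *copy;

	assert(slot != NULL && page != NULL);

	len = strcspn(page, "\n");	/* stop at the first newline */
	if (len == 0 || len > NQN_FIELD_LEN - 1)
		return -EINVAL;

	copy = malloc(len + 1);
	if (!copy)
		return -ENOMEM;
	memcpy(copy, page, len);
	copy[len] = '\0';

	free(*slot);	/* drop the old, possibly shorter buffer */
	*slot = copy;
	return (long)len;
}
```

On a rejected input the old buffer is left untouched, so a caller never sees a
partially overwritten string.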