Message-ID: <042385ef-285e-5179-941b-ab37f490c1d8@grimberg.me>
Date: Tue, 21 Mar 2023 13:09:03 +0200
From: Sagi Grimberg <sagi@...mberg.me>
To: Lei Lei2 Yin <yinlei2@...ovo.com>,
"kbusch@...nel.org" <kbusch@...nel.org>,
"axboe@...com" <axboe@...com>, "hch@....de" <hch@....de>
Cc: "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"cybeyond@...mail.com" <cybeyond@...mail.com>
Subject: Re: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for
nvme multipath
On 3/21/23 12:50, Lei Lei2 Yin wrote:
> From b134e7930b50679ce48e5522ddd37672b1802340 Mon Sep 17 00:00:00 2001
> From: Lei Yin <yinlei2@...ovo.com>
> Date: Tue, 21 Mar 2023 16:09:08 +0800
> Subject: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme
> multipath
>
> When blk_queue_split runs in nvme_ns_head_submit_bio, the input bio is
> split into two bios. If the parent bio completes first and the bi_disk
> in the parent bio is then kfreed by nvme_free_ns, the child accesses the
> freed bi_disk in bio_endio. This triggers a heap-use-after-free or a
> null pointer oops.
Can you explain further? It is unclear to me how we can delete the ns
gendisk
>
> The following is kasan report:
>
> BUG: KASAN: use-after-free in bio_endio+0x477/0x500
> Read of size 8 at addr ffff888106f2e3a8 by task kworker/1:1H/241
>
> CPU: 1 PID: 241 Comm: kworker/1:1H Kdump: loaded Tainted: G O 5.10.167 #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> Workqueue: kblockd nvme_requeue_work [nvme_core]
> Call Trace:
> dump_stack+0x92/0xc4
> ? bio_endio+0x477/0x500
> print_address_description.constprop.7+0x1e/0x230
> ? record_print_text.cold.40+0x11/0x11
> ? _raw_spin_trylock_bh+0x120/0x120
> ? blk_throtl_bio+0x225/0x3050
> ? bio_endio+0x477/0x500
> ? bio_endio+0x477/0x500
> kasan_report.cold.9+0x37/0x7c
> ? bio_endio+0x477/0x500
> bio_endio+0x477/0x500
> nvme_ns_head_submit_bio+0x950/0x1130 [nvme_core]
> ? nvme_find_path+0x7f0/0x7f0 [nvme_core]
> ? __kasan_slab_free+0x11a/0x150
> ? bio_endio+0x213/0x500
> submit_bio_noacct+0x2a4/0xd10
> ? _dev_info+0xcd/0xff
> ? _dev_notice+0xff/0xff
> ? blk_queue_enter+0x6c0/0x6c0
> ? _raw_spin_lock_irq+0x81/0xd5
> ? _raw_spin_lock+0xd0/0xd0
> nvme_requeue_work+0x144/0x18c [nvme_core]
> process_one_work+0x878/0x13e0
> worker_thread+0x87/0xf70
> ? __kthread_parkme+0x8f/0x100
> ? process_one_work+0x13e0/0x13e0
> kthread+0x30f/0x3d0
> ? kthread_parkme+0x80/0x80
> ret_from_fork+0x1f/0x30
>
> Allocated by task 52:
> kasan_save_stack+0x19/0x40
> __kasan_kmalloc.constprop.11+0xc8/0xd0
> __alloc_disk_node+0x5c/0x320
> nvme_alloc_ns+0x6e9/0x1520 [nvme_core]
> nvme_validate_or_alloc_ns+0x17c/0x370 [nvme_core]
> nvme_scan_work+0x2d4/0x4d0 [nvme_core]
> process_one_work+0x878/0x13e0
> worker_thread+0x87/0xf70
> kthread+0x30f/0x3d0
> ret_from_fork+0x1f/0x30
>
> Freed by task 54:
> kasan_save_stack+0x19/0x40
> kasan_set_track+0x1c/0x30
> kasan_set_free_info+0x1b/0x30
> __kasan_slab_free+0x108/0x150
> kfree+0xa7/0x300
> device_release+0x98/0x210
> kobject_release+0x109/0x3a0
> nvme_free_ns+0x15e/0x1f7 [nvme_core]
> nvme_remove_namespaces+0x22f/0x390 [nvme_core]
> nvme_do_delete_ctrl+0xac/0x106 [nvme_core]
> process_one_work+0x878/0x13e0
> worker_thread+0x87/0xf70
> kthread+0x30f/0x3d0
> ret_from_fork+0x1f/0x30
>
> Signed-off-by: Lei Yin <yinlei2@...ovo.com>
> ---
> drivers/nvme/host/nvme.h | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index c3e4d9b6f9c0..b441c5ce4157 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -749,8 +749,17 @@ static inline void nvme_trace_bio_complete(struct request *req,
> {
> struct nvme_ns *ns = req->q->queuedata;
>
> - if ((req->cmd_flags & REQ_NVME_MPATH) && req->bio)
> + if ((req->cmd_flags & REQ_NVME_MPATH) && req->bio) {
> trace_block_bio_complete(ns->head->disk->queue, req->bio);
> +
> + /* Point bio->bi_disk to the head disk.
> + * This bio may be the parent of another bio in a bio chain. If this bi_disk
> + * is kfreed by nvme_free_ns, the other bio may reach this bio through
> + * __bio_chain_endio in bio_endio and access this bi_disk. This triggers a
> + * heap-use-after-free or a null pointer oops.
> + */
> + req->bio->bi_disk = ns->head->disk;
> + }
This is absolutely the wrong place to do this. This is a tracing
function; it should not contain any other logic.

What tree is this against, anyway? There is no bi_disk in struct bio
anymore.
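
For readers following the thread, the lifetime problem described in the
quoted patch can be modelled in userspace. This is only a rough sketch, not
kernel code: every toy_* name is made up for illustration, the "released"
flag stands in for an actual kfree, and it assumes (as the patch description
states for the 5.10 tree) that a completing child walks up to its parent and
looks at the parent's disk pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real structures. */
struct toy_disk {
	int refs;       /* references held on this disk */
	bool released;  /* set once the last reference is dropped */
};

struct toy_bio {
	struct toy_disk *disk;   /* plays the role of bi_disk */
	struct toy_bio *parent;  /* next bio to complete in the chain */
	int remaining;           /* outstanding completions on this bio */
	bool done;
};

static void toy_disk_put(struct toy_disk *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->released = true;  /* a real object would be freed here */
}

/*
 * Complete a bio and walk up its chain. Returns false if the walk would
 * have touched a disk that was already released (the use-after-free case).
 */
static bool toy_bio_endio(struct toy_bio *bio)
{
	while (bio) {
		if (--bio->remaining > 0)
			return true;  /* still waiting on other parts */
		if (bio->disk && bio->disk->released)
			return false;
		bio->done = true;
		bio = bio->parent;
	}
	return true;
}

/* Parent finishes its own I/O first, the path disk goes away, then the
 * child finishes and walks up to the parent. */
bool toy_scenario(bool repoint_to_head)
{
	struct toy_disk head = { .refs = 1 };
	struct toy_disk path = { .refs = 1 };
	struct toy_bio parent = { .disk = &path, .remaining = 2 };
	struct toy_bio child = { .disk = &head, .parent = &parent,
				 .remaining = 1 };

	toy_bio_endio(&parent);        /* parent's own I/O completes */
	if (repoint_to_head)
		parent.disk = &head;   /* what the quoted patch does */
	toy_disk_put(&path);           /* namespace disk is released */
	return toy_bio_endio(&child);  /* child completes last */
}
```

In this model toy_scenario(false) reports the stale access and
toy_scenario(true) does not. It only shows why repointing the disk hides
the symptom; it says nothing about where such a change belongs, which is
the objection raised in the reply.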