[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250117114400.79792-1-wh1sper@zju.edu.cn>
Date: Fri, 17 Jan 2025 19:42:02 +0800
From: Haoran Zhang <wh1sper@....edu.cn>
To: michael.christie@...cle.com
Cc: mst@...hat.com,
jasowang@...hat.com,
pbonzini@...hat.com,
stefanha@...hat.com,
eperezma@...hat.com,
virtualization@...ts.linux.dev,
kvm@...r.kernel.org,
linux-kernel@...r.kernel.org,
Haoran Zhang <wh1sper@....edu.cn>
Subject: Re: Re: [PATCH] vhost/scsi: Fix improper cleanup in vhost_scsi_set_endpoint()
> I see now and can replicate it. I think there is a 2nd bug in
> vhost_scsi_set_endpoint related to all this where we need to
> prevent switching targets like this or else we'll leak some
> other refcounts. If 500140501e23be28's tpg number was 3 then
> we would overwrite the existing vs->vs_vhost_wwpn and never
> be able to release the refounts on the tpgs from 500140562c8936fa.
>
> I'll send a patchset to fix everything and cc you.
>
> Thanks for all the work you did testing and debugging this
> issue.
You are welcome. There is another bug I was about to report, but I'm not
sure whether I should create a new thread. I feel that the original design
of dynamically allocating new vs_tpgs in vhost_scsi_set_endpoint is not
intuitive, and copying TPGs before setting the target doesn't seem
logical. Since you are already refactoring the code, maybe I should post
it here so we can address these issues in one go.
[PATCH] vhost/scsi: Fix dangling pointer in vhost_scsi_set_endpoint()
Since commit 4f7f46d32c98 ("tcm_vhost: Use vq->private_data to indicate
if the endpoint is setup"), a dangling pointer issue has been introduced
in vhost_scsi_set_endpoint() when the host fails to reconfigure the
vhost-scsi endpoint. Specifically, this causes a UAF fault in
vhost_scsi_get_req() when the guest attempts to send an SCSI request.
In vhost_scsi_set_endpoint(), vhost_scsi->vs_tpg holds the same pointer
as vq->private_data. The code allocates a new vs_tpg array to hold the
TPGs and updates vq->private_data if a match is found between
vhost_scsi_list's tport_name and the target WWPN. However, the code
ignored the case where `match == 0` (i.e. the target endpoint does not
match any TPGs in vhost_scsi_list). In this scenario, it directly frees
the old vs_tpg and updates vhost_scsi->vs_tpg without modifying
vq->private_data, leaving all vq's backend pointer dangling. As a result,
subsequent requests from the guest will trigger a UAF fault on vs_tpg in
vhost_scsi_get_req(). Below is the KASAN report:
[ 68.606821] BUG: KASAN: slab-use-after-free in vhost_scsi_get_req+0xef/0x1f0
[ 68.607671] Read of size 8 at addr ffff8880087a1008 by task vhost-1440/1460
[ 68.608570]
[ 68.612070] Call Trace:
[ 68.612429] <TASK>
[ 68.612739] dump_stack_lvl+0x9e/0xd0
[ 68.613206] print_report+0xd1/0x670
[ 68.613711] ? __virt_addr_valid+0x54/0x250
[ 68.614232] ? kasan_complete_mode_report_info+0x6a/0x200
[ 68.614879] kasan_report+0xd6/0x120
[ 68.615329] ? vhost_scsi_get_req+0xef/0x1f0
[ 68.615869] ? vhost_scsi_get_req+0xef/0x1f0
[ 68.616406] __asan_load8+0x8b/0xe0
[ 68.616854] vhost_scsi_get_req+0xef/0x1f0
[ 68.617362] vhost_scsi_handle_vq+0x30f/0x15e0
[ 68.622248] ...
[ 68.622868] vhost_scsi_handle_kick+0x39/0x50
[ 68.623409] vhost_run_work_list+0xd9/0x120
[ 68.623939] ? __pfx_vhost_run_work_list+0x10/0x10
[ 68.624521] vhost_task_fn+0xf8/0x240
[ 68.629164] ...
[ 68.629705] ret_from_fork+0x5d/0x80
[ 68.630155] ? __pfx_vhost_task_fn+0x10/0x10
[ 68.630693] ret_from_fork_asm+0x1a/0x30
[ 68.631179] RIP: 0033:0x0
[ 68.631516] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 68.637405] </TASK>
[ 68.637670]
[ 68.637798] Allocated by task 1440:
[ 68.638124] kasan_save_stack+0x39/0x70
[ 68.638607] kasan_save_track+0x14/0x40
[ 68.638919] kasan_save_alloc_info+0x37/0x60
[ 68.639331] __kasan_kmalloc+0xc3/0xd0
[ 68.639625] __kmalloc_cache_noprof+0x186/0x3a0
[ 68.639971] vhost_scsi_ioctl+0x630/0xec0
[ 68.640392] __x64_sys_ioctl+0x126/0x160
[ 68.640755] x64_sys_call+0x11ad/0x25f0
[ 68.641076] do_syscall_64+0x7e/0x170
[ 68.641437] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 68.641887]
[ 68.642016] Freed by task 1461:
[ 68.642301] kasan_save_stack+0x39/0x70
[ 68.642622] kasan_save_track+0x14/0x40
[ 68.642958] kasan_save_free_info+0x3b/0x60
[ 68.643352] __kasan_slab_free+0x52/0x80
[ 68.643675] kfree+0x129/0x440
[ 68.643973] vhost_scsi_ioctl+0xd74/0xec0
[ 68.644290] __x64_sys_ioctl+0x126/0x160
[ 68.644589] x64_sys_call+0x11ad/0x25f0
[ 68.644939] do_syscall_64+0x7e/0x170
[ 68.645377] entry_SYSCALL_64_after_hwframe+0x76/0x7e
To address this issue, we need to prevent the `free(vs_tpg)` operation
from being triggered by the `match == 0` branch , either by moving the
free operation inside the if-clause or by adding a goto statement in the
else-clause. Here is an alternative patch commit.
Fixes: 4f7f46d32c98 ("tcm_vhost: Use vq->private_data to indicate if the endpoint is setup")
Signed-off-by: Haoran Zhang <wh1sper@....edu.cn>
---
drivers/vhost/scsi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 718fa4e0b31e..1e15eab530d7 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1775,6 +1775,7 @@ vhost_scsi_set_endpoint(struct vhost_scsi *vs,
ret = 0;
} else {
ret = -EEXIST;
+ goto undepend;
}
/*
--
2.43.0
Powered by blists - more mailing lists