Message-ID: <c6106113-7050-4099-8c6b-ec79b6b83d5f@oracle.com>
Date: Wed, 1 Mar 2023 15:15:03 -0600
From: Mike Christie <michael.christie@...cle.com>
To: zhongjinghua <zhongjinghua@...weicloud.com>,
zhongjinghua <zhongjinghua@...wei.com>, jejb@...ux.ibm.com,
martin.petersen@...cle.com
Cc: linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH-next] scsi: fix use-after-free problem in
scsi_remove_target
On 2/28/23 9:40 PM, zhongjinghua wrote:
>> On 2023/2/13 11:43, Zhong Jinghua wrote:
>>> From: Zhong Jinghua <zhongjinghua@...wei.com>
>>>
>>> A use-after-free problem like below:
>>>
>>> BUG: KASAN: use-after-free in scsi_target_reap+0x6c/0x70
>>>
>>> Workqueue: scsi_wq_1 __iscsi_unbind_session [scsi_transport_iscsi]
>>> Call trace:
>>> dump_backtrace+0x0/0x320
>>> show_stack+0x24/0x30
>>> dump_stack+0xdc/0x128
>>> print_address_description+0x68/0x278
>>> kasan_report+0x1e4/0x308
>>> __asan_report_load4_noabort+0x30/0x40
>>> scsi_target_reap+0x6c/0x70
>>> scsi_remove_target+0x430/0x640
>>> __iscsi_unbind_session+0x164/0x268 [scsi_transport_iscsi]
>>> process_one_work+0x67c/0x1350
>>> worker_thread+0x370/0xf90
>>> kthread+0x2a4/0x320
>>> ret_from_fork+0x10/0x18
>>>
>>> The problem is caused by a concurrency scenario:
>>>
>>> T0: delete target
>>> // echo 1 > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
>>> T1: logout
>>> // iscsiadm -m node --logout
>>>
>>> T0                                              T1
>>> sdev_store_delete
>>> scsi_remove_device
>>> device_remove_file
>>> __scsi_remove_device
>>>                                                 __iscsi_unbind_session
>>>                                                 scsi_remove_target
>>>                                                 spin_lock_irqsave
>>>                                                 list_for_each_entry
>>> scsi_target_reap // starget->reap_ref 1 -> 0
>>>                                                 kref_get(&starget->reap_ref);
>>>                                                 // warn use-after-free.
>>>                                                 spin_unlock_irqrestore
>>> scsi_target_reap_ref_release
>>> scsi_target_destroy
>>> ... // delete starget
>>>                                                 scsi_target_reap
>>>                                                 // UAF
>>>
>>> When T0 has reduced the reference count to 0 but has not yet released the
>>> target, T1 can still enter list_for_each_entry, and kref_get then reports a UAF.
>>>
>>> Fix it by using kref_get_unless_zero() to check for a reference count of
>>> 0.
>>>
>>> Signed-off-by: Zhong Jinghua <zhongjinghua@...wei.com>
>>> ---
>>> drivers/scsi/scsi_sysfs.c | 12 +++++++++++-
>>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>>> index e7893835b99a..0ad357ff4c59 100644
>>> --- a/drivers/scsi/scsi_sysfs.c
>>> +++ b/drivers/scsi/scsi_sysfs.c
>>> @@ -1561,7 +1561,17 @@ void scsi_remove_target(struct device *dev)
>>>  		    starget->state == STARGET_CREATED_REMOVE)
>>>  			continue;
>>>  		if (starget->dev.parent == dev || &starget->dev == dev) {
>>> -			kref_get(&starget->reap_ref);
>>> +
>>> +			/*
>>> +			 * If starget->reap_ref is reduced to 0, it means
>>> +			 * that other processes are releasing it and
>>> +			 * there is no need to delete it again
>>> +			 */
>>> +			if (!kref_get_unless_zero(&starget->reap_ref)) {
>>> +				spin_unlock_irqrestore(shost->host_lock, flags);
>>> +				goto restart;
>>> +			}
>>> +
Patch looks ok.
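For anyone following along, the difference between the two helpers can be
modeled in user space. This is only a sketch using C11 atomics, not the
kernel's kref/refcount_t implementation; struct model_ref and the model_*
functions are names made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct kref. */
struct model_ref {
	atomic_int count;
};

/* Models kref_get(): no check for a count that already dropped to 0. */
static void model_get(struct model_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Models kref_get_unless_zero(): refuses to revive a count of 0. */
static bool model_get_unless_zero(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models kref_put(): returns true if the caller dropped the last ref. */
static bool model_put(struct model_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	return old == 1;
}
```

Starting from a count of 1, one model_put() drops it to 0, and a later
model_get_unless_zero() returns false. That is the case the patch turns into
a restart of the list walk instead of a kref_get on a dying target.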
Is there another bug in the existing kref_get_unless_zero(&starget->reap_ref)
call in scsi_alloc_target?
I think scsi_alloc_target can find the target on the __targets list, and
its call to kref_get_unless_zero will succeed if scsi_remove_target has only
gotten as far as taking its own ref above (it has not yet done
__scsi_remove_target or the scsi_target_reap call at the end of the function).
But if scsi_remove_target has set the target state to STARGET_REMOVE, the thread
that did scsi_alloc_target wouldn't be able to put the target into the correct state
(the scsi_target_add call will see the target state and return). So later, if the
driver/transport class calls scsi_remove_target again to remove the target that
the scsi_alloc_target call re-added, it sees starget->state still in STARGET_REMOVE
and the target won't get deleted.
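To make that sequence concrete, here is a single-threaded user-space model of
it. Everything named model_* is made up for illustration, locking is left
out, and the state machine is cut down to three states. It is a sketch of the
scenario as I described it, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced version of the target states. */
enum model_state { MODEL_CREATED, MODEL_RUNNING, MODEL_REMOVE };

struct model_target {
	int ref;
	enum model_state state;
};

/* Lookup as in the scenario: only the refcount is checked, not the state. */
static bool model_lookup(struct model_target *t)
{
	assert(t->ref >= 0);
	if (t->ref == 0)
		return false;
	t->ref++;
	return true;
}

/* Add step: sees a state other than "created" and returns without adding. */
static bool model_add(struct model_target *t)
{
	if (t->state != MODEL_CREATED)
		return false;
	t->state = MODEL_RUNNING;
	return true;
}

/* A later remove pass skips targets already marked for removal. */
static bool model_remove_would_skip(const struct model_target *t)
{
	return t->state == MODEL_REMOVE;
}
```

With ref at 2 (the list's ref plus the one the remover just took) and the
state already MODEL_REMOVE, model_lookup() succeeds, model_add() returns
false, and model_remove_would_skip() returns true. The target is referenced
again but never gets back to a state from which it can be removed.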
Can we solve both issues at the same time?