Message-ID: <20250519073814.167264-1-bo.ye@mediatek.com>
Date: Mon, 19 May 2025 15:38:13 +0800
From: Bo Ye <bo.ye@...iatek.com>
To: Alim Akhtar <alim.akhtar@...sung.com>, Avri Altman <avri.altman@....com>,
	Bart Van Assche <bvanassche@....org>, "James E.J. Bottomley"
	<James.Bottomley@...senPartnership.com>, "Martin K. Petersen"
	<martin.petersen@...cle.com>, Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
CC: <xiujuan.tan@...iatek.com>, Qilin Tan <qilin.tan@...iatek.com>, Bosser Ye
	<bo.ye@...iatek.com>, <linux-scsi@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
	<linux-mediatek@...ts.infradead.org>
Subject: [PATCH] scsi: ufs: preventing bus hang crash during emergency power off

From: Qilin Tan <qilin.tan@...iatek.com>

When kernel_power_off is called directly without freezing userspace,
it may cause UFS crashes:

Call trace:
    ...... 0xBFFFFFC080C6156C()
    vmlinux readl() + 52
    vmlinux ufshcd_add_command_trace() + 552
    vmlinux ufshcd_send_command() + 84

When kernel_power_off is executed, ufshcd_wl_shutdown is also called
and turns off the UFS reference clock, VCC, and VCCQ. If I/O requests
are still being issued to the UFS host at that point, the driver can
access the interrupt status register after these resources are off.
The AP read may then time out, causing a bus hang crash.

The root cause is that scsi_device_quiesce and blk_mq_freeze_queue
only drain the requests in the request queue; they do not guarantee
that every request has been dispatched to the UFS host and completed.
Requests may remain pending in the hardware dispatch queue and be
rescheduled later. If the UFS reference clock has already been turned
off by then, a bus hang crash will occur.

Example of the race condition:
Thread 1                                                   Thread 2
kernel_power_off
-> ufshcd_wl_shutdown
 -> scsi_device_quiesce(sdev)
  -> blk_mq_freeze_queue(q)
   -> blk_mq_run_hw_queue(hctx, false)
    -> blk_mq_delay_run_hw_queue(hctx, 0)                blk_mq_run_work_fn
 -> ufshcd_suspend(hba) // disable ref clk                -> blk_mq_dispatch_rq_list
                                                           -> blk_mq_run_hw_queue()
                                                            -> ufshcd_send_command()
                                                             -> ufshcd_add_command_trace()
                                                              -> ufshcd_readl(hba, REG_INTERRUPT_STATUS)

When Thread 2's dispatch is delayed, for example under heavy CPU load,
the interrupt status register may be read after the reference clock
has been disabled, resulting in a bus hang crash.

To avoid this issue, call ufshcd_wait_for_doorbell_clr to wait until
all requests are processed before disabling the reference clock.
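
The ordering can be illustrated with a small userspace model. This is
only a sketch, not the kernel implementation: struct fake_hba,
fake_hw_tick(), wait_for_doorbell_clr() and shutdown_sequence() are
hypothetical names invented for the illustration, and a poll count
stands in for the microsecond timeout. As in the one-line change in
this patch, the model gates the clock even if the wait times out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for host state; NOT the kernel's struct ufs_hba. */
struct fake_hba {
	uint32_t doorbell;   /* one bit per outstanding transfer request */
	bool ref_clk_on;     /* models the UFS reference clock gate */
};

/* Model one unit of hardware progress: complete the lowest pending slot. */
static void fake_hw_tick(struct fake_hba *hba)
{
	hba->doorbell &= hba->doorbell - 1; /* clear the lowest set bit */
}

/*
 * Poll until the doorbell reads zero or the poll budget is exhausted.
 * Returns 0 when all requests completed, -1 on timeout.
 */
static int wait_for_doorbell_clr(struct fake_hba *hba, unsigned int max_polls)
{
	assert(hba != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (hba->doorbell == 0)
			return 0;
		fake_hw_tick(hba);
	}
	return hba->doorbell == 0 ? 0 : -1;
}

/*
 * Shutdown ordering: drain outstanding requests first, then gate the
 * clock. The clock is gated even on timeout; the return value only
 * reports whether the drain finished in time.
 */
static int shutdown_sequence(struct fake_hba *hba, unsigned int max_polls)
{
	int ret = wait_for_doorbell_clr(hba, max_polls);

	hba->ref_clk_on = false;
	return ret;
}
```

In this model, calling shutdown_sequence() on a host with two pending
slots and a sufficient poll budget returns 0 with the doorbell at zero
before the clock flag is cleared; with too small a budget the wait
reports a timeout instead.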

Signed-off-by: Qilin Tan <qilin.tan@...iatek.com>
Signed-off-by: Bosser Ye <bo.ye@...iatek.com>
---
 drivers/ufs/core/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7735421e3991..a1013aea8e90 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -10262,6 +10262,7 @@ static void ufshcd_wl_shutdown(struct device *dev)
 		scsi_device_set_state(sdev, SDEV_OFFLINE);
 		mutex_unlock(&sdev->state_mutex);
 	}
+	ufshcd_wait_for_doorbell_clr(hba, 5 * USEC_PER_SEC);
 	__ufshcd_wl_suspend(hba, UFS_SHUTDOWN_PM);
 
 	/*
-- 
2.17.0

