linux-kernel - Re: [PATCH v4] scsi: ufs: core: Don't perform UFS clkscale if host asyn scan in progress


Message-ID: <418bfbe4bfb3f04e805af8fa667144f148787aeb.camel@mediatek.com>
Date: Wed, 30 Jul 2025 12:55:12 +0000
From: Peter Wang (王信友) <peter.wang@...iatek.com>
To: "beanhuo@...ron.com" <beanhuo@...ron.com>, "avri.altman@....com"
	<avri.altman@....com>, "neil.armstrong@...aro.org"
	<neil.armstrong@...aro.org>, "quic_cang@...cinc.com" <quic_cang@...cinc.com>,
	"quic_nitirawa@...cinc.com" <quic_nitirawa@...cinc.com>,
	"quic_nguyenb@...cinc.com" <quic_nguyenb@...cinc.com>, "bvanassche@....org"
	<bvanassche@....org>, "quic_ziqichen@...cinc.com"
	<quic_ziqichen@...cinc.com>, "luca.weiss@...rphone.com"
	<luca.weiss@...rphone.com>, "konrad.dybcio@....qualcomm.com"
	<konrad.dybcio@....qualcomm.com>, "mani@...nel.org" <mani@...nel.org>,
	"martin.petersen@...cle.com" <martin.petersen@...cle.com>,
	"quic_rampraka@...cinc.com" <quic_rampraka@...cinc.com>,
	"junwoo80.lee@...sung.com" <junwoo80.lee@...sung.com>
CC: "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	Tze-nan Wu (吳澤南) <Tze-nan.Wu@...iatek.com>,
	"linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
	"manivannan.sadhasivam@...aro.org" <manivannan.sadhasivam@...aro.org>,
	"alim.akhtar@...sung.com" <alim.akhtar@...sung.com>,
	"James.Bottomley@...senPartnership.com"
	<James.Bottomley@...senPartnership.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] scsi: ufs: core: Don't perform UFS clkscale if host
 asyn scan in progress

On Tue, 2025-07-29 at 11:02 +0800, Ziqi Chen wrote:
> Hi Peter,
>
> I don't think the dependency between CPU2 and CPU3 would happen.
> 
> CPU2:
> __mutex_lock_common+0x1dc/0x371c  -> (Waiting &q->sysfs_lock)
> mutex_lock_nested+0x2c/0x38
> blk_mq_realloc_hw_ctxs+0x94/0x9cc
> blk_mq_init_allocated_queue+0x31c/0x1020
> blk_mq_alloc_queue+0x130/0x214
> scsi_alloc_sdev+0x708/0xad4
> scsi_probe_and_add_lun+0x20c/0x27b4
> 
> CPU3:
> cpus_read_lock+0x54/0x1e8 -> (Waiting cpu_hotplug_lock)
> __cpuhp_state_add_instance+0x24/0x54
> blk_mq_alloc_and_init_hctx+0x940/0xbec
> blk_mq_realloc_hw_ctxs+0x290/0x9cc  -> (holding &q->sysfs_lock)
> blk_mq_init_allocated_queue+0x31c/0x1020
> __blk_mq_alloc_disk+0x138/0x2b0
> loop_add+0x2ac/0x840
> loop_init+0xe8/0x10c
> 

Hi Ziqi,

This is a lockdep warning, so the deadlock may not necessarily occur.
However, once the warning is reported, lockdep stops checking,
which makes subsequent debugging more difficult.


> As I understand it, on a single sdev, alloc_disk() and alloc_queue()
> are synchronous. On multiple sdevs, they hold different &q->sysfs_lock
> instances, since each sdev is allocated its own request_queue.
> 
> In addition, if you check the latest version, the function
> blk_mq_realloc_hw_ctxs has been changed many times recently. It no
> longer holds &q->sysfs_lock.
> 
> https://lore.kernel.org/all/20250304102551.2533767-5-nilay@linux.ibm.com/
> 
> -> use &q->elevator_lock instead of  &q->sysfs_lock.
> 
> https://lore.kernel.org/all/20250403105402.1334206-1-ming.lei@redhat.com/
> 
> -> Don't use &q->elevator_lock in blk_mq_init_allocated_queue
> context.
> 

We will check these two patches further to see whether this issue
can be avoided. However, introducing new locks, and with them the
risk of system-wide issues, is still something we should avoid as
much as possible.


> I have also considered this. As you can see in my older version of this
> patch (v2), I moved ufshcd_devfreq_init() out of ufshcd_add_lus().
> But because ufshcd_add_lus() is async, even if we move it out, we
> still cannot ensure that clock scaling is triggered only after all LUs
> are probed.
>
> BRs
> Ziqi

I remember that I suggested in v3 that, although checking
hba->luns_avail in ufshcd_device_configure is a bit strange,
it was already strange that ufshcd_add_lus was calling 
ufshcd_devfreq_init in the first place.
However, in theory, this issue should still be solvable
without using a lock.
Another idea is to call ufshcd_devfreq_init only
when shost->async_scan is 0.
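
To make the idea concrete, here is a rough user-space model of it, not a
patch. The model_* structs and functions are hypothetical stand-ins for
struct Scsi_Host, struct ufs_hba and ufshcd_devfreq_init(); only the
meaning of the async_scan flag is taken from the discussion. It assumes
the scan-completion path is a place where the deferred start can be
retried, and it does not show any of the ordering or locking the real
driver would have to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SCSI host; not the kernel structure. */
struct model_host {
	bool async_scan;	/* models shost->async_scan: scan in progress */
};

/* Hypothetical stand-in for the UFS host bus adapter state. */
struct model_hba {
	struct model_host *host;
	bool devfreq_started;	/* true once clock scaling has been set up */
};

/*
 * Start devfreq only when no async scan is running.
 * Returns true if this call started it, false if it was deferred
 * or had already been started.
 */
static bool model_try_start_devfreq(struct model_hba *hba)
{
	assert(hba != NULL && hba->host != NULL);

	if (hba->devfreq_started)
		return false;
	if (hba->host->async_scan)
		return false;	/* LUs are still being probed: defer */

	hba->devfreq_started = true;	/* stands in for ufshcd_devfreq_init() */
	return true;
}

/* Assumed hook: runs when the async scan completes, then retries. */
static void model_finish_async_scan(struct model_hba *hba)
{
	hba->host->async_scan = false;
	model_try_start_devfreq(hba);
}
```

In this model, a call made while the scan is running is simply deferred,
and clock scaling starts once from the scan-completion path, so no
additional lock is taken around the scan.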

Thanks.
Peter