Message-ID: <418bfbe4bfb3f04e805af8fa667144f148787aeb.camel@mediatek.com>
Date: Wed, 30 Jul 2025 12:55:12 +0000
From: Peter Wang (王信友) <peter.wang@...iatek.com>
To: "beanhuo@...ron.com" <beanhuo@...ron.com>, "avri.altman@....com"
	<avri.altman@....com>, "neil.armstrong@...aro.org"
	<neil.armstrong@...aro.org>, "quic_cang@...cinc.com" <quic_cang@...cinc.com>,
	"quic_nitirawa@...cinc.com" <quic_nitirawa@...cinc.com>,
	"quic_nguyenb@...cinc.com" <quic_nguyenb@...cinc.com>, "bvanassche@....org"
	<bvanassche@....org>, "quic_ziqichen@...cinc.com"
	<quic_ziqichen@...cinc.com>, "luca.weiss@...rphone.com"
	<luca.weiss@...rphone.com>, "konrad.dybcio@....qualcomm.com"
	<konrad.dybcio@....qualcomm.com>, "mani@...nel.org" <mani@...nel.org>,
	"martin.petersen@...cle.com" <martin.petersen@...cle.com>,
	"quic_rampraka@...cinc.com" <quic_rampraka@...cinc.com>,
	"junwoo80.lee@...sung.com" <junwoo80.lee@...sung.com>
CC: "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	Tze-nan Wu (吳澤南) <Tze-nan.Wu@...iatek.com>,
	"linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
	"manivannan.sadhasivam@...aro.org" <manivannan.sadhasivam@...aro.org>,
	"alim.akhtar@...sung.com" <alim.akhtar@...sung.com>,
	"James.Bottomley@...senPartnership.com"
	<James.Bottomley@...senPartnership.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] scsi: ufs: core: Don't perform UFS clkscale if host
 asyn scan in progress

On Tue, 2025-07-29 at 11:02 +0800, Ziqi Chen wrote:
> 
> 
> 
> Hi Peter,
> 
> I don't think the dependency between CPU2 and CPU3 can occur.
> 
> CPU2:
> __mutex_lock_common+0x1dc/0x371c  -> (Waiting &q->sysfs_lock)
> mutex_lock_nested+0x2c/0x38
> blk_mq_realloc_hw_ctxs+0x94/0x9cc
> blk_mq_init_allocated_queue+0x31c/0x1020
> blk_mq_alloc_queue+0x130/0x214
> scsi_alloc_sdev+0x708/0xad4
> scsi_probe_and_add_lun+0x20c/0x27b4
> 
> CPU3:
> cpus_read_lock+0x54/0x1e8 -> (Waiting cpu_hotplug_lock)
> __cpuhp_state_add_instance+0x24/0x54
> blk_mq_alloc_and_init_hctx+0x940/0xbec
> blk_mq_realloc_hw_ctxs+0x290/0x9cc  -> (holding &q->sysfs_lock)
> blk_mq_init_allocated_queue+0x31c/0x1020
> __blk_mq_alloc_disk+0x138/0x2b0
> loop_add+0x2ac/0x840
> loop_init+0xe8/0x10c
> 

Hi Ziqi,

This is a lockdep warning, and the deadlock it describes may not
necessarily occur. However, once lockdep reports this warning,
it stops checking, which makes subsequent debugging more difficult.
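
To illustrate why a single report matters, here is a toy user-space
model of a lock-order checker that goes quiet after its first report.
It is only a sketch: model_validator and model_acquire are hypothetical
names, it checks direct inversions only, and it is not how lockdep is
actually implemented.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_LOCKS 8

/* Hypothetical toy validator; not lockdep's real data structures. */
struct model_validator {
	bool enabled;	/* cleared after the first report */
	/* order[a][b] is true if lock b was taken while lock a was held */
	bool order[MODEL_MAX_LOCKS][MODEL_MAX_LOCKS];
	int reports;	/* number of inversions reported so far */
};

static void model_validator_init(struct model_validator *v)
{
	*v = (struct model_validator){ .enabled = true };
}

/*
 * Record that lock "taken" is acquired while lock "held" is held.
 * Returns false only when this call reports an inversion. Once the
 * validator has been disabled, every call returns true unchecked.
 */
static bool model_acquire(struct model_validator *v, int held, int taken)
{
	assert(held >= 0 && held < MODEL_MAX_LOCKS);
	assert(taken >= 0 && taken < MODEL_MAX_LOCKS);

	if (!v->enabled)
		return true;	/* checking is off, nothing is recorded */

	if (v->order[taken][held]) {
		/* Opposite order seen before: report once, then stop. */
		v->reports++;
		v->enabled = false;
		return false;
	}

	v->order[held][taken] = true;
	return true;
}
```

In this model, an inversion between locks 0 and 1 is reported, but a
later inversion between locks 2 and 3 passes silently because the
checker is already off. That is the debugging cost I mean.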


> As I understand it, for a single sdev, alloc_disk() and alloc_queue()
> are synchronous. With multiple sdevs, each holds a different
> &q->sysfs_lock, because each is allocated its own request_queue.
> 
> In addition, if you check the latest version, the function
> blk_mq_realloc_hw_ctxs has been changed many times recently. It no
> longer holds &q->sysfs_lock.
> 
> https://lore.kernel.org/all/20250304102551.2533767-5-nilay@linux.ibm.com/
> 
> -> use &q->elevator_lock instead of  &q->sysfs_lock.
> 
> https://lore.kernel.org/all/20250403105402.1334206-1-ming.lei@redhat.com/
> 
> -> Don't use &q->elevator_lock in blk_mq_init_allocated_queue
> context.
> 

We will look at these two patches more closely to see whether
they avoid this issue. Even so, introducing new locks, and the
system-wide issues that can come with them, is something we
should try to avoid as much as possible.


> 
> 
> I have also considered this. As you can see in my old version of this
> patch (v2), I moved ufshcd_devfreq_init() out of ufshcd_add_lus().
> But because ufshcd_add_lus() is async, even after moving it out we
> still cannot ensure that clock scaling is triggered only after all
> LUs have been probed.
> 
> BRs
> Ziqi
> > 

As I recall, I suggested in v3 that, although checking
hba->luns_avail in ufshcd_device_configure is a bit strange,
it was already strange for ufshcd_add_lus to call
ufshcd_devfreq_init in the first place.
However, in theory, this issue should still be solvable
without using a lock.
Another idea is to start ufshcd_devfreq_init only
when shost->async_scan is 0.
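
A rough, untested user-space sketch of that last idea follows. All
struct and function names prefixed with model_ are hypothetical
stand-ins, not the real ufshcd or SCSI code, and the sketch ignores
concurrency entirely. Where the "scan finished" notification would
come from in the real driver is an open question here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host; mirrors only shost->async_scan. */
struct model_shost {
	bool async_scan;	/* true while the async scan is running */
};

/* Hypothetical stand-in for the hba state relevant to this idea. */
struct model_hba {
	struct model_shost *host;
	bool devfreq_ready;	/* modelled devfreq init has run */
	bool devfreq_pending;	/* init was requested during the scan */
};

/* Stand-in for ufshcd_devfreq_init(); it only records that it ran. */
static void model_devfreq_init(struct model_hba *hba)
{
	hba->devfreq_ready = true;
	hba->devfreq_pending = false;
}

/*
 * Start devfreq only when no async scan is in progress.
 * Returns true if devfreq is initialized on return, false if deferred.
 */
static bool model_try_devfreq_init(struct model_hba *hba)
{
	assert(hba != NULL && hba->host != NULL);

	if (hba->devfreq_ready)
		return true;

	if (hba->host->async_scan) {
		hba->devfreq_pending = true;	/* retry after the scan */
		return false;
	}

	model_devfreq_init(hba);
	return true;
}

/* Modelled end of the async scan: clear the flag, run deferred init. */
static void model_scan_finished(struct model_hba *hba)
{
	assert(hba != NULL && hba->host != NULL);

	hba->host->async_scan = false;
	if (hba->devfreq_pending)
		model_try_devfreq_init(hba);
}
```

The point of the sketch is only the ordering: a request made while
async_scan is set is remembered rather than executed, and clock
scaling setup runs once the flag is cleared, with no extra lock in
the model.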

Thanks.
Peter

