linux-kernel - Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1604412227.13152.11.camel@mtkswgap22>
Date:   Tue, 3 Nov 2020 22:03:47 +0800
From:   Stanley Chu <stanley.chu@...iatek.com>
To:     Can Guo <cang@...eaurora.org>
CC:     <asutoshd@...eaurora.org>, <nguyenb@...eaurora.org>,
        <hongwus@...eaurora.org>, <rnayak@...eaurora.org>,
        <linux-scsi@...r.kernel.org>, <kernel-team@...roid.com>,
        <saravanak@...gle.com>, <salyzyn@...gle.com>,
        "Alim Akhtar" <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Bean Huo <beanhuo@...ron.com>,
        "Bart Van Assche" <bvanassche@....org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt
 caused by ufshcd_hold()

Hi Can,

On Tue, 2020-11-03 at 18:01 +0800, Can Guo wrote:
> On 2020-11-03 15:07, Stanley Chu wrote:
> > Hi Can,
> > 
> > On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:
> >> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
> >> decreased back in ufshcd_ungate_work() in a paired way. However, if
> >> specific ufshcd_hold/release sequences are met, it is possible that
> >> scsi_block_reqs_cnt is increased twice but only one ungate work is
> >> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() 
> >> and
> > 
> > Just curious that how could this be possible? Would you have some 
> > failed
> > examples?
> > 
> 
> [1] One gate_work() is in the workqueue, not yet executed, now clk state 
> == REQ_CLKS_OFF.
> [2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) -> 
> active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate 
> work -> active_req-- -> return -EAGAIN.
> [3] Now gate_work() starts to run, but since the clk state is 
> REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail.
> [3] Someone calls ufshcd_hold(async == false) -> do something -> 
> ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that, 
> till now, ungate_work() is still in the work queue, not executed yet.
> [4] Now, if someone calls ufshcd_hold(), we will hit the issue.
> 
> Above sequence is a very common clk gate/ungate sequence. The issue
> is because ungate_work is queued but cannot be executed in time. In my
> case, I see the ungate_work is somehow delayed for about 150ms. This
> change has been tested by customers on multiple platforms. And you
> can tell from the code that it won't break anything. :)

Thanks so much for the details. Looks good to me.

Reviewed-by: Stanley Chu <stanley.chu@...iatek.com>

> 
> Thanks,
> 
> Can Guo.
> 
> >> ufshcd_ungate_work() in a paired way, increase it only if queue_work()
> >> returns true.
> >> 
> >> Signed-off-by: Can Guo <cang@...eaurora.org>
> >> Reviewed-by: Hongwu Su <hongwus@...eaurora.org>
> >> ---
> >>  drivers/scsi/ufs/ufshcd.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >> index 847f355..efa7d86 100644
> >> --- a/drivers/scsi/ufs/ufshcd.c
> >> +++ b/drivers/scsi/ufs/ufshcd.c
> >> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool 
> >> async)
> >>  		 */
> >>  		/* fallthrough */
> >>  	case CLKS_OFF:
> >> -		ufshcd_scsi_block_requests(hba);
> >>  		hba->clk_gating.state = REQ_CLKS_ON;
> >>  		trace_ufshcd_clk_gating(dev_name(hba->dev),
> >>  					hba->clk_gating.state);
> >> -		queue_work(hba->clk_gating.clk_gating_workq,
> >> -			   &hba->clk_gating.ungate_work);
> >> +		if (queue_work(hba->clk_gating.clk_gating_workq,
> >> +			       &hba->clk_gating.ungate_work))
> >> +			ufshcd_scsi_block_requests(hba);
> >>  		/*
> >>  		 * fall through to check if we should wait for this
> >>  		 * work to be done or not.
> > 
> > Thanks,
> > Stanley Chu