Message-ID: <6c029b64cb4d78e7624bc896f9c9f16d@codeaurora.org>
Date: Mon, 26 Oct 2020 14:43:53 +0800
From: Can Guo <cang@...eaurora.org>
To: Jaegeuk Kim <jaegeuk@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
linux-f2fs-devel@...ts.sourceforge.net, kernel-team@...roid.com,
alim.akhtar@...sung.com, avri.altman@....com, bvanassche@....org
Subject: Re: [PATCH v3 1/5] scsi: ufs: atomic update for clkgating_enable
On 2020-10-26 14:13, Jaegeuk Kim wrote:
> On 10/26, Can Guo wrote:
>> On 2020-10-24 23:06, Jaegeuk Kim wrote:
>> > From: Jaegeuk Kim <jaegeuk@...gle.com>
>> >
>> > When giving a stress test which enables/disables clkgating, we hit device
>> > timeout sometimes. This patch avoids subtle racy condition to address it.
>> >
>> > If we use __ufshcd_release(), I've seen that gate_work can be called in
>> > parallel with ungate_work, which results in UFS timeout when doing
>> > hibern8. Should avoid it.
>> >
>>
>> I don't understand this comment. gate_work and ungate_work are queued on
>> an ordered workqueue and an ordered workqueue executes at most one work
>> item at any given time in the queued order. How can the two run in
>> parallel?
>
> When I hit UFS stuck, I saw this by clkgating tracepoint.
>
> - REQ_CLKS_OFF
> - CLKS_OFF
> - REQ_CLKS_OFF
> - REQ_CLKS_ON
> ..
>

I don't see how you can tell from the above trace alone that the two
works are running in parallel. May I know what the exact error is behind
"UFS timeout when doing hibern8"?

With __ufshcd_release() here, I do see one potential issue if your test
toggles clk_gating quickly (disable, enable, disable, enable):
__ufshcd_release() gets called twice, meaning we queue two gate_works
back to back. So can you try the code below and let me know whether it
helps? I am OK with your current change, but I would like to understand
the problem. Thanks.

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 1791bce..3eee438 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2271,6 +2271,8 @@ static void ufshcd_gate_work(struct work_struct *work)
 	unsigned long flags;
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	if (hba->clk_gating.state == CLKS_OFF)
+		goto rel_lock;
 	/*
 	 * In case you are here to cancel this work the gating state
 	 * would be marked as REQ_CLKS_ON. In this case save time by
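
To make the scenario concrete, here is a minimal userspace sketch of what
the extra check is meant to do. It is not driver code: the model_* names
and the hibern8_entries counter are made up for illustration, and it
assumes the gating state is not changed between the two gate work runs,
which may or may not match what actually happens in the failing test.

```c
#include <stdbool.h>

/* Same state names as the driver, but this enum is only a model. */
enum model_clk_state { CLKS_OFF, CLKS_ON, REQ_CLKS_OFF, REQ_CLKS_ON };

/* Hypothetical stand-in for the gating state kept in the hba. */
struct model_gating {
	enum model_clk_state state;
	int hibern8_entries;	/* times the model "entered hibern8" */
};

/* Models one run of the gate work, with or without the proposed check. */
static void model_gate_work(struct model_gating *g, bool with_guard)
{
	/* Proposed check: clocks are already off, nothing to gate. */
	if (with_guard && g->state == CLKS_OFF)
		return;

	/* Existing cancel path: an ungate request arrived in the meantime. */
	if (g->state == REQ_CLKS_ON) {
		g->state = CLKS_ON;
		return;
	}

	/* Otherwise gate: enter hibern8 and mark the clocks as off. */
	g->hibern8_entries++;
	g->state = CLKS_OFF;
}

/* Runs two gate works back to back and reports the hibern8 entries. */
static int model_two_gate_works(bool with_guard)
{
	struct model_gating g = { .state = REQ_CLKS_OFF, .hibern8_entries = 0 };

	model_gate_work(&g, with_guard);
	model_gate_work(&g, with_guard);
	return g.hibern8_entries;
}
```

In this model, model_two_gate_works(false) enters hibern8 twice, while
model_two_gate_works(true) enters it once because the second run bails
out at the CLKS_OFF check.
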
Regards,
Can Guo.
> By using active_req, I don't see any problem.
>
>>
>> Thanks,
>>
>> Can Guo.
>>
>> > Signed-off-by: Jaegeuk Kim <jaegeuk@...gle.com>
>> > ---
>> > drivers/scsi/ufs/ufshcd.c | 12 ++++++------
>> > 1 file changed, 6 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> > index b8f573a02713..e0b479f9eb8a 100644
>> > --- a/drivers/scsi/ufs/ufshcd.c
>> > +++ b/drivers/scsi/ufs/ufshcd.c
>> > @@ -1807,19 +1807,19 @@ static ssize_t ufshcd_clkgate_enable_store(struct device *dev,
>> >  		return -EINVAL;
>> >
>> >  	value = !!value;
>> > +
>> > +	spin_lock_irqsave(hba->host->host_lock, flags);
>> >  	if (value == hba->clk_gating.is_enabled)
>> >  		goto out;
>> >
>> > -	if (value) {
>> > -		ufshcd_release(hba);
>> > -	} else {
>> > -		spin_lock_irqsave(hba->host->host_lock, flags);
>> > +	if (value)
>> > +		hba->clk_gating.active_reqs--;
>> > +	else
>> >  		hba->clk_gating.active_reqs++;
>> > -		spin_unlock_irqrestore(hba->host->host_lock, flags);
>> > -	}
>> >
>> >  	hba->clk_gating.is_enabled = value;
>> >  out:
>> > +	spin_unlock_irqrestore(hba->host->host_lock, flags);
>> >  	return count;
>> >  }
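
For reference, the store logic in the quoted patch can be modelled in
userspace roughly as below. This is only a sketch under assumptions: a
pthread mutex stands in for host_lock, struct model_hba and the model_*
helpers are hypothetical names, and nothing here touches real clocks. It
shows that with the compare and the counter update under one lock,
repeated writes of the same value leave active_reqs unchanged.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the clk_gating fields the patch touches. */
struct model_hba {
	pthread_mutex_t lock;	/* plays the role of host_lock */
	bool is_enabled;
	int active_reqs;
};

/* Mirrors the patched store: compare and update under a single lock. */
static void model_clkgate_enable_store(struct model_hba *hba, bool value)
{
	pthread_mutex_lock(&hba->lock);
	if (value == hba->is_enabled)
		goto out;

	if (value)
		hba->active_reqs--;	/* enabling drops the held request */
	else
		hba->active_reqs++;	/* disabling holds one request */

	hba->is_enabled = value;
out:
	pthread_mutex_unlock(&hba->lock);
}

/* Applies the first n writes to a fresh model and returns active_reqs. */
static int model_active_reqs_after(const int *values, int n)
{
	struct model_hba hba = { .is_enabled = true, .active_reqs = 0 };
	int i, ret;

	pthread_mutex_init(&hba.lock, NULL);
	for (i = 0; i < n; i++)
		model_clkgate_enable_store(&hba, values[i] != 0);
	ret = hba.active_reqs;
	pthread_mutex_destroy(&hba.lock);
	return ret;
}
```

Starting from the enabled state, writing 0 twice leaves active_reqs at 1,
and writing 1 twice afterwards brings it back to 0.
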