[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dce1e8bd-c21c-202f-3e56-70be6e743a18@collabora.com>
Date: Thu, 9 Dec 2021 10:12:29 +0100
From: AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>
To: Yong Wu <yong.wu@...iatek.com>,
Krzysztof Kozlowski <krzysztof.kozlowski@...onical.com>
Cc: Krzysztof Kozlowski <krzk@...nel.org>,
Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Robin Murphy <robin.murphy@....com>,
Tomasz Figa <tfiga@...omium.org>,
linux-mediatek@...ts.infradead.org, srv_heupstream@...iatek.com,
linux-kernel@...r.kernel.org, devicetree@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux-foundation.org, youlin.pei@...iatek.com,
anan.sun@...iatek.com, lc.kan@...iatek.com, yi.kuo@...iatek.com,
anthony.huang@...iatek.com, Rob Herring <robh+dt@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>
Subject: Re: [PATCH 3/4] memory: mtk-smi: Add sleep ctrl function
Il 08/12/21 03:42, Yong Wu ha scritto:
> On Tue, 2021-12-07 at 13:16 +0100, AngeloGioacchino Del Regno wrote:
>> Il 07/12/21 13:10, Yong Wu ha scritto:
>>> On Tue, 2021-12-07 at 09:56 +0100, AngeloGioacchino Del Regno
>>> wrote:
>>>> Il 07/12/21 07:24, Yong Wu ha scritto:
>>>>> Hi AngeloGioacchino,
>>>>>
>>>>> Thanks for your review.
>>>>>
>>>>> On Mon, 2021-12-06 at 16:08 +0100, AngeloGioacchino Del Regno
>>>>> wrote:
>>>>>> Il 03/12/21 07:40, Yong Wu ha scritto:
>>>>>>> sleep control means that when the larb go to sleep, we
>>>>>>> should
>>>>>>> wait
>>>>>>> a bit
>>>>>>> until all the current commands are finished. thus, when the
>>>>>>> larb
>>>>>>> runtime
>>>>>>> suspend, we need enable this function to wait until all the
>>>>>>> existed
>>>>>>> command are finished. when the larb resume, just disable
>>>>>>> this
>>>>>>> function.
>>>>>>> This function only improve the safe of bus. Add a new flag
>>>>>>> for
>>>>>>> this
>>>>>>> function. Prepare for mt8186.
>>>>>>>
>>>>>>> Signed-off-by: Anan Sun <anan.sun@...iatek.com>
>>>>>>> Signed-off-by: Yong Wu <yong.wu@...iatek.com>
>>>>>>> ---
>>>>>>> drivers/memory/mtk-smi.c | 39
>>>>>>> +++++++++++++++++++++++++++++++++++----
>>>>>>> 1 file changed, 35 insertions(+), 4 deletions(-)
>>>>>
>>>>> [snip]
>>>>>
>>>>>>> static int __maybe_unused mtk_smi_larb_suspend(struct
>>>>>>> device
>>>>>>> *dev)
>>>>>>> {
>>>>>>> struct mtk_smi_larb *larb =
>>>>>>> dev_get_drvdata(dev);
>>>>>>> + int ret = 0;
>>>>>>> +
>>>>>>> + if (MTK_SMI_CAPS(larb->larb_gen->flags_general,
>>>>>>> MTK_SMI_FLAG_SLEEP_CTL))
>>>>>>> + ret = mtk_smi_larb_sleep_ctrl(dev, true);
>>>>>>
>>>>>> Sorry but what happens if SLP_PROT_RDY is not getting set
>>>>>> properly?
>>>>>> From what I can understand in the commit description that
>>>>>> you
>>>>>> wrote,
>>>>>> if we reach
>>>>>> the timeout, then the LARB transactions are not over....
>>>>>>
>>>>>> I see that you are indeed returning a failure here, but you
>>>>>> are
>>>>>> also
>>>>>> turning off
>>>>>> the clocks regardless of whether we get a failure or a
>>>>>> success;
>>>>>> I'm
>>>>>> not sure that
>>>>>> this is right, as this may leave the hardware in an
>>>>>> unpredictable
>>>>>> state (since
>>>>>> there were some more LARB transactions that didn't go
>>>>>> through),
>>>>>> leading to crashes
>>>>>> at system resume (or when retyring to suspend).
>>>>>
>>>>> Thanks for this question. In theory you are right. In this
>>>>> case,
>>>>> the
>>>>> bus already hang.
>>>>>
>>>>> We only printed a fail log in this patch. If this fail happens,
>>>>> we
>>>>> should request the master to check which case cause the larb
>>>>> hang.
>>>>>
>>>>> If the master has a good reason or limitation, the hang is
>>>>> expected, I
>>>>> think we have to add larb reset in this fail case: Reset the
>>>>> larb
>>>>> when
>>>>> the larb runtime resume.
>>>>>
>>>>
>>>> Think about the case in which the system gets resumed only
>>>> partially
>>>> due to a
>>>>
>>>> failure during resume of some driver, or due to a RTC or arch
>>>> timer
>>>> resume and
>>>> suspend right after... or perhaps during runtime suspend/resume
>>>> of
>>>> some devices.
>>>> In that case, we definitely want to avoid any kind of failure
>>>> point
>>>> that would
>>>> lead to a system crash, or any kind of user noticeable (or UX
>>>> disrupting) "strange
>>>> behavior".
>>>>
>>>> I think that we should make sure that the system suspends
>>>> cleanly,
>>>> instead of
>>>> patching up any possible leftover issue at resume time: if this
>>>> is
>>>> doable with
>>>> a LARB reset in suspend error case, that looks like being a good
>>>> option indeed.
>>>>
>>>> As a side note, thinking about UX, losing a little more time
>>>> during
>>>> suspend is
>>>> nothing really noticeable for the user... on the other hand,
>>>> spending
>>>> more time
>>>> during resume may be something noticeable to the user.
>>>> For this reason, I think that guaranteeing that the system
>>>> resumes as
>>>> fast as
>>>> possible is very important, which adds up to the need of
>>>> suspending
>>>> cleanly.
>>>
>>> Thanks for this comment. I will put it in the suspend when adding
>>> the
>>> reset. But I have no plan to add it in this version since I don't
>>> see
>>> the need for this right now. Maybe I should add a comment in the
>>> code
>>> for this.
>>>
>>
>> What I understand from your reply is that the reset is not trivial
>> work
>
> Yes. the reset bit is in different register regions, like mmsys,
> vdecsys. But the main problem is that I don't see why we need that. We
> never that problem.
>
The fact that we didn't get any "visible" problem with that is very good,
indeed, but having a recovery mechanism in place in the event that something
like that happens is going to be helpful in the future, as driver updates (either
to support new SoCs or Linux API changes, or new APIs) may produce unexpected
results sometimes and this will make sure that, despite there may be a problem,
the hardware will still work even before solving the producer of the issue.
Sometimes it may happen that solving an issue is nothing trivial, hence requires
a lot of time, and that's the main usefulness of that - and it's as useful as
your *great* idea of enabling SLP_PROT_RDY to check on the bus.
> The sleep ctrl function is just for the safety of the bus. If we have
> not it, It also should be ok...If not, the question is: why does the
> larb master device call pm_runtime_put before his HW finish the job?
>
I agree on the fact that calling pm_runtime_put before the HW finishes the
job is something that should *never* happen.
>> and needs quite some time to be done properly; in that case: yes,
>> please
>> add a TODO comment that explains the situation and the discussed
>> solution.
>>
>> Also, since this SLP_PROT_RDY flag seems to be very nice, just a
>> simple question: is this a new feature in the SMI IP of MT8186, or is
>> there anything similar that we may use on other SoCs, like 8183,
>> 8192, 8195, as a follow-up of this series?
>
> All the three SoC support this function. I expect the later SoC will
> support this. but the previous SoC has already MP... If someone has
> already tested ok after adding it for the previous SoC, I'm ok of
> course.
>
Thanks for the information. Again, this feature is very nice, so if it can
be used on any other SoC, it's going to be helpful in the future.
I will do some research.
Regards,
- Angelo
Powered by blists - more mailing lists