linux-kernel - Re: [PATCH v1 2/2] mmc: sdhci: Use the SW timer when the HW timer cannot meet the timeout value required by the device

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b7fd4a22-65f6-d1c4-675c-5930452a1fea@intel.com>
Date:   Wed, 29 Sep 2021 15:38:00 +0300
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Bean Huo <huobean@...il.com>, Ulf Hansson <ulf.hansson@...aro.org>
Cc:     Bean Huo <beanhuo@...ron.com>, linux-mmc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/2] mmc: sdhci: Use the SW timer when the HW timer
 cannot meet the timeout value required by the device

On 29/09/2021 13:49, Bean Huo wrote:
> Hi Adrian,
> 
> On Tue, 2021-09-28 at 13:18 +0300, Adrian Hunter wrote:
>> On 25/09/2021 00:33, Bean Huo wrote:
>>> On Fri, 2021-09-24 at 16:26 +0300, Adrian Hunter wrote:
>>>> On 24/09/21 4:08 pm, Bean Huo wrote:
>>>>> On Fri, 2021-09-24 at 15:17 +0300, Adrian Hunter wrote:
>>>>>>>>>          sdhci_writeb(host, count,
>>>>>>>>> SDHCI_TIMEOUT_CONTROL);
>>>>>>>>> }
>>>>>>>>> The driver has detected that the hardware timer cannot
>>>>>>>>> meet
>>>>>>>>> the
>>>>>>>>> timeout
>>>>>>>>> requirements of the device, but we still use the
>>>>>>>>> hardware
>>>>>>>>> timer,
>>>>>>>>> which will
>>>>>>>>> allow potential timeout issuea . Rather than allowing a
>>>>>>>>> potential
>>>>>>>>> problem to exist, why can’t software timing be used to
>>>>>>>>> avoid
>>>>>>>>> this
>>>>>>>>> problem?
>>>>>>>> Timeouts aren't that accurate.  The maximum is assumed
>>>>>>>> still
>>>>>>>> to
>>>>>>>> work.
>>>>>>>> mmc->max_busy_timeout is used to tell the core what the
>>>>>>>> maximum
>>>>>>>> is.
>>>>>>> mmc->max_busy_timeout is still a representation of Host HW
>>>>>>> timer
>>>>>>> maximum timeout count, isn't it? 
>>>>>>
>>>>>> Not necessarily.  For SDHCI_QUIRK2_DISABLE_HW_TIMEOUT it
>>>>>> would be
>>>>>>
>>>>>> set to zero to indicate no maximum.
>>>>>
>>>>> yes, this is the purpose of the patch, for the host controller
>>>>> without
>>>>> quirk SDHCI_QUIRK2_DISABLE_HW_TIMEOUT, if the timeout count
>>>>> required by
>>>>> device is beyond the HW timer max count, we choose SW timer to
>>>>> avoid the HW timer timeout IRQ.
>>>>>
>>>>> I don't know if I get it correctly.
>>>>
>>>> Why can't drivers that want the behaviour just set the quirk?
>>>>
>>>> Drivers that do not work with the quirk, do not have to set it.
>>>
>>> Adrian,
>>>
>>> We cannot add this quirk to every host driver.
>>
>> I was suggesting only the ones for which it works.
>>
>>>  This is the difference
>>> on the device side.
>>
>> It is the host controller that has the problem, not the device.
>> Hence the quirk.
>>
>>> In addition, I don't know what the maximum hardware
>>> timer budget for each host is.
>>
>> mmc->max_busy_timeout is calculated by sdhci.c, or the driver can
>> override the maximum count via ->get_max_timeout_count() or the
>> driver
>> can override mmc->max_busy_timeout.
>>
>> With the quirk, sdhci.c will usually set mmc->max_busy_timeout to
>> zero.
>> That allows timeouts greater than the hardware can support, and then,
>> with the quirk, the driver will switch to a software timeout when
>> needed.
>>
> 
> 
> According to your above statement, do you mean that the eMMC host
> controller does not support the scenario where the data transmission
> timeout value required by the eMMC device is greater than the capacity
> of the eMMC host hardware data transmission timer? Unless the eMMC host
> vendor accepts their eMMC host accepts the use of SW timers in this
> case (adding quirks)?

Yes

>> However, that won't work for every host controller, because some do
>> not
>> provide a completion interrupt after the timeout, even if the timeout
>> interrupt is disabled.  
> 
> Do you mean that if we disable the hardware timer/timeout interrupt and
> use the software timer, the eMMC host controller will not trigger a
> completion interrupt?

Yes.  On some hardware, if the timeout interrupt is disabled, and the
operation takes longer than the hardware timeout value, then there is
no completion interrupt.  It is as if the timeout "happens" even if the
interrupt is disabled.

> Even before the SW timer expires, the data
> transfer between the host and the eMMC device is complete? Is this what
> you mean?

I'm not 100% sure about what happens if the operation completes
before the hardware timeout value, but I think it gets a completion
interrupt same as normal.

> 
> 
>> That means they should set mmc->max_busy_timeout
>> to the hardware value.  Hence the quirk is needed to tell the
>> difference.
>>
> 
> This means this quirk is for eMMC host can privode the completion
> interrupt in case HW timer is disabled?  

Yes

>>> Even if you use the same SOC, the
>>> maximum time budget on different platforms may be different.
>>
>> The mmc core calculates timeouts based on the relevant standards and
>> values provided by the device itself.
> 
> 
> Yes, but the eMMC standard does not define the maximum timeout value.
> Different eMMC will have different timeout values.
> 
>>
>>> Assume that the maximum timeout time supported by the hardware
>>> timer is
>>> 100 milliseconds
>>
>> I realise it is an example, but 100 milliseconds is a bit low. Legacy
>> host controllers have always had to deal with standard SD card and
>> MMC card timeouts.  SD card write timeout of 500 milliseconds for
>> instance.
> 
> 
> Yes, this is just an example. I have several platforms, they are TI,
> NXP, Intel and Qcom. The timeout time of the hardware timer is
> different, the greatest one is 1.3s, some are less than 500ms.
> 
>>
>>> but the maximum data transmission time required by
>>> the device is 150 milliseconds. In most cases, data transfer will
>>> not
>>> take so long. 150 is the maximum time under extreme conditions.
>>> This
>>> means that the device just needs to complete a data transfer within
>>>> 100ms and keep the data line busy. If we still use the HW timer,
>>>> it
>>> will trigger a DATA LINE timeout interrupt.
>>>
>>> This patch does not affect scenarios where the hardware timer meets
>>> the
>>> max data transmission time requirements of the device. It will
>>> still
>>> use the hardware timer. Only when the device changes, will it
>>> switch to
>>> using the SW timer.
>>
>> Which is what the quirk does.  So I am very confused why the quirk is
>> no goo
> 
> Because I don't know what the maximum volume of the real hardware timer
> of each host controller is.> And different hosts have different timer
> capacities.
> Meanwhile, the eMMC devices have different data
> transmission timeout between the different density/series as well.
> 
> Would you please confirm your three points above? If they are true, I
> agree we cannot disable hardware timers and use SW tiner on some
> platforms.

Yes, I expect the quirk will do what you need.  Have you tried it?