linux-kernel - Re: [PATCH 1/2] mmc: mmci: enable MMC_CAP_NEED_RSP


Message-ID: <1c1814dc-f87b-ef5c-24b4-b9a6ec570dbc@foss.st.com>
Date:   Mon, 8 Feb 2021 13:16:08 +0100
From:   Yann GAUTIER <yann.gautier@...s.st.com>
To:     Ulf Hansson <ulf.hansson@...aro.org>
CC:     Russell King <linux@...linux.org.uk>,
        Linus Walleij <linus.walleij@...aro.org>,
        <ludovic.barre@...s.st.com>,
        Marek Vašut <marex@...x.de>,
        "linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] mmc: mmci: enable MMC_CAP_NEED_RSP_BUSY

On 2/5/21 1:19 PM, Yann GAUTIER wrote:
> On 2/5/21 10:53 AM, Ulf Hansson wrote:
>> - trimmed cc-list
>>
>> On Thu, 4 Feb 2021 at 13:08, <yann.gautier@...s.st.com> wrote:
>>>
>>> From: Yann Gautier <yann.gautier@...s.st.com>
>>>
>>> To properly manage commands awaiting R1B responses, the capability
>>> MMC_CAP_NEED_RSP_BUSY is enabled in mmci driver, for variants that
>>> manage busy detection.
>>> This R1B management needs both the flags MMC_CAP_NEED_RSP_BUSY and
>>> MMC_CAP_WAIT_WHILE_BUSY to be enabled together.
>>
>> Would it be possible for you to share a little bit more about the
>> problem? Like under what circumstances do things screw up?
>>
>> Does the issue occur only when cmd->busy_timeout becomes larger
>> than host->max_busy_timeout, or in other cases too?
>>
>>>
>>> Signed-off-by: Yann Gautier <yann.gautier@...s.st.com>
>>> ---
>>>   drivers/mmc/host/mmci.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
>>> index 1bc674577ff9..bf6971fdd1a6 100644
>>> --- a/drivers/mmc/host/mmci.c
>>> +++ b/drivers/mmc/host/mmci.c
>>> @@ -2148,7 +2148,7 @@ static int mmci_probe(struct amba_device *dev,
>>>                  if (variant->busy_dpsm_flag)
>>>                          mmci_write_datactrlreg(host,
>>>                                                 host->variant->busy_dpsm_flag);
>>> -               mmc->caps |= MMC_CAP_WAIT_WHILE_BUSY;
>>> +               mmc->caps |= MMC_CAP_WAIT_WHILE_BUSY | MMC_CAP_NEED_RSP_BUSY;
>>
>> This isn't correct as the ux500 (and likely also other legacy
>> variants) don't need this. I have tried it in the past and it works
>> fine for ux500 without MMC_CAP_NEED_RSP_BUSY.
>>
>> The difference is rather that the busy detection for stm32 variants
>> needs a corresponding HW busy timeout to be set (its
>> variant->busy_timeout flag is set). Perhaps we can use that
>> information instead?
>>
>> Note that MMC_CAP_NEED_RSP_BUSY means that cmd->busy_timeout will
>> not be set by the core for erase commands, CMD5 and CMD6.
>>
>> By looking at the code in mmci_start_command(), it looks like we will
>> default to a timeout of 10s, when cmd->busy_timeout isn't set. At
>> least for some erase requests, that won't be sufficient. Would it be
>> possible to disable the HW busy timeout in some way - and maybe use a
>> software timeout instead? Maybe I already asked Ludovic about this?
>> :-)
>>
>> BTW, did you check that the MMCIDATATIMER does get the correct value
>> set for the timer in mmci_start_command() and if
>> host->max_busy_timeout gets correctly set in
>> mmci_set_max_busy_timeout()?
>>
>> [...]
>>
>> Kind regards
>> Uffe
>>
> 
> Hi Ulf,
> 
> Thanks for the hints.
> I'll check all of that and get back with updated patches.
> 
> As I tried to explain in the cover letter and in reply to Adrian, I saw
> a freeze (BUSYD0) in test 37 during the MMC_ERASE command with
> SECURE_ERASE_ARG, when running this test just after test 36 (or any
> other write test). But maybe, as you said, that's mostly an incorrect
> timeout issue.
> 
> Regards,
> Yann

Hi,

I ran some extra tests, and the timeout value set in MMCIDATATIMER
corresponds to the one computed:
card->ext_csd.erase_group_def is set to 1 in mmc_init_card().
In mmc_mmc_erase_timeout(), we have:
erase_timeout = card->ext_csd.hc_erase_timeout; // 300ms * 0x07 (for the eMMC card I have: THGBMDG5D1LBAIL)
erase_timeout *= card->ext_csd.sec_erase_mult; // 0xDC
erase_timeout *= qty; // 32 (from = 0x1d0000, to = 0x20ffff)

This leads to a timeout of 14784000ms (~4 hours).
The max_busy_timeout is 86767ms.
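
For reference, the arithmetic above can be reproduced with a small
standalone sketch. This is only a model of the three multiplications
listed above, not the kernel's mmc_mmc_erase_timeout(); the function
name, the "secure" flag and the parameter names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only; names below are not kernel symbols. */
#define ERASE_TIMEOUT_UNIT_MS 300u /* per-unit timeout quoted above */

uint64_t model_erase_timeout_ms(uint32_t hc_mult, uint32_t sec_erase_mult,
				bool secure, uint32_t qty)
{
	uint64_t timeout_ms;

	assert(qty > 0); /* at least one erase group */

	/* hc_erase_timeout = 300ms * multiplier (0x07 for this card) */
	timeout_ms = (uint64_t)ERASE_TIMEOUT_UNIT_MS * hc_mult;

	/* secure erase scales the timeout by sec_erase_mult (0xDC here) */
	if (secure)
		timeout_ms *= sec_erase_mult;

	/* one timeout per erase group in the requested range */
	return timeout_ms * qty;
}
```

With hc_mult = 0x07, sec_erase_mult = 0xDC and qty = 32, the secure
case gives 14784000, i.e. the ~4 hours quoted above, far beyond the
86767ms max_busy_timeout.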

After those 4 hours, I get this message:
mmc1: Card stuck being busy! __mmc_poll_for_busy

The second erase with MMC_ERASE_ARG finds an erase timeout of 67200ms,
and uses an R1B command.
But since the first erase failed, DPSMACT is still enabled and the busy
timeout doesn't seem to fire. Something may be missing in the error path.
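
A rough sketch of why the two erases take different paths, assuming
the core only uses an R1B response when the computed timeout fits
within host->max_busy_timeout and falls back to polling otherwise.
The helper below is hypothetical and ignores MMC_CAP_NEED_RSP_BUSY;
it is a model of that comparison, not the core code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the erase would be sent with
 * an R1B response, false when busy would be polled in software.
 * Assumption: max_busy_timeout_ms == 0 means "no host limit".
 */
bool model_erase_uses_r1b(uint64_t busy_timeout_ms,
			  uint64_t max_busy_timeout_ms)
{
	assert(busy_timeout_ms > 0); /* an erase always has a timeout */

	if (max_busy_timeout_ms && busy_timeout_ms > max_busy_timeout_ms)
		return false; /* too long for the HW timer: poll instead */

	return true; /* fits: let the host handle busy with R1B */
}
```

With max_busy_timeout = 86767ms, the 14784000ms secure erase ends up
on the polling path (hence the __mmc_poll_for_busy message), while
the 67200ms erase fits and goes out as R1B.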

Anyway, I'll push an update of the second patch of the series, and just 
drop this first one.


Regards,
Yann