Message-ID: <573C3283.1040606@rock-chips.com>
Date:	Wed, 18 May 2016 17:14:43 +0800
From:	Shawn Lin <shawn.lin@...k-chips.com>
To:	Doug Anderson <dianders@...omium.org>
Cc:	shawn.lin@...k-chips.com, Jaehoon Chung <jh80.chung@...sung.com>,
	Seungwon Jeon <tgih.jun@...sung.com>,
	Ulf Hansson <ulf.hansson@...aro.org>,
	Alim Akhtar <alim.akhtar@...sung.com>,
	Sonny Rao <sonnyrao@...omium.org>,
	Heiko Stuebner <heiko@...ech.de>,
	Alexandru Stan <amstan@...omium.org>,
	Javier Martinez Canillas <javier.martinez@...labora.co.uk>,
	"open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	"linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mmc: dw_mmc: Consider HLE errors to be data and command
 errors

Hi

On 2016-5-18 12:12, Doug Anderson wrote:
> Hi,
>
> On Tue, May 17, 2016 at 6:59 PM, Shawn Lin
> <shawn.lin@...nel-upstream.org> wrote:
>> Could you try this patch to see if you can still find HLE?
>>
>> @@ -2356,12 +2356,22 @@ static void dw_mci_cmd_interrupt(struct dw_mci
>> *host, u32 status)
>>   static void dw_mci_handle_cd(struct dw_mci *host)
>>   {
>>          int i;
>> +       int present;
>>
>>          for (i = 0; i < host->num_slots; i++) {
>>                  struct dw_mci_slot *slot = host->slot[i];
>>
>>                  if (!slot)
>>                          continue;
>>
>> +               present = !(mci_readl(slot->host, CDETECT) & (1 <<
>> slot->id));
>> +               if (present)
>> +                       set_bit(DW_MMC_CARD_PRESENT, &slot->flags);
>> +               else
>> +                       clear_bit(DW_MMC_CARD_PRESENT, &slot->flags);
>
> No, because we don't use the builtin card detect on veyron.  ;)
>
> We use GPIO card detect because we didn't like the way JTAG and SD
> interacted.  Also on rk3288 the builtin card detect line had the wrong
> voltage domain (you couldn't detect a card when the IO lines were
> powered off).  The builtin card detect line is always driven low on
> veyron.

Okay, I see.

>
>
> I'm nearly certain that the root cause of my HLE errors is actually
> related to the same problem addressed by the commit 7c5209c315ea
> ("mmc: core: Increase delay for voltage to stabilize from 3.3V to
> 1.8V").  I think that on minnie we're still on the hairy edge and
> sometimes the line doesn't transition fast enough.

From your details, I don't think things are that simple.

Even without SD3.0 support enabled, I still see HLE sometimes, so it
seems the problem addressed by commit 7c5209c315ea is not the cause here.

The scenario looks like this:
remove sd-card -> mmc_sd_detect -> send status (CMD13) -> power_off ->
set_ios -> setup_bus -> clk disabled, and then the HLE irq storm starts

From the code of dw_mci_prepare_command:
SDMMC_CMD_PRV_DAT_WAIT is not set for CMD13, so we don't wait for busy
here. The command is then loaded into the dw_mmc queue but still fails
to be sent out because it is busy?
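
To make the CMD13 point concrete, here is a small userspace sketch of the flag selection being described. It is only a model: the function name, the constants, and the bit position are hypothetical stand-ins, not the driver's definitions, and it assumes the wait flag is set only for data commands other than CMD13.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; check the real driver headers for the actual bits. */
#define MODEL_CMD_SEND_STATUS   13u         /* CMD13 */
#define MODEL_CMD_WRITE_BLOCK   24u         /* CMD24, a data command */
#define MODEL_FLAG_PRV_DAT_WAIT (1u << 13)  /* hypothetical bit position */

/* Hypothetical, simplified stand-in for the command-flag selection. */
static uint32_t model_prepare_command(uint32_t opcode, bool has_data)
{
    uint32_t cmdr = opcode;

    /* Assumption: only data commands other than CMD13 wait for the
     * previous data transfer to finish before being sent. */
    if (opcode != MODEL_CMD_SEND_STATUS && has_data)
        cmdr |= MODEL_FLAG_PRV_DAT_WAIT;

    return cmdr;
}

/* Usage: CMD13 never carries the wait flag, a data write does. */
void model_prepare_command_demo(void)
{
    assert((model_prepare_command(MODEL_CMD_SEND_STATUS, false)
            & MODEL_FLAG_PRV_DAT_WAIT) == 0);
    assert((model_prepare_command(MODEL_CMD_WRITE_BLOCK, true)
            & MODEL_FLAG_PRV_DAT_WAIT) != 0);
}
```

In this model nothing holds CMD13 back, which is the situation the question above is about.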

With my patch, things go well:
remove sd-card -> clear the DW_MMC_CARD_PRESENT bit -> send status
(CMD13) returns directly -> power_off -> set_ios -> setup_bus ->
disable clk

So why should we allow querying the card status if we are sure the card
has been removed? I mean, no further commands should be delivered.
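
A rough userspace model of that idea, assuming a per-slot "card present" flag that is refreshed on card-detect and checked before any command is issued. All names here (model_slot, model_handle_cd, model_request, MODEL_ERR_NO_CARD) are hypothetical and only illustrate the control flow, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERR_NO_CARD (-1)  /* hypothetical error code for "no card" */

struct model_slot {
    bool card_present;  /* stands in for the DW_MMC_CARD_PRESENT flag */
    int  cmds_issued;   /* commands that actually reached the controller */
};

/* Card-detect path: refresh the cached presence flag. */
static void model_handle_cd(struct model_slot *slot, bool detected)
{
    slot->card_present = detected;
}

/* Request path: fail fast instead of queuing a command for a removed card. */
static int model_request(struct model_slot *slot)
{
    if (!slot->card_present)
        return MODEL_ERR_NO_CARD;

    slot->cmds_issued++;
    return 0;
}

/* Usage: after removal, the status query returns without touching hardware. */
void model_request_demo(void)
{
    struct model_slot slot = { .card_present = true, .cmds_issued = 0 };

    assert(model_request(&slot) == 0);
    model_handle_cd(&slot, false);              /* card pulled out */
    assert(model_request(&slot) == MODEL_ERR_NO_CARD);
    assert(slot.cmds_issued == 1);
}
```

The point of the sketch is only the ordering: the flag is cleared before the status query arrives, so the query never reaches the controller.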

And another question: should we wait for busy before sending CMD13?

>
> It appears that increasing this to 30ms avoids the HLE errors.
>
> I _think_ I can actually fully fix this properly by temporarily
> engaging the internal pull-ups while the voltage switch is happening.
> This will bleed away the voltage just a little bit faster (since lines
> are driven low here).  I'll try to confirm that.
>
>
> In any case, it seems like we should take this patch since (without
> this patch) the failure case when you get HLE errors is that the
> interrupt controller fires over and over again (with no printouts) and
> your system stalls with no error messages.

Sure, at least we need to address this irq storm...
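
As a toy illustration of why an unhandled status bit can turn into a storm, the sketch below models a handler that only acknowledges the bits in its handled mask. The bit positions and names are illustrative assumptions, not taken from the driver, and the real interrupt path is more involved than this.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_INT_RTO (1u << 8)   /* illustrative bit positions */
#define MODEL_INT_HLE (1u << 12)

/*
 * Count how many times the handler runs before no status bit is pending,
 * giving up after max_runs (which stands in for "storm").
 */
static int model_irq_runs(uint32_t pending, uint32_t handled_mask, int max_runs)
{
    int runs = 0;

    while (pending && runs < max_runs) {
        pending &= ~handled_mask;  /* handler acks only the bits it knows */
        runs++;
    }
    return runs;
}

/* Usage: HLE outside the handled mask never clears; inside it, one run suffices. */
void model_irq_demo(void)
{
    assert(model_irq_runs(MODEL_INT_HLE, MODEL_INT_RTO, 1000) == 1000);
    assert(model_irq_runs(MODEL_INT_HLE, MODEL_INT_RTO | MODEL_INT_HLE, 1000) == 1);
}
```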

>
> -Doug
>
>
>
