Date:	Thu, 31 Mar 2016 09:56:18 +0800
From:	Shawn Lin <shawn.lin@...k-chips.com>
To:	Russell King - ARM Linux <linux@....linux.org.uk>,
	Enric Balletbo Serra <eballetbo@...il.com>
Cc:	shawn.lin@...k-chips.com, Doug Anderson <dianders@...omium.org>,
	"linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Alim Akhtar <alim.akhtar@...il.com>,
	Jaehoon Chung <jh80.chung@...sung.com>,
	Ulf Hansson <ulf.hansson@...aro.org>,
	Alim Akhtar <alim.akhtar@...sung.com>,
	Sonny Rao <sonnyrao@...omium.org>,
	Andrew Bresticker <abrestic@...omium.org>,
	Heiko Stuebner <heiko@...ech.de>,
	Addy Ke <addy.ke@...k-chips.com>,
	Alexandru Stan <amstan@...omium.org>,
	Chris Zhong <zyw@...k-chips.com>,
	Caesar Wang <wxt@...k-chips.com>,
	Javier Martinez Canillas <javier@....samsung.com>
Subject: Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors

On 2016/3/31 1:26, Russell King - ARM Linux wrote:
> On Wed, Mar 30, 2016 at 07:16:18PM +0200, Enric Balletbo Serra wrote:
>> 2016-03-24 17:22 GMT+01:00 Russell King - ARM Linux <linux@....linux.org.uk>:
>>> On Thu, Mar 24, 2016 at 09:06:45AM -0700, Doug Anderson wrote:
>>>> Russell,
>>> ...
>>>> Presumably this is similar to what you saw: the host saw the CRC error
>>>> but the card knew nothing about it.  Sending the stop command during
>>>> this time confused the card.  Presumably the card was in transfer
>>>> state during this time?
>>>
>>> If the card was in transfer state for a command which expects a stop
>>> command, and that stop command was issued after the card entered
>>> the transfer state, then I'd expect the card to handle it... though
>>> there's always the firmware bug issue.
>>>
>>> If the card hadn't entered transfer state at the time the stop command
>>> was issued.. I think that's more likely to hit card firmware issues.
>>>
>>> With the tuning commands, there's another case you can hit though:
>>> the data transfer may have completed before you get around to sending
>>> the stop command.
>>>
>>> That's why, for sdhci, I came to the conclusion that waiting for the
>>> data transfer to complete or timeout was the best solution for SDHCI.
>>>
>>
>> In fact I only saw the problem with dw_mmc-exynos; on dw_mmc-rockchip
>> it doesn't happen because it enables the DW_MCI_QUIRK_BROKEN_DTO
>> behaviour. What this does is use a kernel timer to signal when the DTO
>> interrupt does NOT come. Note that if I disable this quirk I can also
>> see the problem on rockchip.
>>
>>> Maybe, if sending a STOP command does cause card firmware issues, then:
>>>
>>> 1) it provides evidence that trying to send a stop command on response
>>>     CRC error is the wrong thing to do (it was talked about making SDHCI
>>>     do this.)
>>>
>>
>> It seems the same here, so I guess it is the wrong thing to do.
>>
>>> 2) it suggests that the solution I came up with for SDHCI is the better
>>>     solution, rather than trying to immediately recover the situation by
>>>     sending a STOP command.
>>>
>>
>> I'm wondering whether just enabling this quirk on exynos too is the proper
>> solution. Unfortunately I don't have enough documentation to check the
>> differences between those controllers.
>> It would also really help to have access to some hardware that uses
>> dw_mmc-pltfm to check whether, like on exynos, the same issue is triggered.
>> Can anyone with the hardware do some tests?
>
> I'd really suggest that the dw-mmc folk place a moratorium on quirk
> flags, and instead deal with situations like this without resorting
> to this kind of thing.
>

Some quirks and callbacks have already been cleaned up in Jaehoon's repo, and
more are going to be removed. Eventually we plan to turn the dw_mmc core
into a pure library.

> sdhci is a good example why the quirk flag approach is totally wrong,
> and shows that it leads to an unmaintainable mess.  If dw-mmc people
> don't want the driver to descend into the same state that sdhci is in,
> then things like this should not be quirks.  sdhci already has a
> long-term moratorium on quirk flags until the resulting mess has been
> cleaned up.
>
> The danger that quirk flags cause is also highlighted in your mail:
> it's very likely that this _isn't_ a host controller issue at all,

Two issues were found on the dw_mmc-rockchip side:
(1) The idma needs a reset when switching between fifo-transfer and
idma-transfer. When biu:ciu > 1:6, the idma's internal fsm risks a race
condition in maintaining its fifo lookup pointer. It is very easy to
reproduce by setting biu:ciu > 1:6. This is a common bug for dw_mmc, but
unfortunately these details were missing from the commit msg.

(2) Missing DTO/DRTO: I missed the thread on this topic, so I need to
reproduce it by setting up a simple C model. I can't say more
until we find a way to reproduce it easily. But I guess
it's NOT a host issue, since I glanced at the TMOUT reg in the
dw_mmc databook and found a software timer requirement:

31:8 data_timeout 0xffffff
Value for card Data Read Timeout; same value also used for Data
Starvation by Host timeout. The timeout counter is started only after
the card clock is stopped. Value is in number of card output clocks –
cclk_out of selected card.

Note: The software timer should be used if the timeout value is in the 
order of 100 ms. In this case, read data timeout interrupt needs to be 
disabled.
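
To put rough numbers on that note, here is a small sketch. It is not dw_mmc
driver code; the function names and the threshold constant are made up for
illustration. It only relies on what the databook text above says:
data_timeout is the 24-bit field TMOUT[31:8], it counts cclk_out cycles, and
a software timer is suggested once the timeout is in the order of 100 ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TMOUT[31:8] is 24 bits wide, so the largest programmable count is 0xffffff. */
#define DATA_TIMEOUT_MAX_CYCLES 0xffffffu

/* Assumed cut-off, taken loosely from the note ("in the order of 100 ms"). */
#define SW_TIMER_THRESHOLD_MS 100u

/* Convert a data_timeout count (in cclk_out cycles) to milliseconds, rounding up. */
static uint32_t data_timeout_ms(uint32_t cycles, uint32_t cclk_hz)
{
    assert(cclk_hz > 0);
    return (uint32_t)(((uint64_t)cycles * 1000u + cclk_hz - 1) / cclk_hz);
}

/* True when the timeout is long enough that the note suggests a software timer. */
static bool wants_software_timer(uint32_t timeout_ms)
{
    return timeout_ms >= SW_TIMER_THRESHOLD_MS;
}
```

With the default 0xffffff and an assumed 50 MHz cclk_out this gives about
336 ms, and far more at a 400 kHz identification clock, so the default value
already falls in the range where the note asks for a software timer.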

> but a MMC protocol issue or a card issue - and the behaviour required
> here is not specific to any particular host controller.  The problem
> with having a quirk flag for it is that you end up with some hosts
> enabling it, and other hosts having it disabled only because they
> haven't yet tripped over the issue.
>


-- 
Best Regards
Shawn Lin
