Date:	Thu, 12 Jun 2014 16:51:04 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Ulf Hansson <ulf.hansson@...aro.org>,
	Chris Ball <chris@...ntf.net>,
	Peter Maydell <peter.maydell@...aro.org>
Cc:	Johan Rudholm <jrudholm@...il.com>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	"Theodore Ts'o" <tytso@....edu>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On Wed, Jun 11, 2014 at 10:35 PM, John Stultz <john.stultz@...aro.org> wrote:
> Bisecting this points to: e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
> (mmc: mmci: Handle CMD irq before DATA irq). Which I guess shouldn't
> be surprising, as I saw problems with that patch earlier in the
> 3.15-rc cycle:
>     https://lkml.org/lkml/2014/4/14/824
>
[...]
>
> Unfortunately reverting the change (manually, as it doesn't revert
> cleanly anymore) doesn't seem to completely avoid the issue, so the
> bisection may have gone slightly astray (though it is interesting it
> landed on the same commit I earlier had trouble with). So I'll
> back-track and double check some of the last few "good" results to
> validate I didn't just luck into 3 good boots accidentally. I'll also
> review my revert in case I missed something subtle in doing it
> manually.

So I'm getting some baffling results. I started going back over the
git bisect logs to see if I had mis-marked a revision as good due to
the issue just not reproducing.

However, despite many, many reboots, the last good commit in my branch,
bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3 (mmc: block: Fixup busy
detection while...), never shows the issue, while the immediately
following commit, the one bisect found,
e7f3d22289e4307b3071cc18b1d8ecc6598c0be4 (mmc: mmci: Handle CMD irq
before DATA irq), always does.

The immensely frustrating part is that while backing that single change
out at its own commit always makes the issue go away, reverting it on
top of v3.15 doesn't. The issue persists. Since it doesn't revert
cleanly, I also reverted a following patch that it interacted with,
8d94b54d99ea968a9d188ca0e68793ebed601220 (mmc: mmci: Enable support for
busy detection....), to make sure I didn't miss some dependency, and the
issue *still* crops up. In fact, even backing out everything in git diff
bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3..v3.15 drivers/mmc/ doesn't
seem to resolve the issue.

So I'm really at a bit of a loss as to what to do next. While it seems
that the "mmci: Handle CMD irq before DATA..." commit is problematic,
there also seems to be some other commit in v3.15 which results in the
same problematic behavior.  I may try to bisect again between the
first bad commit and v3.15, reverting the bad commit each time to see
if I can chase it down, but if anyone has better debugging tools here,
I'd greatly appreciate hearing about them.
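
The second bisection described above (bisect between the first bad
commit and v3.15, backing the bad commit out at every step) could
probably be automated with git bisect run. What follows is only a rough
sketch under assumptions, not something that was actually run here: the
function name bisect_step, the hand-made revert patch and the test
command are all hypothetical. Because the corruption does not show up on
every boot, the test command would have to boot several times before
reporting a revision as good.

```shell
#!/bin/sh
# Hypothetical helper for `git bisect run`; not part of the original report.
# Usage: bisect_step <revert.patch> <test command...>
#   revert.patch : hand-made patch backing out the suspect change (assumed to exist)
#   test command : builds and boots the kernel, exits non-zero if corruption is seen (assumed)
bisect_step() {
    patch=$1
    shift
    # If the revert patch does not apply at this revision, ask bisect to skip it.
    git apply --check "$patch" 2>/dev/null || return 125
    git apply "$patch" || return 125
    status=0
    "$@" || status=$?
    # Drop the temporary revert so bisect can check out the next revision.
    git reset --hard -q
    # Collapse any failure to 1: bisect reads 125 as "skip" and >127 as "abort".
    [ "$status" -eq 0 ] || return 1
    return 0
}

# Possible invocation (file names are placeholders):
#   git bisect start v3.15 e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
#   git bisect run sh -c '. ./bisect_step.sh; bisect_step /tmp/revert.patch ./boot-test.sh'
```

Revisions where the patch no longer applies are skipped rather than
marked bad, so a single stale patch should not send the bisection astray.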

Again, I'm happy to help interested folks get this reproducing on
their own machine for debugging.

thanks
-john