Message-Id: <200911021705.nA2H5kHJ022851@demeter.kernel.org>
Date:	Mon, 2 Nov 2009 17:05:46 GMT
From:	bugzilla-daemon@...zilla.kernel.org
To:	linux-ext4@...r.kernel.org
Subject: [Bug 14354] Bad corruption with 2.6.32-rc1 and upwards

http://bugzilla.kernel.org/show_bug.cgi?id=14354

--- Comment #167 from Eric Sandeen <sandeen@...hat.com>  2009-11-02 17:05:38 ---
My overnight test ran successfully through more than 100 iterations on a
tree checked out just prior to d0646f7b636d067d715fab52a2ba9c6f0f46b0d7.

This morning I ran that same tree with journal checksums enabled via mount
option. The checksumming code reported journal corruption, and immediately
after that we saw the filesystem corruption.  So it is having the checksum
feature enabled that is breaking this for us.

Linus, I would recommend reverting d0646f7b636d067d715fab52a2ba9c6f0f46b0d7 for
now, at this late stage in the game, and those present on the ext4 call this
morning agreed.

A few things seem to have gone wrong.  For one, we should have at least issued
a printk when we found a bad journal checksum, but we silently continued on
thanks to an RDONLY check (and the root fs is mounted read-only...)

My hand-wavy hunch about what is happening is that we're finding a bad checksum
on the last partially-written transaction, which is not surprising.  But if we
have a wrapped log, we're doing the initial scan for head/tail, and we abort
scanning on that bad checksum, then we are essentially running an unrecovered
filesystem.

But that's hand-wavy and I need to go look at the code.
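
To make the hunch above concrete, here is a toy model in Python.  It is only
a sketch of the suspected failure mode, not the jbd2 recovery code: the names
Txn, scan_log, recover_abort_on_bad_checksum and recover_stop_at_bad_checksum
are hypothetical, and the one-slot-per-transaction log layout is an assumption
made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Txn:
    seq: int            # transaction sequence number
    checksum_ok: bool   # False models a partially written final transaction


def scan_log(log: List[Optional[Txn]], tail: int, first_seq: int) -> Iterator[Txn]:
    """Walk the circular log from the tail while sequence numbers stay consecutive."""
    expected = first_seq
    for i in range(len(log)):
        txn = log[(tail + i) % len(log)]  # index wraps around the end of the log
        if txn is None or txn.seq != expected:
            return
        yield txn
        expected += 1


def recover_abort_on_bad_checksum(log, tail, first_seq) -> List[int]:
    """Suspected behaviour: a bad checksum aborts the scan and nothing is replayed."""
    replayed = []
    for txn in scan_log(log, tail, first_seq):
        if not txn.checksum_ok:
            return []  # whole recovery abandoned: filesystem runs unrecovered
        replayed.append(txn.seq)
    return replayed


def recover_stop_at_bad_checksum(log, tail, first_seq) -> List[int]:
    """Alternative: treat the bad transaction as end of log, replay what precedes it."""
    replayed = []
    for txn in scan_log(log, tail, first_seq):
        if not txn.checksum_ok:
            break
        replayed.append(txn.seq)
    return replayed


# A wrapped log: the oldest transaction (seq 10) sits in slot 2, and the newest
# (seq 13, partially written) has wrapped around into slot 1.
wrapped_log = [Txn(12, True), Txn(13, False), Txn(10, True), Txn(11, True)]

aborted = recover_abort_on_bad_checksum(wrapped_log, tail=2, first_seq=10)
stopped = recover_stop_at_bad_checksum(wrapped_log, tail=2, first_seq=10)
```

In this model the first strategy replays nothing, even though transactions
10 through 12 are intact, while the second replays those three and discards
only the incomplete one.  Whether the real code behaves like the first
strategy is exactly what still needs to be checked.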

We lived without journal checksums on by default until now, and at this point
they're doing more harm than good, so we should revert the default-changing
commit until we can fix it and do some good power-fail testing with the fixes
in place.

I'll revert that patch and do another overnight test on an up-to-date tree to
be sure nothing else snuck in, but this looks to me like the culprit, and I'm
comfortable recommending that the commit be reverted for now.

Thanks,
-Eric
