Message-Id: <91CCAB14-F9CC-4676-94C3-FBCDD0663FD5@mit.edu>
Date:	Fri, 25 Mar 2011 07:59:00 -0400
From:	Theodore Tso <tytso@....EDU>
To:	Dave Chinner <david@...morbit.com>
Cc:	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	Jens Axboe <jaxboe@...ionio.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...cle.com>
Subject: Re: [GIT PULL] Core block IO bits for 2.6.39 - early Oops


On Mar 25, 2011, at 12:41 AM, Dave Chinner wrote:

>> 
>> It works insofar as the Oops is gone. But my xfs partitions apparently
>> still get corrupted (I had to run xfs_repair on several of them, because
>> they would not mount otherwise).
> 
> So the patchset is causing repeatable filesystem corruption? Sounds
> to me like this series is not yet ready for mainline merging. Last
> thing I want to spend the .39 cycle helping people recover busted
> filesystems as a result of undercooked block layer changes...

FYI.   I did a trial merge of the ext4 changes last night with
the tip of Linus's tree.   The ext4 changes (based on 2.6.38-rc5) 
survived xfstests -g auto before I merged in Linus's 2.6.39 master
branch.  After I merged with 2.6.39-tip, I reran xfstests, and it got 
past test #13 (fsstress), which normally means that everything is
OK, so I sent a pull request to Linus.    Much later (-g auto takes a
long time), I got an oops inside the virtio driver.   Ext4 was nowhere
in the stack trace, but of course the block layer was.   Grumbling
that someone had broken virtio during the merge window, I switched
my KVM setup to use SATA emulation and used the sda devices
instead.  This time I got an oops in the block I/O layer, again quite
late in xfstests.  Somewhere around test #224 or so if I remember
correctly.

It was too late last night to do any more investigating, which is why
I hadn't sent a formal report yet, but next up is for me to retry xfstests
before merging in my changes, and then to start a git bisect.

So before accusing some patch series which hasn't been merged
into 2.6.39 yet, you might want to also worry about some change
that already has been merged.   Of course the symptoms for me are
quite different.   I'm not seeing an early oops, but only something
which shows up when the system is put under a lot of stress
by xfstests.  So it could be a different problem....

								- Ted

P.S.  And of course there is the chance that there is some
subtle bug in the ext4 branch, which worked just fine when
it was just based on 2.6.38-rc5, but which only manifested
itself when I merged in the tip of Linus's branch.   So I'm not
__accusing__ the block layer yet, even though the stack traces
seem to point that way, because I don't have a smoking gun
yet.   But I do have to admit I'm suspicious....


