Message-ID: <20171126211427.GO4094@dastard>
Date:   Mon, 27 Nov 2017 08:14:27 +1100
From:   Dave Chinner <david@...morbit.com>
To:     Theodore Ts'o <tytso@....edu>, Andreas Dilger <adilger@...ger.ca>,
        Andi Kleen <andi@...stfloor.org>,
        Tahsin Erdogan <tahsin@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: regression: 4.13 cannot follow symlinks on some ext3 fs

On Sun, Nov 26, 2017 at 10:40:26AM -0500, Theodore Ts'o wrote:
> On Sun, Nov 26, 2017 at 09:32:02AM +1100, Dave Chinner wrote:
> > 
> > They don't have any whacky symlinks around, but the modern ext4 code
> > does try to eat these filesystems every so often. Extended operation
> > at ENOSPC will eventually corrupt the rootfs and crash the kernel,
> > and then I play the "e2fsck doesn't detect corruption, kernel does"
> > game to get them fixed up and working again....
> 
> If you have stack dumps or file system images which e2fsck doesn't
> detect any problems but the kernels do, please do feel free to send
> reports to the ext4 mailing list.

Of course. I've done that every time I've come across these sorts of
problems.

> > I'm running with everything up to date (debian unstable) on these
> > VMs, they are just an old filesystem because some distros have had
> > reliable rolling updates for the entire life of these VMs. :P
> 
> Or if you can make the VM's available and tell me how you are
> using/exercising them, I can try to see if I can repro the problem.

No, I can't make them available. As for how I use them, they are
my test/devel VMs, so they are getting multiple kernels thrown at
them every day, and I'll just kill the VM via the qemu console (they
*never* get shut down cleanly) when I need to install a new kernel.
Often they won't shut down anyway, because I've
oopsed/deadlocked/etc something on a different filesystem...

> I am wondering how you are running into ENOSPC on the root file
> systems; I take this is much more than running xfstests?

No, it isn't.  Just have a scratch filesystem failure during
xfstests such that mount fails during a "fill to enospc" test and it
will fill the root filesystem rather than the test/scratch device.
Or run a buggy test that dumps everything in $here. Or fill /tmp
without noticing it.  Then let fstests continue to run trying to
write state and logs for the next 500 tests...

> Are you
> running some benchmarks that are logging into the root, and that's
> triggering the ENOSPC condition?

No, I'm not doing anything like that on these machines. It's
straightforward "something filled the root fs unexpectedly" type of
error which I don't notice immediately...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com