linux-ext4 - Re: Phantom full ext4 root filesystems on 4.1 through 4.14 kernels

Message-ID: <20190711152328.GB2449@quack2.suse.cz>
Date:   Thu, 11 Jul 2019 17:23:28 +0200
From:   Jan Kara <jack@...e.cz>
To:     Geoffrey Thomas <Geoffrey.Thomas@...sigma.com>
Cc:     'Jan Kara' <jack@...e.cz>,
        Thomas Walker <Thomas.Walker@...sigma.com>,
        "'linux-ext4@...r.kernel.org'" <linux-ext4@...r.kernel.org>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        "'tytso@....edu'" <tytso@....edu>
Subject: Re: Phantom full ext4 root filesystems on 4.1 through 4.14 kernels

On Thu 11-07-19 14:40:43, Geoffrey Thomas wrote:
> On Thursday, July 11, 2019 5:23 AM, Jan Kara <jack@...e.cz> wrote: 
> > On Wed 26-06-19 11:17:54, Thomas Walker wrote:
> > > Sorry to revive a rather old thread, but Elana mentioned that there might
> > > have been a related fix recently?  Possibly something to do with
> > > truncate?  A quick scan of the last month or so turned up
> > > https://www.spinics.net/lists/linux-ext4/msg65772.html but none of these
> > > seemed obviously applicable to me.  We do still experience this phantom
> > > space usage quite frequently (although the remount workaround below has
> > > lowered the priority).
> > 
> > I don't recall any fix for this. But seeing that remount "fixes" the issue
> > for you, can you try whether one of the following has a similar effect?
> > 
> > 1) Try "sync"
> > 2) Try "fsfreeze -f / && fsfreeze -u /"
> > 3) Try "echo 3 >/proc/sys/vm/drop_caches"
> > 
> > Also, what are the contents of
> > /sys/fs/ext4/<problematic-device>/delayed_allocation_blocks
> > when the issue happens?
> 
> We just had one of these today, and no luck from any of those.
> delayed_allocation_blocks is 1:

...
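
For anyone repeating the checks from the quoted message, a minimal sketch of
a helper that prints the df line and the delayed_allocation_blocks counter,
so the numbers can be compared before and after each of "sync", the
fsfreeze pair and drop_caches. The function name report_space, the device
name argument and the overridable sysfs root are assumptions made for
illustration; they are not from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): show free space and the
# delayed-allocation counter for one mounted ext4 filesystem.
#   $1 = mount point, e.g. /
#   $2 = device name as it appears under /sys/fs/ext4 (e.g. sda1; assumed)
#   $3 = sysfs root, overridable for testing (defaults to /sys/fs/ext4)
report_space() {
    mnt=$1
    dev=$2
    sysfs=${3:-/sys/fs/ext4}

    # Last line of POSIX-format df output is the entry for the mount point.
    printf 'df: '
    df -P "$mnt" | tail -n 1

    f="$sysfs/$dev/delayed_allocation_blocks"
    if [ -r "$f" ]; then
        printf 'delayed_allocation_blocks: %s\n' "$(cat "$f")"
    else
        printf 'delayed_allocation_blocks: unavailable\n'
    fi
}
```

It could be called as "report_space / sda1" once before and once after each
attempted workaround; sda1 is a placeholder for the real device name.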

This is very strange, because a failed read-only remount (with EBUSY) doesn't
really do more than what "sync; echo 3 >/proc/sys/vm/drop_caches" does. I
suspect there's really some userspace process taking up space and cleaning up
on umount. Anyway, once this happens again, can you do:

fsfreeze -f /
e2image -r /dev/disk/by-uuid/523c8243-5a25-40eb-8627-f3bbf98ec299 - | \
  xz >some_storage.xz
fsfreeze -u /

some_storage.xz can be on a USB stick or similar. The commands dump the ext4
metadata to that file. Then please provide some_storage.xz for download
somewhere. Thanks! BTW I'll be on vacation for the next two weeks, so it will
take a while to get to this...
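
A possible way to wrap the freeze / dump / unfreeze sequence above so that
the filesystem is thawed again even if the dump fails or is interrupted.
This is only a sketch: the function names run and dump_ext4_metadata and
the DRY_RUN variable are assumptions added for illustration. The output
file has to live on a different filesystem than the frozen one, since
writes to a frozen filesystem block.

```shell
#!/bin/sh
# Sketch (assumed helper names): freeze, dump ext4 metadata, always thaw.
# With DRY_RUN=1 the commands are only printed, never executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

#   $1 = mount point to freeze
#   $2 = block device backing it
#   $3 = output file, on a DIFFERENT filesystem than $1
dump_ext4_metadata() {
    mnt=$1
    dev=$2
    out=$3
    status=0

    run fsfreeze -f "$mnt" || return 1
    # Thaw on Ctrl-C or termination so the filesystem does not stay frozen.
    trap 'run fsfreeze -u "$mnt"; exit 130' INT TERM

    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "e2image -r $dev - | xz > $out"
    else
        # In plain sh the pipeline status is that of xz, the last command.
        e2image -r "$dev" - | xz > "$out" || status=$?
    fi

    trap - INT TERM
    run fsfreeze -u "$mnt"
    return "$status"
}
```

A rehearsal such as
"DRY_RUN=1 dump_ext4_metadata / /dev/sdX /mnt/usb/some_storage.xz" prints
the three commands without touching the system; /dev/sdX and /mnt/usb are
placeholders.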

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR