Message-ID: <865a6dad983e4dedb9836075c210a782@EXMBDFT11.ad.twosigma.com>
Date:   Thu, 25 Jul 2019 21:22:28 +0000
From:   Geoffrey Thomas <Geoffrey.Thomas@...sigma.com>
To:     'Theodore Ts'o' <tytso@....edu>,
        Thomas Walker <Thomas.Walker@...sigma.com>
CC:     'Jan Kara' <jack@...e.cz>,
        "'linux-ext4@...r.kernel.org'" <linux-ext4@...r.kernel.org>,
        "'Darrick J. Wong'" <darrick.wong@...cle.com>
Subject: RE: Phantom full ext4 root filesystems on 4.1 through 4.14 kernels

On Friday, July 12, 2019 5:47 PM, Geoffrey Thomas <Geoffrey.Thomas@...sigma.com> wrote:
> On Friday, July 12, 2019 4:28 PM, Theodore Ts'o <tytso@....edu> wrote:
> > Hmmm... what's gid 4?  Is that a hint of where the inode might have come
> > from?
> 
> Good call, gid 4 is `adm`. And now that we have an inode number we can see
> the file's contents, it's from /var/log/account.
> 
> I bet that this is acct(2) holding onto a reference in some weird way
> (possibly involving logrotate?), which also explains why we couldn't find
> a userspace process holding onto the inode. We'll investigate a bit....

To close this out - yes, this was process accounting. Debian has a nightly cronjob which rotates the pacct logs, runs `invoke-rc.d acct restart` to reopen the file, and compresses the old log. Due to a stray policy-rc.d file from an old provisioning script, however, the restart was being skipped, and so we were unlinking and compressing the pacct file while the kernel still had it open. So it was the classic problem of an open file handle to a large deleted file, except that the open file handle was being held by the kernel.
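For anyone who has not run into the deleted-but-open file problem before, here is a minimal userspace sketch of it in Python. It only illustrates the general mechanism; it does not involve acct(2), and the function name `demo_unlinked_open_file` is made up for this example. The point is that unlinking removes the name while the inode and its blocks stay allocated until the last open reference is closed.

```python
import os
import tempfile


def demo_unlinked_open_file(size=1 << 20):
    """Write `size` bytes, unlink the file while it is still open, report its state.

    Hypothetical helper for illustration only; nothing here touches acct(2).
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.unlink(path)              # the directory entry is gone ...
        st = os.fstat(f.fileno())    # ... but the open inode is still there
        return {
            "path_exists": os.path.exists(path),
            "link_count": st.st_nlink,
            "size": st.st_size,
        }
    # Leaving the `with` block closes the file; only then can the
    # filesystem release the blocks.


if __name__ == "__main__":
    print(demo_unlinked_open_file())
```

In our case the holder of the reference was the kernel rather than a process, so there was no file descriptor to close, and the space stayed allocated until accounting was turned off.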

`accton off` solved our immediate problem and freed the space. I'm not totally sure why a failed umount had the same effect, but I suppose it turned off process accounting.

It's a little frustrating to me that the file opened by acct(2) doesn't show up to userspace (lsof doesn't seem to find it) - it'd be nice if it could show up in /proc/$some_kernel_thread/fd or somewhere, if possible.
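For comparison, this is roughly the kind of check that works when a userspace process is the holder: walk /proc/<pid>/fd and look for symlink targets ending in " (deleted)". It is a sketch assuming a Linux /proc layout, with a made-up function name (`find_deleted_open_files`). It only sees processes you have permission to inspect, and, as described above, it would not have found the file opened by acct(2), since that reference is not attached to any process's fd table.

```python
import os


def find_deleted_open_files(proc="/proc"):
    """Return (pid, fd, target) for each open fd whose target ends in ' (deleted)'.

    Hypothetical helper; assumes the Linux /proc/<pid>/fd symlink convention.
    """
    hits = []
    for pid in filter(str.isdigit, os.listdir(proc)):
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # Process exited, or we lack permission to look at it.
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                # The fd was closed between listdir() and readlink().
                continue
            if target.endswith(" (deleted)"):
                hits.append((int(pid), int(fd), target))
    return hits


if __name__ == "__main__":
    for pid, fd, target in find_deleted_open_files():
        print(pid, fd, target)
```

Run as root to cover all processes. An empty result alongside a full filesystem is a hint that something other than a userspace fd is pinning the inode.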

Thanks for the help - the e2image + fsck trick is great!

-- 
Geoffrey Thomas
geofft@...sigma.com
