linux-ext4 - Re: fsync() errors is unsafe and risks data loss

Message-ID: <20180412212144.GV2801@thunk.org>
Date:   Thu, 12 Apr 2018 17:21:44 -0400
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Andres Freund <andres@...razel.de>,
        Dave Chinner <david@...morbit.com>,
        Jeff Layton <jlayton@...hat.com>,
        Andreas Dilger <adilger@...ger.ca>,
        20180410184356.GD3563@...nk.org,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        Linux FS Devel <linux-fsdevel@...r.kernel.org>,
        "Joshua D. Drake" <jd@...mandprompt.com>
Subject: Re: fsync() errors is unsafe and risks data loss

On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:
> On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
> > I think a per-file or even per-blockdev/fs error state that'd be
> > returned by fsync() would be more than sufficient.
> 
> Ah; this was my suggestion to Jeff on IRC.  That we add a per-superblock
> wb_err and then allow syncfs() to return it.  So you'd open an fd on
> a directory (for example), and call syncfs() which would return -EIO
> or -ENOSPC if either of those conditions had occurred since you opened
> the fd.

When or how would the per-superblock wb_err flag get cleared?

Would all subsequent fsync() calls on that file system now return EIO?
Or would only all subsequent syncfs() calls return EIO?
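
For concreteness, the userspace side of the quoted suggestion might look
roughly like the sketch below. syncfs(2) is a real Linux call; the helper
name sync_fs_containing() is made up for illustration. Whether syncfs()
actually reports -EIO or -ENOSPC from an earlier writeback failure is
exactly what is being proposed here, so treat that behavior as an
assumption that depends on the kernel, not as something this code
guarantees.

```c
#define _GNU_SOURCE             /* syncfs() is a GNU/Linux extension */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a directory on the filesystem of interest,
 * ask the kernel to flush that filesystem, and return 0 or a negative
 * errno.  Under the proposal, a writeback error that happened after the
 * open() would show up here as -EIO or -ENOSPC.
 */
int sync_fs_containing(const char *dir)
{
    int fd = open(dir, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -errno;

    /* Capture the result before close() can overwrite errno. */
    int ret = (syncfs(fd) == 0) ? 0 : -errno;

    close(fd);
    return ret;
}
```

A caller such as a database checkpointer would invoke something like
sync_fs_containing("/path/to/datadir") and treat any nonzero return as
"some write on this filesystem may have been lost".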

> >  I don't see that
> > that'd realistically would trigger OOM or the inability to unmount a
> > filesystem.
> 
> Ted's referring to the current state of affairs where the writeback error
> is held in the inode; if we can't evict the inode because it's holding
> the error indicator, that can send us OOM.  If instead we transfer the
> error indicator to the superblock, then there's no problem.

Actually, I was referring to the original pg-hackers ask, which was
that after an error, all of the dirty pages that couldn't be written
out would stay dirty.

If it's only a single inode that is pinned in memory with the dirty
flag, that's bad, but it's not as bad as pinning all of the memory
pages for which there was a failed write.  We would still need to
invent some mechanism or define some semantic when it would be OK to
clear the per-inode flag and let the memory associated with that
pinned inode get released, though.
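
One candidate semantic for the per-superblock case raised above, given
purely as a userspace toy model: never clear the error at all, and have
each open fd remember how many errors it has already been told about.
Every name below (sb_model, fd_model, model_syncfs and so on) is
hypothetical, and this is loosely in the spirit of the kernel's errseq_t
rather than a description of what any kernel does.

```c
#include <errno.h>

/* Toy model only: not kernel code, all names are made up. */
struct sb_model {
    unsigned long err_seq;   /* bumped once per recorded writeback error */
    int last_err;            /* most recent negative errno, 0 if none */
};

struct fd_model {
    struct sb_model *sb;
    unsigned long seen_seq;  /* sb->err_seq as of open or last report */
};

/* Record a writeback failure against the (model) superblock. */
static void model_record_error(struct sb_model *sb, int err)
{
    sb->last_err = err;
    sb->err_seq++;
}

/* "Open an fd": sample the counter so older errors are not reported. */
static struct fd_model model_open(struct sb_model *sb)
{
    struct fd_model fd = { sb, sb->err_seq };
    return fd;
}

/*
 * "syncfs()": report an error only if one was recorded since this fd
 * last looked, then advance this fd's cursor.  Nothing global is cleared.
 */
static int model_syncfs(struct fd_model *fd)
{
    if (fd->seen_seq == fd->sb->err_seq)
        return 0;
    fd->seen_seq = fd->sb->err_seq;
    return fd->sb->last_err;
}
```

In this model an fd opened before the failure sees -EIO exactly once, an
fd opened afterwards sees nothing, and no explicit "clear" operation is
needed. It says nothing about the harder question of when pinned memory
can be released.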

						- Ted