linux-kernel - Re: [PATCH 3/3] overlayfs: Report writeback errors on upper

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201223202140.GB11012@ircssh-2.c.rugged-nimbus-611.internal>
Date:   Wed, 23 Dec 2020 20:21:41 +0000
From:   Sargun Dhillon <sargun@...gun.me>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Vivek Goyal <vgoyal@...hat.com>, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-unionfs@...r.kernel.org,
        jlayton@...nel.org, amir73il@...il.com, miklos@...redi.hu,
        jack@...e.cz, neilb@...e.com, viro@...iv.linux.org.uk, hch@....de
Subject: Re: [PATCH 3/3] overlayfs: Report writeback errors on upper

On Wed, Dec 23, 2020 at 08:07:46PM +0000, Matthew Wilcox wrote:
> On Wed, Dec 23, 2020 at 07:29:41PM +0000, Sargun Dhillon wrote:
> > On Wed, Dec 23, 2020 at 06:50:44PM +0000, Matthew Wilcox wrote:
> > > On Wed, Dec 23, 2020 at 06:20:27PM +0000, Sargun Dhillon wrote:
> > > > I fail to see why this is neccessary if you incorporate error reporting into the 
> > > > sync_fs callback. Why is this separate from that callback? If you pickup Jeff's
> > > > patch that adds the 2nd flag to errseq for "observed", you should be able to
> > > > stash the first errseq seen in the ovl_fs struct, and do the check-and-return
> > > > in there instead instead of adding this new infrastructure.
> > > 
> > > You still haven't explained why you want to add the "observed" flag.
> > 
> > 
> > In the overlayfs model, many users may be using the same filesystem (super block)
> > for their upperdir. Let's say you have something like this:
> > 
> > /workdir [Mounted FS]
> > /workdir/upperdir1 [overlayfs upperdir]
> > /workdir/upperdir2 [overlayfs upperdir]
> > /workdir/userscratchspace
> > 
> > The user needs to be able to do something like:
> > sync -f ${overlayfs1}/file
> > 
> > which in turn will call sync on the the underlying filesystem (the one mounted 
> > on /workdir), and can check if the errseq has changed since the overlayfs was
> > mounted, and use that to return an error to the user.
> 
> OK, but I don't see why the current scheme doesn't work for this.  If
> (each instance of) overlayfs samples the errseq at mount time and then
> check_and_advances it at sync time, it will see any error that has occurred
> since the mount happened (and possibly also an error which occurred before
> the mount happened, but hadn't been reported to anybody before).
> 

If there is an outstanding error at mount time, and the SEEN flag is unset, 
subsequent errors will not increment the counter, until the user calls sync on
the upperdir's filesystem. If overlayfs calls check_and_advance on the upperdir's
super block at any point, it will then set the seen block, and if the user calls
syncfs on the upperdir, it will not return that there is an outstanding error,
since overlayfs just cleared it.


> > If we do not advance the errseq on the upperdir to "mark it as seen", that means 
> > future errors will not be reported if the user calls sync -f ${overlayfs1}/file,
> > because errseq will not increment the value if the seen bit is unset.
> > 
> > On the other hand, if we mark it as seen, then if the user calls sync on 
> > /workdir/userscratchspace/file, they wont see the error since we just set the 
> > SEEN flag.
> 
> While we set the SEEN flag, if the file were opened before the error
> occurred, we would still report the error because the sequence is higher
> than it was when we sampled the error.
> 

Right, this isn't a problem for people calling f(data)sync on a particular file, 
because it takes its own snapshot of errseq. This is only problematic for folks 
calling syncfs. In Jeff's other messages, it sounded like this behaviour is
pretty important, and the likes of postgresql depend on it.