linux-kernel - Re: [GIT PULL] gfs2 fix

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHc6FU4JeMHUrJbbTwEsMiPPyinQpX9fW-hz21GdjgVsvYRZkw@mail.gmail.com>
Date:   Tue, 3 May 2022 18:41:17 +0200
From:   Andreas Gruenbacher <agruenba@...hat.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Christoph Hellwig <hch@...radead.org>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Dave Chinner <dchinner@...hat.com>,
        cluster-devel <cluster-devel@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] gfs2 fix

On Tue, May 3, 2022 at 6:19 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Tue, May 3, 2022 at 1:56 AM Andreas Gruenbacher <agruenba@...hat.com> wrote:
> >
> > We still get data corruption with the patch applied. The
> > WARN_ON_ONCE(!bytes) doesn't trigger.
>
> Oh well. I was so sure that I'd finally found something.. That partial
> write case has had bugs before.
>
> > As an additional experiment, I've added code to check the iterator
> > position that iomap_file_buffered_write() returns, and it's all
> > looking good as well: an iov_iter_advance(orig_from, written) from the
> > original position always gets us to the same iterator.
>
> Yeah, I've looked at the iterator parts (and iov_iter_revert() in
> particular) multiple times, because that too is an area where we've
> had bugs before.
>
> That too may be easy to get wrong, but I couldn't for the life of me
> see any issues there.
>
> > This points at gfs2 getting things wrong after a short write, for
> > example, marking a page / folio uptodate that isn't. But the uptodate
> > handling happens at the iomap layer, so this doesn't leave me with an
> > immediate suspect.
>
> Yeah, the uptodate setting looked safe, particularly with that "if we
> copied less than we thought we would, and it wasn't uptodate, just
> claim we didn't do anything at all".
>
> That said, I now have a *new* suspect: the 'iter->pos' handling in
> iomap_write_iter().
>
> In particular, let's look at iomap_file_buffered_write(), which does:
>
>         while ((ret = iomap_iter(&iter, ops)) > 0)
>                 iter.processed = iomap_write_iter(&iter, i);
>
> and then look at what happens to iter.pos here.
>
> iomap_write_iter() does this:
>
>         loff_t pos = iter->pos;
>         ...
>                 pos += status;
>
> but it never seems to write the updated position back to the iterator.
>
> So what happens next time iomap_write_iter() gets called?
>
> This looks like such a huge bug that I'm probably missing something,
> but I wonder if this is normally hidden by the fact that usually
> iomap_write_iter() consumes the whole 'iter', so despite the 'while()'
> loop, it's actually effectively only called once.
>
> Except if it gets a short write due to an unhandled page fault..
>
> Am I entirely blind, and that 'iter.pos' is updated somewhere and I
> just missed it?

That's happening in iomap_file_buffered_write() and iomap_iter():

        while ((ret = iomap_iter(&iter, ops)) > 0)
                iter.processed = iomap_write_iter(&iter, i);

Here, iomap_write_iter() returns how much progress it has made, which
is stored in iter.processed, and iomap_iter() -> iomap_iter_advance()
then updates iter.pos and iter.len based on iter.processed.

Andreas