[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whrt9ofcyonPEbgPOaCG+15mDdz+O9bb0RKrJVTt7vR4w@mail.gmail.com>
Date: Mon, 2 May 2022 11:58:31 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andreas Gruenbacher <agruenba@...hat.com>,
Christoph Hellwig <hch@...radead.org>,
"Darrick J. Wong" <djwong@...nel.org>,
Dave Chinner <dchinner@...hat.com>
Cc: cluster-devel <cluster-devel@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] gfs2 fix
On Mon, May 2, 2022 at 11:31 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> NOTE! This patch is entirely untested. I also didn't actually yet go
> look at what gfs2 does when 'bytes' and 'copied' are different.
Oh, it's a lot of generic iomap_write_end() code, so I guess it's just
as well that I brought in the iomap people.
And the iomap code has many different cases. Some of them do
if (unlikely(copied < len && !folio_test_uptodate(folio)))
return 0;
to force the whole IO to be re-done (looks sane - that's the "the
underlying folio wasn't uptodate, because we expected the write to
make it so").
And that might not have happened before, but it looks like gfs2 does
actually try to deal with that case.
But since Andreas said originally that the IO wasn't aligned, I don't
think that "not uptodate" case is what is going on, and it's more
about some "partial write in the middle of a buffer succeeded"
And the code also has things like
if (ret < len)
iomap_write_failed(iter->inode, pos, len);
which looks very wrong - it's not that the write failed, it's just
incomplete because it was done with page faults disabled. It seems to
try to do some page cache truncation based on the original 'len', but
not taking the successful part into account. Which all sounds
horrifically wrong.
But I don't know the code well enough to really judge. It just makes
me uncomfortable, and I do suspect this code may be quite buggy if the
copy of the full 'len' doesn't succeed.
Again, the patch I sent only _hides_ any issues and makes them
practically impossible to see. It doesn't really _fix_ anything, since
- as mentioned - regardless of fault_in_iov_iter_readable()
succeeding, racing with page-out could then cause the later
copy_page_from_iter_atomic() to have a partial copy anyway.
And hey, maybe there's something entirely different going on, and my
"Heureka! It might be explained by that partial write_end that
generally didn't happen before" is only my shouting at windmills.
Linus
Powered by blists - more mailing lists