linux-kernel - Re: [GIT PULL] gfs2 fix


Message-ID: <CAHk-=wg4ypnZUA5BOHAF1miKvOhW2yQSruuBKNXMDR=dTmp+ww@mail.gmail.com>
Date:   Thu, 28 Apr 2022 10:09:26 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Andreas Gruenbacher <agruenba@...hat.com>
Cc:     cluster-devel <cluster-devel@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] gfs2 fix

On Thu, Apr 28, 2022 at 6:27 AM Andreas Gruenbacher <agruenba@...hat.com> wrote:
>
> The data corruption we've been getting unfortunately didn't have to do
> with lock contention (we already knew that); it still occurs. I'm
> running out of ideas on what to try there.

Hmm.

I don't see the bug, but I do have a suggestion on something to try.

In particular, you said the problem started with commit 00bfe02f4796
("gfs2: Fix mmap + page fault deadlocks for buffered I/O").

To me, there are two main things going on in that commit:

 (a) the obvious "calling generic IO functions with pagefault disabled" thing

 (b) the "allow demotion" thing

And I wonder if you could at least pinpoint which of the two cases it
is that triggers it.

So I'd love to see you try three things:

 (1) just remove the "allow demotion" cases.

     This will re-introduce the deadlock the commit is trying to fix,
     but that's such a special case that I assume you can run your
     test-suite that shows the problem even without that fix in place?

     This would just pinpoint whether it's due to some odd locking issue or not.

Honestly, from how you describe the symptoms, I don't think (1) is the
cause, but I think making sure is good.

It sounds much more likely that it's one of those generic vfs
functions that screws up when a page fault happens and it gets a
partial result instead of handling the fault.

Which gets us to

 (2) remove the pagefault_disable/enable() around just the
generic_file_read_iter() case in gfs2_file_read_iter().

and

 (3) finally, remove the pagefault_disable/enable() around the
iomap_file_buffered_write() case in gfs2_file_buffered_write()

Yeah, yeah, you say it's just the read that fails, but humor me on
(3), just in case it's an earlier write in your test-suite and the
read just then uncovered it.

But I put it as (3) so that you'd do the obvious (2) case first, and
narrow it down (ie if (1) still shows the bug, then do (2), and if
that fixes the bug it will be fairly well pinpointed to
generic_file_read_iter()).
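
The narrowing-down order above can be read as a small decision procedure. The sketch below is only an illustration of that reading, not anything from the gfs2 tree: the `struct results` fields, the `enum suspect` values and `narrow_down()` are all hypothetical names, and each field just records whether the corruption still reproduces with the corresponding change applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for where the bug would be pinned down. */
enum suspect {
    SUSPECT_LOCKING,     /* the "allow demotion" changes */
    SUSPECT_READ_PATH,   /* generic_file_read_iter() with faults disabled */
    SUSPECT_WRITE_PATH,  /* iomap_file_buffered_write() with faults disabled */
    SUSPECT_UNKNOWN      /* none of the three experiments explains it */
};

/* Hypothetical record: does the corruption still show up with change N? */
struct results {
    bool bug_with_1;  /* (1) "allow demotion" cases removed */
    bool bug_with_2;  /* (2) pagefault_disable/enable() removed around the read */
    bool bug_with_3;  /* (3) pagefault_disable/enable() removed around the write */
};

/* Walk the experiments in the suggested order: (1), then (2), then (3). */
static enum suspect narrow_down(const struct results *r)
{
    assert(r);
    if (!r->bug_with_1)
        return SUSPECT_LOCKING;
    if (!r->bug_with_2)
        return SUSPECT_READ_PATH;
    if (!r->bug_with_3)
        return SUSPECT_WRITE_PATH;
    return SUSPECT_UNKNOWN;
}
```

The expected outcome from the reasoning above would be the second branch, i.e. (1) still shows the bug and (2) makes it go away.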

Looking around, gfs2 is the only thing that obviously calls
generic_file_read_iter() with pagefaults disabled, so it does smell
like filemap_read() might have some issue, but the only thing that
does is basically that

                copied = copy_folio_to_iter(folio, offset, bytes, iter);

which should just become copy_page_to_iter_iovec(), which you'd hope
would get things right.
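
To make the "partial result instead of handling the fault" concern concrete, here is a rough user-space model of it. This is not kernel code and not how copy_page_to_iter_iovec() is implemented: `struct model_dest`, `copy_nofault()`, `fault_in()` and `copy_with_retry()` are made-up names, and "resident" simply stands in for the part of the user buffer that can be written without taking a fault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a user buffer: only the first `resident` bytes
 * can be written without "faulting". Requires resident <= len. */
struct model_dest {
    char  *buf;
    size_t len;
    size_t resident;
};

/* Models a copy with page faults disabled: copies as much as fits in the
 * resident part and returns the byte count, which may be short. */
static size_t copy_nofault(struct model_dest *d, size_t off,
                           const char *src, size_t n)
{
    assert(d && src);
    if (off >= d->resident)
        return 0;
    size_t room = d->resident - off;
    size_t todo = n < room ? n : room;
    memcpy(d->buf + off, src, todo);
    return todo;
}

/* Models faulting pages in (with faults enabled again): makes the buffer
 * resident up to `upto`, capped at its length. */
static void fault_in(struct model_dest *d, size_t upto)
{
    d->resident = upto < d->len ? upto : d->len;
}

/* What a correct caller has to do: on a short copy, fault the rest in and
 * resume from exactly the byte where the copy stopped. */
static size_t copy_with_retry(struct model_dest *d, const char *src, size_t n)
{
    size_t done = 0;
    while (done < n) {
        done += copy_nofault(d, done, src + done, n - done);
        if (done == n)
            break;
        size_t before = d->resident;
        fault_in(d, n);
        if (d->resident == before)
            break;  /* no progress possible, give up with a short count */
    }
    return done;
}
```

The point of the model is the bookkeeping: if either the copy or its caller miscounts the bytes of a short copy, the retry resumes at the wrong offset and the reader sees wrong data, which is the kind of symptom being chased here.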

But it would be good to just narrow things down a bit.

I'll look at that copy_page_to_iter_iovec() some more regardless, but
doing that "let's double-check it's not something else" would be good.

             Linus