linux-kernel - Re: [PATCH] gfs2: Fix mmap + page fault deadlocks

Message-ID: <CAHk-=wjk6KP3vSLFNPMjoaZ4xY4u=DjTM+C+hCS3QKt+XAE6OA@mail.gmail.com>
Date:   Thu, 1 Jul 2021 14:41:15 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Andreas Gruenbacher <agruenba@...hat.com>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        cluster-devel <cluster-devel@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Jan Kara <jack@...e.cz>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] gfs2: Fix mmap + page fault deadlocks

On Thu, Jul 1, 2021 at 1:43 PM Andreas Gruenbacher <agruenba@...hat.com> wrote:
>
> here's another attempt at fixing the mmap + page fault deadlocks we're
> seeing on gfs2.  Still not ideal because get_user_pages_fast ignores the
> current->pagefault_disabled flag

Of course get_user_pages_fast() ignores the pagefault_disabled flag,
because it doesn't do any page faults.

If you don't want to fall back to the "maybe do IO" case, you should
use the FOLL_FAST_ONLY flag - or get_user_pages_fast_only(), which
does that itself.
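
To make the "fast only" behaviour concrete, here is a minimal userspace toy model. It is not kernel code, and every name in it (toy_mm, toy_lock, toy_pin_fast_only, toy_fault_in, toy_pin_with_retry) is hypothetical. It assumes a fast-only lookup pins only pages that are already resident and returns a short count instead of faulting, so the caller can drop its lock, fault the missing page in, and retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#define TOY_NR_PAGES 8

struct toy_mm {
    bool resident[TOY_NR_PAGES]; /* is the page already mapped? */
    int faults;                  /* slow-path faults taken so far */
};

struct toy_lock {
    bool held;                   /* stands in for a filesystem lock */
};

/*
 * Models a "fast only" lookup: pin consecutive resident pages starting
 * at 'first', stop at the first non-resident page, and never fault.
 * Returns the number of pages pinned, which may be less than 'nr'.
 */
static int toy_pin_fast_only(const struct toy_mm *mm, int first, int nr,
                             int *pinned)
{
    int n = 0;

    while (n < nr && first + n < TOY_NR_PAGES && mm->resident[first + n]) {
        pinned[n] = first + n;
        n++;
    }
    return n;
}

/* Models the slow path that may do IO; must not run with the lock held. */
static void toy_fault_in(struct toy_mm *mm, const struct toy_lock *lock,
                         int page)
{
    assert(!lock->held);
    assert(page >= 0 && page < TOY_NR_PAGES);
    if (!mm->resident[page]) {
        mm->resident[page] = true;
        mm->faults++;
    }
}

/*
 * Caller pattern: try the fast-only lookup under the lock; on a short
 * count, drop the lock, fault in the missing page, and retry.
 * Returns 'nr' with the lock held, or -1 on a bad range.
 */
static int toy_pin_with_retry(struct toy_mm *mm, struct toy_lock *lock,
                              int first, int nr, int *pinned)
{
    int done = 0;

    if (first < 0 || nr < 0 || first + nr > TOY_NR_PAGES)
        return -1;

    for (;;) {
        lock->held = true;
        done += toy_pin_fast_only(mm, first + done, nr - done,
                                  pinned + done);
        if (done == nr)
            return done;
        lock->held = false;
        toy_fault_in(mm, lock, first + done);
    }
}
```

In this model the only place a fault can happen is toy_fault_in(), and it asserts that the lock is not held, which is the property the fast-only lookup is meant to give the caller.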

> For getting get_user_pages_fast changed to fix this properly, I'd need
> help from the memory management folks.

I really don't think you need anything at all from the mm people,
because we already support that whole "fast only" case.

Also, I have to say that I think the direct-IO code is fundamentally
mis-designed. Why it is doing the page lookup _during_ the IO is a
complete mystery to me. Why wasn't that done ahead of time before the
filesystem took the locks it needed?

So what the direct-IO code _should_ do is to turn an ITER_IOVEC into an
ITER_KVEC by doing the page lookup ahead of time, and none of these
issues should even exist, and then the whole pagefault_disabled and/or
FOLL_FAST_ONLY would be a complete non-issue.
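
The ordering described above can be sketched with another userspace toy model. Again this is not kernel code and all names (demo_user_buf, demo_pinned, demo_resolve, demo_write_locked) are hypothetical. It assumes the user buffer is resolved to a fixed list of pages before the filesystem lock is taken, so the copy done under the lock only touches pages that are already present.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#define DEMO_PAGES 4
#define DEMO_PAGE_SIZE 16

struct demo_user_buf {            /* stands in for a user-memory source */
    char data[DEMO_PAGES][DEMO_PAGE_SIZE];
    bool resident[DEMO_PAGES];
};

struct demo_pinned {              /* stands in for the pre-resolved pages */
    const char *page[DEMO_PAGES];
    int nr;
};

static bool demo_fs_locked;        /* is the "filesystem lock" held? */
static int demo_faults_under_lock; /* faults taken while it was held */

/* Touching a non-resident page is a "fault"; count it if the lock is held. */
static void demo_touch(struct demo_user_buf *u, int i)
{
    if (!u->resident[i]) {
        if (demo_fs_locked)
            demo_faults_under_lock++;
        u->resident[i] = true;
    }
}

/* Step 1, no lock held: fault in every page and record where it lives. */
static void demo_resolve(struct demo_user_buf *u, struct demo_pinned *p)
{
    assert(!demo_fs_locked);
    for (int i = 0; i < DEMO_PAGES; i++) {
        demo_touch(u, i);
        p->page[i] = u->data[i];
    }
    p->nr = DEMO_PAGES;
}

/* Step 2, lock held: copy from the recorded pages only, never faulting. */
static void demo_write_locked(const struct demo_pinned *p, char *disk)
{
    demo_fs_locked = true;
    for (int i = 0; i < p->nr; i++)
        memcpy(disk + i * DEMO_PAGE_SIZE, p->page[i], DEMO_PAGE_SIZE);
    demo_fs_locked = false;
}
```

Because all faults happen in demo_resolve(), the locked section in this model never needs a pagefault_disabled or fast-only style check at all.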

Is there any reason why that isn't what it does (other than historical baggage)?

               Linus