linux-kernel - Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wihkGVvXXQL_qSPWF6s4NJYWyEkq+D3CUWQf9H5V1jqtg@mail.gmail.com>
Date:   Mon, 21 Dec 2020 15:30:30 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nadav Amit <nadav.amit@...il.com>
Cc:     Peter Xu <peterx@...hat.com>, Yu Zhao <yuzhao@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Pavel Emelyanov <xemul@...nvz.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        stable <stable@...r.kernel.org>,
        Minchan Kim <minchan@...nel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

On Mon, Dec 21, 2020 at 2:55 PM Nadav Amit <nadav.amit@...il.com> wrote:
>
> So as an alternative solution, I can do copying under the PTL after
> flushing, which seems to solve the problem.

I think that's a valid model, but note that we do the "re-check ptl"
in a (*completely(* different part than we do the actual PTE install.

Note that the "Re-validate under PTL" code in cow_user_page() is *not*
the "now we are installing the copy". No, that's actually for the
"uhhuh, the copy using the virtual address outside the ptl failed, now
we need to do something special".

The real "we hold teh ptl" actually happens in wp_page_copy(), after
cow_user_page() has already returned.

So you'd have to change how all of that works.

And honestly, I'm not sure it's worth it - if this was the *only*
case, then yes. But that whole "we load the original pte first, then
we do whatever we _think_ we will need to do, and then we install the
final pte after checking" is actually the case for every other page
fault handling case too.

So are we sure the COW case is so special?

I really think this is clearly just a userfaultfd bug that we hadn't
realized until now, and had possibly been hidden by timings or other
random stuff before.

    Linus