lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjoGnHVVp5pcb69C6HtnJ_avqQo_b5NK-X9bgoDDw_XbA@mail.gmail.com>
Date:   Mon, 24 Oct 2022 13:19:22 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Jann Horn <jannh@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        John Hubbard <jhubbard@...dia.com>, x86@...nel.org,
        willy@...radead.org, akpm@...ux-foundation.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        aarcange@...hat.com, kirill.shutemov@...ux.intel.com,
        jroedel@...e.de, ubizjak@...il.com
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

On Mon, Oct 24, 2022 at 12:58 PM Jann Horn <jannh@...gle.com> wrote:
>
> Unless I'm completely misunderstanding what's going on here, the whole
> "remove_table" thing only happens when you "remove a table", meaning
> you free an entire *pagetable*. Just zapping PTEs doesn't trigger that
> logic.

I do have to admit that I'd be happier if this code - and the GUP code
that also relies on "interrupts off" behavior - would just use a
sequence counter instead.

Relying on blocking IPI's is clever, but also clearly very subtle and
somewhat dangerous.

I think our GUP code is a *lot* more important than some "legacy
x86-32 has problems in case you have an incredibly unlikely race that
re-populates the page table with a different page that just happens to
be exactly the same MOD-4GB", so honestly, I don't think the
load-tearing is even worth worrying about - if you have hardware that
is good enough at virtualizing things, it's almost certainly already
64-bit, and running 32-bit virtual machines with PAE you really only
have yourself to blame.

So I can't find it in myself to care about the 32-bit tearing thing,
but this discussion makes me worried about Fast GUP.

Note that even with proper atomic

                pte_t pte = ptep_get_lockless(ptep);

in gup_pte_range(), and even if the page tables are RCU-free'd, that
just means that the 'ptep' access itself is safe.

But then you have the whole "the lookup of the page pointer is not
atomic" wrt that. And right now that GUP code does rely on the "block
IPI" to make it basically valid.

I don't think it matters if GUP races with munmap or madvise() or
something like that - if you get the old page, that's still a valid
page, and the user only has himself to blame.

But if we have memory pressure that causes vmscan to push out a page,
and it gets replaced with a new page, and GUP gets the old page with
no serialization, that sounds like a possible source of data
inconsistency.

I don't know if this can happen, but the whole "interrupts disabled
doesn't actually block IPI's and synchronize with TLB flushes" really
sounds like it would affect GUP too. And be much more serious there
than on some x86-32 platform that nobody should be using anyway.

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ