linux-kernel - Re: [PATCH v5 7/7] mm: Remove the now-unnecessary mmget_still

Message-ID: <alpine.LSU.2.11.2008311346370.3722@eggly.anvils>
Date:   Mon, 31 Aug 2020 14:30:42 -0700 (PDT)
From:   Hugh Dickins <hughd@...gle.com>
To:     Jann Horn <jannh@...gle.com>
cc:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Christoph Hellwig <hch@....de>,
        kernel list <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH v5 7/7] mm: Remove the now-unnecessary mmget_still_valid()
 hack

Sorry, I didn't answer your questions further down; resuming...

On Mon, 31 Aug 2020, Jann Horn wrote:
> On Mon, Aug 31, 2020 at 8:07 AM Hugh Dickins <hughd@...gle.com> wrote:
...
> > but the "pmd .. physical page 0" issue is explained better in its parent
> > 18e77600f7a1 ("khugepaged: retract_page_tables() remember to test exit")
...
> Just to clarify: This is an issue only between GUP's software page

Not just GUP's software page table walks: any of our software page
table walks that could occur concurrently (notably, unmapping when
exiting).

> table walks when running without mmap_lock and concurrent page table
> modifications from hugepage code, correct?

Correct.

> Hardware page table walks

Have no problem: the necessary TLB flush is already done.

> and get_user_pages_fast() are fine because they properly load PTEs
> atomically and are written to assume that the page tables can change
> arbitrarily under them, and the only guarantee is that disabling
> interrupts ensures that pages referenced by PTEs can't be freed,
> right?

mm/gup.c has changed a lot since I was familiar with it, and I'm
out of touch with the history of architectural variants.  I think
internal_get_user_pages_fast() is now the place to look, and I see

		local_irq_save(flags);
		gup_pgd_range(addr, end, fast_flags, pages, &nr_pinned);
		local_irq_restore(flags);

reassuringly there, which is how x86 always used to do it,
and the dependence of x86 TLB flush on IPIs made it all safe.

Looking at gup_pmd_range(), its operations on pmd (= READ_ONCE(*pmdp))
look correct to me, and where I said "any of our software page table
walks" above, there should be an exception for GUP_fast.

But the other software page table walks are more loosely coded, and
less able to fall back - if gup_pmd_range() catches sight of a fleeting
*pmdp 0, it rightly just gives up immediately on !pmd_present(pmd);
whereas tearing down a userspace mapping needs to wait or retry on
seeing a transient state (but mmap_lock happens to give protection
against that particular transient state).
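
As a rough userspace sketch of that "read the entry once, then give up
if it is not present" pattern: everything named toy_* below is a
hypothetical stand-in, a C11 relaxed atomic load only approximates
READ_ONCE(), and none of this is the actual mm/gup.c code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's pmd_t or flags. */
typedef uint64_t toy_pmd_t;
#define TOY_PMD_PRESENT 0x1ULL

/* Approximates READ_ONCE(*pmdp): exactly one load of the shared slot. */
static toy_pmd_t toy_read_once(_Atomic toy_pmd_t *pmdp)
{
	return atomic_load_explicit(pmdp, memory_order_relaxed);
}

/* Decides on a private copy, so a concurrent writer cannot change the answer. */
static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & TOY_PMD_PRESENT) != 0;
}

/*
 * Lockless-walk style: snapshot each entry once and make every decision
 * on the snapshot.  On a non-present entry (a fleeting 0 or a genuinely
 * empty slot) stop immediately and report how many entries were accepted,
 * leaving the caller to fall back to a slower, locked path.
 */
static int toy_walk_fast(_Atomic toy_pmd_t *table, int n)
{
	int nr = 0;

	for (int i = 0; i < n; i++) {
		toy_pmd_t pmd = toy_read_once(&table[i]);

		if (!toy_pmd_present(pmd))
			break;
		nr++;
	}
	return nr;
}
```

The point of the sketch is only that the walker never re-reads the slot
after testing it; a walker that cannot simply give up (such as teardown)
would instead need to wait or retry at the point where this one breaks.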

I assume that all the architectures which support GUP_fast have now
been gathered into the same mechanism (perhaps by an otherwise
superfluous IPI on TLB flush?) and are equally safe.

Hugh