lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220121085917.GA22849@worktop.programming.kicks-ass.net>
Date:   Fri, 21 Jan 2022 09:59:17 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Hildenbrand <david@...hat.com>
Cc:     mingo@...hat.com, tglx@...utronix.de, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-api@...r.kernel.org, x86@...nel.org,
        pjt@...gle.com, posk@...gle.com, avagin@...gle.com,
        jannh@...gle.com, tdelisle@...terloo.ca, mark.rutland@....com,
        posk@...k.io
Subject: Re: [RFC][PATCH v2 1/5] mm: Avoid unmapping pinned pages

On Fri, Jan 21, 2022 at 08:51:57AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 20, 2022 at 07:25:08PM +0100, David Hildenbrand wrote:
> > On 20.01.22 16:55, Peter Zijlstra wrote:
> > > Add a guarantee for Anon pages that pin_user_page*() ensures the
> > > user-mapping of these pages stay preserved. In order to ensure this
> > > all rmap users have been audited:
> > > 
> > >  vmscan:	already fails eviction due to page_maybe_dma_pinned()
> > > 
> > >  migrate:	migration will fail on pinned pages due to
> > > 		expected_page_refs() not matching, however that is
> > > 		*after* try_to_migrate() has already destroyed the
> > > 		user mapping of these pages. Add an early exit for
> > > 		this case.
> > > 
> > >  numa-balance:	as per the above, pinned pages cannot be migrated,
> > > 		however numa balancing scanning will happily PROT_NONE
> > > 		them to get usage information on these pages. Avoid
> > > 		this for pinned pages.
> > 
> > page_maybe_dma_pinned() can race with GUP-fast without
> > mm->write_protect_seq. This is a real problem for vmscan() with
> > concurrent GUP-fast as it can result in R/O mappings of pinned pages and
> > GUP will lose synchronicity to the page table on write faults due to
> > wrong COW.
> 
> Urgh, so yeah, that might be a problem. Follow up code uses it like
> this:
> 
> +/*
> + * Pinning a page inhibits rmap based unmap for Anon pages. Doing a load
> + * through the user mapping ensures the user mapping exists.
> + */
> +#define umcg_pin_and_load(_self, _pagep, _member)                              \
> +({                                                                             \
> +       __label__ __out;                                                        \
> +       int __ret = -EFAULT;                                                    \
> +                                                                               \
> +       if (pin_user_pages_fast((unsigned long)(_self), 1, 0, &(_pagep)) != 1)  \
> +               goto __out;                                                     \
> +                                                                               \
> +       if (!PageAnon(_pagep) ||                                                \
> +           get_user(_member, &(_self)->_member)) {                             \
> +               unpin_user_page(_pagep);                                        \
> +               goto __out;                                                     \
> +       }                                                                       \
> +       __ret = 0;                                                              \
> +__out: __ret;                                                                  \
> +})
> 
> And after that hard assumes (on the penalty of SIGKILL) that direct user
> access works. Specifically it does RmW ops on it. So I suppose I'd
> better upgrade that load to a RmW at the very least.
> 
> But is that sufficient? Let me go find that race you mention...

OK, so copy_page_range() vs lockless_pages_from_mm(). Since I use
FOLL_PIN that should be sorted, it'll fall back the slow path and use
mmap_sem and serialize against the fork().

(Also, can I express my hate for __gup_longterm_unlocked(), that
function name is utter garbage)

However, I'm not quite sure what fork() does with pages that have a pin.
There's been a number of GUP vs fork() problems over the years, but I'm
afraid I have lost track of that and I can't quickly find anything in
the code..

Naively, a page that has async DMA activity should not be CoW'ed, or if
it is, care must be taken to ensure the original pages stays in the
original process, but I realize that's somewhat hard.

Let me dig in a bit more.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ