lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200916184619.GB40154@xz-x1>
Date:   Wed, 16 Sep 2020 14:46:19 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
Cc:     John Hubbard <jhubbard@...dia.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Leon Romanovsky <leonro@...dia.com>,
        Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "Maya B . Gokhale" <gokhale2@...l.gov>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Marty Mcfadden <mcfadden8@...l.gov>,
        Kirill Shutemov <kirill@...temov.name>,
        Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
        Jan Kara <jack@...e.cz>, Kirill Tkhai <ktkhai@...tuozzo.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Christoph Hellwig <hch@....de>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

On Wed, Sep 16, 2020 at 02:48:04PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 15, 2020 at 06:50:46PM -0700, John Hubbard wrote:
> > > 
> > > It seems very strange that a physical page exclusively owned by a
> > > process can become copied if pin_user_pages() is active and the
> > > process did fork() at some point.
> > > 
> > > Could the new pin_user_pages() logic help here? eg the
> > > GUP_PIN_COUNTING_BIAS stuff?
> > > 
> > > Could the COW code consider a refcount of GUP_PIN_COUNTING_BIAS + 1 as
> > > being owned by the current mm and not needing COW? The DMA pin would
> > > be 'invisible' for COW purposes?
> > 
> > 
> > Please do be careful to use the API, rather than the implementation. The
> > FOLL_PIN refcounting system results in being able to get a "maybe
> > DMA-pinned", or a "definitely not DMA-pinned", via this API call:
> 
> So, what I'm thinking is something like (untested):
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 469af373ae76e1..77f63183667e52 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2889,6 +2889,26 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf)
>  	return ret;
>  }
>  
> +static bool cow_needed(struct vm_fault *vmf)
> +{
> +	int total_map_swapcount;
> +
> +	if (!reuse_swap_page(vmf->page, &total_map_swapcount)) {
> +		unlock_page(vmf->page);
> +		return true;
> +	}
> +
> +	if (total_map_swapcount == 1) {
> +		/*
> +		 * The page is all ours. Move it to our anon_vma so the rmap
> +		 * code will not search our parent or siblings.  Protected
> +		 * against the rmap code by the page lock.
> +		 */
> +		page_move_anon_rmap(vmf->page, vmf->vma);
> +	}
> +	return false;
> +}
> +
>  /*
>   * This routine handles present pages, when users try to write
>   * to a shared page. It is done by copying the page to a new address
> @@ -2947,8 +2967,21 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>  		if (!trylock_page(page))
>  			goto copy;
>  		if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
> +			bool cow = true;
> +
> +			/*
> +			 * If the page is DMA pinned we can't rely on the
> +			 * above to know if there are other CPU references as
> +			 * page_count() will be elevated by the
> +			 * pin. Needlessly copying the page will cause the DMA
> +			 * pin to break, try harder to avoid that.
> +			 */
> +			if (page_maybe_dma_pinned(page))
> +				cow = cow_needed(vmf);
> +
>  			unlock_page(page);
> -			goto copy;
> +			if (cow)
> +				goto copy;
>  		}
>  		/*
>  		 * Ok, we've got the only map reference, and the only
> 
> What do you think Peter? Is this remotely close?

My understanding is this may only work for the case when the fork()ed child
quitted before we reach here (so we still have mapcount==1 for the page).  What
if not?  Then mapcount will be greater than 1, and cow will still trigger.  Is
that what we want?

Another problem is that, aiui, one of the major change previous patch proposed
is to avoid using lock_page() so that we never block in this path.  However if
we need the page reuse logic to guarantee pinning pages, then it becomes a
correctness issue, so I'm afraid we'll start to make all things complicated
again. Maybe even more complicated, because "correctness" should be even harder
than "best effort reuse" since it can cause data corruption if we didn't do it
right...

-- 
Peter Xu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ