Date:   Thu, 2 Feb 2017 16:26:56 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Hugh Dickins <hughd@...gle.com>,
        Rik van Riel <riel@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCHv3 03/12] mm: fix handling PTE-mapped THPs in
 page_referenced()

On Sun 29-01-17 20:38:49, Kirill A. Shutemov wrote:
> For PTE-mapped THPs, page_check_address_transhuge() is not adequate: it
> cannot find all relevant PTEs, only the first one. This means we can
> miss some references to the page, which can result in suboptimal
> decisions by vmscan.
> 
> Let's switch it to page_vma_mapped_walk().
> 
> I don't think this is a subject for stable@: it's not fatal. The only
> side effect is that a THP can be swapped out when it shouldn't be.
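As far as I understand the mechanics, a THP that has been split to PTEs
can be mapped by up to HPAGE_PMD_NR page table entries in a single VMA
(512 for a 2MB THP with 4KB base pages), while
page_check_address_transhuge() reports only the first mapping it finds.
Schematically, the pattern the patch switches to is (just a sketch, the
real thing is in the hunk below):

	struct page_vma_mapped_walk pvmw = {
		.page = page,
		.vma = vma,
		.address = address,
	};

	while (page_vma_mapped_walk(&pvmw)) {
		/* pvmw.pte is set for a PTE mapping, pvmw.pmd for a
		 * PMD mapping; test and clear the young bit here. */
	}

with page_vma_mapped_walk_done() dropping the page table locks on an
early exit from the loop.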

Please be more specific about the situation in which this happens and
how a user can recognize that it is going on. In other words, when
should I consider backporting this series?

Also, the interface is quite awkward imho. Why can't we provide a
callback into page_vma_mapped_walk and have it called for each pte/pmd
that matters to the given page? Wouldn't that be much easier than the
loop around the page_vma_mapped_walk iterator?
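Something like the following, completely untested and with made-up
names, is what I have in mind:

	typedef bool (*pvmw_callback)(struct page_vma_mapped_walk *pvmw,
				      void *arg);

	/* Hypothetical helper: call @fn for each pte/pmd that maps
	 * @pvmw->page in @pvmw->vma; a false return stops the walk. */
	static void page_vma_mapped_walk_each(struct page_vma_mapped_walk *pvmw,
			pvmw_callback fn, void *arg)
	{
		while (page_vma_mapped_walk(pvmw)) {
			if (!fn(pvmw, arg)) {
				page_vma_mapped_walk_done(pvmw);
				return;
			}
		}
	}

Then page_referenced_one() would be a single call with a small callback
instead of open-coding the loop.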

> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> ---
>  mm/rmap.c | 66 ++++++++++++++++++++++++++++++++-------------------------------
>  1 file changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 91619fd70939..0dff8accd629 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -886,45 +886,48 @@ struct page_referenced_arg {
>  static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
>  			unsigned long address, void *arg)
>  {
> -	struct mm_struct *mm = vma->vm_mm;
>  	struct page_referenced_arg *pra = arg;
> -	pmd_t *pmd;
> -	pte_t *pte;
> -	spinlock_t *ptl;
> +	struct page_vma_mapped_walk pvmw = {
> +		.page = page,
> +		.vma = vma,
> +		.address = address,
> +	};
>  	int referenced = 0;
>  
> -	if (!page_check_address_transhuge(page, mm, address, &pmd, &pte, &ptl))
> -		return SWAP_AGAIN;
> +	while (page_vma_mapped_walk(&pvmw)) {
> +		address = pvmw.address;
>  
> -	if (vma->vm_flags & VM_LOCKED) {
> -		if (pte)
> -			pte_unmap(pte);
> -		spin_unlock(ptl);
> -		pra->vm_flags |= VM_LOCKED;
> -		return SWAP_FAIL; /* To break the loop */
> -	}
> +		if (vma->vm_flags & VM_LOCKED) {
> +			page_vma_mapped_walk_done(&pvmw);
> +			pra->vm_flags |= VM_LOCKED;
> +			return SWAP_FAIL; /* To break the loop */
> +		}
>  
> -	if (pte) {
> -		if (ptep_clear_flush_young_notify(vma, address, pte)) {
> -			/*
> -			 * Don't treat a reference through a sequentially read
> -			 * mapping as such.  If the page has been used in
> -			 * another mapping, we will catch it; if this other
> -			 * mapping is already gone, the unmap path will have
> -			 * set PG_referenced or activated the page.
> -			 */
> -			if (likely(!(vma->vm_flags & VM_SEQ_READ)))
> +		if (pvmw.pte) {
> +			if (ptep_clear_flush_young_notify(vma, address,
> +						pvmw.pte)) {
> +				/*
> +				 * Don't treat a reference through
> +				 * a sequentially read mapping as such.
> +				 * If the page has been used in another mapping,
> +				 * we will catch it; if this other mapping is
> +				 * already gone, the unmap path will have set
> +				 * PG_referenced or activated the page.
> +				 */
> +				if (likely(!(vma->vm_flags & VM_SEQ_READ)))
> +					referenced++;
> +			}
> +		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +			if (pmdp_clear_flush_young_notify(vma, address,
> +						pvmw.pmd))
>  				referenced++;
> +		} else {
> +			/* unexpected pmd-mapped page? */
> +			WARN_ON_ONCE(1);
>  		}
> -		pte_unmap(pte);
> -	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> -		if (pmdp_clear_flush_young_notify(vma, address, pmd))
> -			referenced++;
> -	} else {
> -		/* unexpected pmd-mapped page? */
> -		WARN_ON_ONCE(1);
> +
> +		pra->mapcount--;
>  	}
> -	spin_unlock(ptl);
>  
>  	if (referenced)
>  		clear_page_idle(page);
> @@ -936,7 +939,6 @@ static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
>  		pra->vm_flags |= vma->vm_flags;
>  	}
>  
> -	pra->mapcount--;
>  	if (!pra->mapcount)
>  		return SWAP_SUCCESS; /* To break the loop */
>  
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs
