Message-ID: <534D09AC.7020704@citrix.com>
Date:	Tue, 15 Apr 2014 11:27:56 +0100
From:	David Vrabel <david.vrabel@...rix.com>
To:	Mel Gorman <mgorman@...e.de>
CC:	Linux-X86 <x86@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Cyrill Gorcunov <gorcunov@...il.com>,
	Peter Anvin <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
	Steven Noonan <steven@...inklabs.net>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Dave Hansen <dave.hansen@...el.com>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/5] mm: use paravirt friendly ops for NUMA hinting ptes

On 08/04/14 14:09, Mel Gorman wrote:
> David Vrabel identified a regression when using automatic NUMA balancing
> under Xen whereby page table entries were getting corrupted due to the
> use of native PTE operations. Quoting him
> 
> 	Xen PV guest page tables require that their entries use machine
> 	addresses if the present bit (_PAGE_PRESENT) is set, and (for
> 	successful migration) non-present PTEs must use pseudo-physical
> 	addresses.  This is because on migration MFNs in present PTEs are
> 	translated to PFNs (canonicalised) so they may be translated back
> 	to the new MFN in the destination domain (uncanonicalised).
> 
> 	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
> 	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
> 	pte_clear_flags(), etc.
> 
> 	In a Xen PV guest, these functions must translate MFNs to PFNs
> 	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
> 	_PAGE_PRESENT.
> 
> His suggested fix converted p[te|md]_[set|clear]_flags to paravirt-friendly
> ops, but this is overkill. He suggested an alternative of using
> p[te|md]_modify in the NUMA page table operations, but this does more work
> than necessary and would require looking up a VMA for protections.
> 
> This patch modifies the NUMA page table operations to use paravirt-friendly
> operations to set/clear the flags of interest. Unfortunately this takes a
> performance hit when updating the PTEs on CONFIG_PARAVIRT kernels, but I do
> not see a way around it that does not break Xen.

We're getting more reports of users hitting this regression with distro
provided kernels.  Irrespective of the rest of this series, can we get
at least this applied and tagged for stable, please?

http://lists.xenproject.org/archives/html/xen-devel/2014-04/msg01905.html

David

> 
> Signed-off-by: Mel Gorman <mgorman@...e.de>
> ---
>  include/asm-generic/pgtable.h | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 34c7bdc..38a7437 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -680,24 +680,35 @@ static inline int pmd_numa(pmd_t pmd)
>  #ifndef pte_mknonnuma
>  static inline pte_t pte_mknonnuma(pte_t pte)
>  {
> -	pte = pte_clear_flags(pte, _PAGE_NUMA);
> -	return pte_set_flags(pte, _PAGE_PRESENT|_PAGE_ACCESSED);
> +	pteval_t val = pte_val(pte);
> +
> +	val &= ~_PAGE_NUMA;
> +	val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
> +	return __pte(val);
>  }
>  #endif
>  
>  #ifndef pmd_mknonnuma
>  static inline pmd_t pmd_mknonnuma(pmd_t pmd)
>  {
> -	pmd = pmd_clear_flags(pmd, _PAGE_NUMA);
> -	return pmd_set_flags(pmd, _PAGE_PRESENT|_PAGE_ACCESSED);
> +	pmdval_t val = pmd_val(pmd);
> +
> +	val &= ~_PAGE_NUMA;
> +	val |= (_PAGE_PRESENT|_PAGE_ACCESSED);
> +
> +	return __pmd(val);
>  }
>  #endif
>  
>  #ifndef pte_mknuma
>  static inline pte_t pte_mknuma(pte_t pte)
>  {
> -	pte = pte_set_flags(pte, _PAGE_NUMA);
> -	return pte_clear_flags(pte, _PAGE_PRESENT);
> +	pteval_t val = pte_val(pte);
> +
> +	val &= ~_PAGE_PRESENT;
> +	val |= _PAGE_NUMA;
> +
> +	return __pte(val);
>  }
>  #endif
>  
> @@ -716,8 +727,12 @@ static inline void ptep_set_numa(struct mm_struct *mm, unsigned long addr,
>  #ifndef pmd_mknuma
>  static inline pmd_t pmd_mknuma(pmd_t pmd)
>  {
> -	pmd = pmd_set_flags(pmd, _PAGE_NUMA);
> -	return pmd_clear_flags(pmd, _PAGE_PRESENT);
> +	pmdval_t val = pmd_val(pmd);
> +
> +	val &= ~_PAGE_PRESENT;
> +	val |= _PAGE_NUMA;
> +
> +	return __pmd(val);
>  }
>  #endif
>  
