Message-ID: <8b11dfd8-4c42-252f-c3dc-063026c49cef@c-s.fr>
Date: Fri, 7 Sep 2018 09:41:24 +0000
From: Christophe Leroy <christophe.leroy@....fr>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>, npiggin@...il.com,
aneesh.kumar@...ux.vnet.ibm.com
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH v1 00/17] ban the use of _PAGE_XXX flags outside
platform specific code
On 09/06/2018 09:58 AM, Aneesh Kumar K.V wrote:
> Christophe Leroy <christophe.leroy@....fr> writes:
>
>> Today, flags like _PAGE_RW or _PAGE_USER are used throughout
>> common parts of the code.
>> Using them directly in common code has proven to lead to
>> mistakes or misbehaviour, because their use is not always as trivial
>> as one might think.
>>
>> For instance, (flags & _PAGE_USER) == 0 isn't enough to tell
>> that a page is a kernel page, because some targets are using
>> _PAGE_PRIVILEGED and not _PAGE_USER, so the test has to be
>> (flags & (_PAGE_USER | _PAGE_PRIVILEGED)) == _PAGE_PRIVILEGED
>> This has two (bad) consequences:
>>
>> - All targets must define every bit, even the unsupported ones,
>> leading to a lot of useless #define _PAGE_XXX 0
>> - If someone forgets to take into account all possible _PAGE_XXX bits
>> for a given test, we can get unexpected behaviour on some targets.
>>
>> This becomes even more complex when we come to using _PAGE_RW.
>> Testing (flags & _PAGE_RW) is not enough to tell whether a page
>> is writable or not, because:
>>
>> - Some targets have _PAGE_RO instead, which has to be unset to tell
>> that a page is writable
>> - Some targets have _PAGE_R and _PAGE_W, in which case
>> _PAGE_RW = _PAGE_R | _PAGE_W
>> - Even knowing whether a page is readable is not always trivial, because:
>> - Some targets require checking that _PAGE_R is set to ensure the page
>> is readable
>> - Some targets require checking that _PAGE_NA is not set
>> - Some targets require checking that _PAGE_RO or _PAGE_RW is set
>>
>> Etc ....
>>
>> In order to work around all those issues and minimise the risk of errors,
>> this series aims at removing all use of _PAGE_XXX flags from powerpc code
>> and always using pte_xxx() and pte_mkxxx() accessors instead. Those accessors
>> are then defined in target-specific parts of the kernel code.
>
> The series is really good. It also helps code readability. There are a few
> things where I am not sure there is a way to reduce the overhead:
>
> - access = _PAGE_EXEC;
> + access = pte_val(pte_mkexec(__pte(0)));
>
> Considering we have multiple big-endian to little-endian conversions there
> for book3s 64.
Thanks for the review.
For the above, I propose the following:
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index f23a89d8e4ce..904ac9c84ea5 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1482,7 +1482,7 @@ static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
#endif
void hash_preload(struct mm_struct *mm, unsigned long ea,
- unsigned long access, unsigned long trap)
+ bool is_exec, unsigned long trap)
{
int hugepage_shift;
unsigned long vsid;
@@ -1490,6 +1490,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
pte_t *ptep;
unsigned long flags;
int rc, ssize, update_flags = 0;
+ unsigned long access = is_exec ? _PAGE_EXEC : 0;
BUG_ON(REGION_ID(ea) != USER_REGION_ID);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 5c8530d0c611..4122f26a2f44 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -507,7 +507,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
* We don't need to worry about _PAGE_PRESENT here because we are
* called with either mm->page_table_lock held or ptl lock held
*/
- unsigned long access, trap;
+ unsigned long trap;
+ bool is_exec;
if (radix_enabled()) {
prefetch((void *)address);
@@ -529,10 +530,10 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
trap = current->thread.regs ? TRAP(current->thread.regs) : 0UL;
switch (trap) {
case 0x300:
- access = 0UL;
+ is_exec = false;
break;
case 0x400:
- access = _PAGE_EXEC;
+ is_exec = true;
break;
default:
return;
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index e5d779eed181..dd7f9b951d25 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -82,7 +82,7 @@ static inline void _tlbivax_bcast(unsigned long address, unsigned int pid,
#else /* CONFIG_PPC_MMU_NOHASH */
extern void hash_preload(struct mm_struct *mm, unsigned long ea,
- unsigned long access, unsigned long trap);
+ bool is_exec, unsigned long trap);
extern void _tlbie(unsigned long address);
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index f983ffa24aa0..506e5c3e96da 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -263,7 +263,7 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top)
map_kernel_page(v, p, f);
#ifdef CONFIG_PPC_STD_MMU_32
if (ktext)
- hash_preload(&init_mm, v, 0, 0x300);
+ hash_preload(&init_mm, v, false, 0x300);
#endif
v += PAGE_SIZE;
p += PAGE_SIZE;
diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
index bea6c544e38f..38a793bfca37 100644
--- a/arch/powerpc/mm/ppc_mmu_32.c
+++ b/arch/powerpc/mm/ppc_mmu_32.c
@@ -163,7 +163,7 @@ void __init setbat(int index, unsigned long virt, phys_addr_t phys,
* Preload a translation in the hash table
*/
void hash_preload(struct mm_struct *mm, unsigned long ea,
- unsigned long access, unsigned long trap)
+ bool is_exec, unsigned long trap)
{
pmd_t *pmd;
>
> Other thing is __ioremap_at where we do
>
> + pte_t pte = __pte(flags);
>
> /* Make sure we have the base flags */
> - if ((flags & _PAGE_PRESENT) == 0)
> + if (!pte_present(pte))
This one is using pte_raw(), so it shouldn't be a problem.
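For reference, that accessor is roughly the following on book3s/64
(simplified sketch from memory):

	static inline int pte_present(pte_t pte)
	{
		/* Test the raw big-endian value directly; cpu_to_be64() on a
		 * constant is folded at build time, so no runtime byteswap. */
		return !!(pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT));
	}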
Since the function does almost nothing else with the flags, maybe we
could just replace the above with pte_present(__pte(flags)) and
leave the rest as is.
>
> - err = map_kernel_page(v+i, p+i, flags);
> + err = map_kernel_page(v + i, p + i, pte_val(pte));
Maybe another alternative would be to pass a pte_t to map_kernel_page();
then we would have to find an optimised way to insert the RPN into it
before calling set_pte_at(), instead of using pfn_pte().
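Something like the sketch below (pte_set_pfn() is a hypothetical name,
and where the RPN sits in the pte is platform specific):

	/* Hypothetical: fold a pfn into a pte that already carries the
	 * protection bits, without recomputing them like pfn_pte() does. */
	static inline pte_t pte_set_pfn(pte_t pte, unsigned long pfn)
	{
		return __pte(pte_val(pte) | (pfn << PTE_RPN_SHIFT));
	}

	err = map_kernel_page(v + i, p + i, __pte(flags));

map_kernel_page() would then do pte_set_pfn(pte, p >> PAGE_SHIFT) itself
before calling set_pte_at().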
If we are so concerned about the multiple conversions, should we modify
all the pte_mkxxx() accessors to use pte_raw() and __pte_raw() instead of
pte_val() and __pte()?
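On book3s/64 that could look like this (sketch):

	static inline pte_t pte_mkexec(pte_t pte)
	{
		/* OR the bit into the raw big-endian value; cpu_to_be64()
		 * on a constant costs nothing at runtime. */
		return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_EXEC));
	}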
>
>
> But otherwise, for the series:
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com>
>
Thanks
Christophe