lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1418874364.198277.1586967776509@privateemail.com>
Date:   Wed, 15 Apr 2020 11:22:56 -0500 (CDT)
From:   Christopher M Riedl <cmr@...ormatik.wtf>
To:     Christophe Leroy <christophe.leroy@....fr>
Cc:     linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH] powerpc/lib: Fixing use a temporary mm for code
 patching

> On April 15, 2020 4:12 AM Christophe Leroy <christophe.leroy@....fr> wrote:
> 
>  
> Le 15/04/2020 à 07:16, Christopher M Riedl a écrit :
> >> On March 26, 2020 9:42 AM Christophe Leroy <christophe.leroy@....fr> wrote:
> >>
> >>   
> >> This patch fixes the RFC series identified below.
> >> It fixes three points:
> >> - Failure with CONFIG_PPC_KUAP
> >> - Failure to write do to lack of DIRTY bit set on the 8xx
> >> - Inadequaly complex WARN post verification
> >>
> >> However, it has an impact on the CPU load. Here is the time
> >> needed on an 8xx to run the ftrace selftests without and
> >> with this series:
> >> - Without CONFIG_STRICT_KERNEL_RWX		==> 38 seconds
> >> - With CONFIG_STRICT_KERNEL_RWX			==> 40 seconds
> >> - With CONFIG_STRICT_KERNEL_RWX + this series	==> 43 seconds
> >>
> >> Link: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=166003
> >> Signed-off-by: Christophe Leroy <christophe.leroy@....fr>
> >> ---
> >>   arch/powerpc/lib/code-patching.c | 5 ++++-
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> >> index f156132e8975..4ccff427592e 100644
> >> --- a/arch/powerpc/lib/code-patching.c
> >> +++ b/arch/powerpc/lib/code-patching.c
> >> @@ -97,6 +97,7 @@ static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
> >>   	}
> >>   
> >>   	pte = mk_pte(page, pgprot);
> >> +	pte = pte_mkdirty(pte);
> >>   	set_pte_at(patching_mm, patching_addr, ptep, pte);
> >>   
> >>   	init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> >> @@ -168,7 +169,9 @@ static int do_patch_instruction(unsigned int *addr, unsigned int instr)
> >>   			(offset_in_page((unsigned long)addr) /
> >>   				sizeof(unsigned int));
> >>   
> >> +	allow_write_to_user(patch_addr, sizeof(instr));
> >>   	__patch_instruction(addr, instr, patch_addr);
> >> +	prevent_write_to_user(patch_addr, sizeof(instr));
> >>
> > 
> > On radix we can map the page with PAGE_KERNEL protection which ends up
> > setting EAA[0] in the radix PTE. This means the KUAP (AMR) protection is
> > ignored (ISA v3.0b Fig. 35) since we are accessing the page from MSR[PR]=0.
> > 
> > Can we employ a similar approach on the 8xx? I would prefer *not* to wrap
> > the __patch_instruction() with the allow_/prevent_write_to_user() KUAP things
> > because this is a temporary kernel mapping which really isn't userspace in
> > the usual sense.
> 
> On the 8xx, that's pretty different.
> 
> The PTE doesn't control whether a page is user page or a kernel page. 
> The only thing that is set in the PTE is whether a page is linked to a 
> given PID or not.
> PAGE_KERNEL tells that the page can be addressed with any PID.
> 
> The user access right is given by a kind of zone, which is in the PGD 
> entry. Every pages above PAGE_OFFSET are defined as belonging to zone 0. 
> Every pages below PAGE_OFFSET are defined as belonging to zone 1.
> 
> By default, zone 0 can only be accessed by kernel, and zone 1 can only 
> be accessed by user. When kernel wants to access zone 1, it temporarily 
> changes properties of zone 1 to allow both kernel and user accesses.
> 
> So, if your mapping is below PAGE_OFFSET, it is in zone 1 and kernel 
> must unlock it to access it.
> 
> 
> And this is more or less the same on hash/32. This is managed by segment 
> registers. One segment register corresponds to a 256Mbytes area. Every 
> pages below PAGE_OFFSET can only be read by default by kernel. Only user 
> can write if the PTE allows it. When the kernel needs to write at an 
> address below PAGE_OFFSET, it must change the segment properties in the 
> corresponding segment register.
> 
> So, for both cases, if we want to have it local to a task while still 
> allowing kernel access, it means we have to define a new special area 
> between TASK_SIZE and PAGE_OFFSET which belongs to kernel zone.
> 
> That looks complex to me for a small benefit, especially as 8xx is not 
> SMP and neither are most of the hash/32 targets.
> 

Agreed. So I guess the solution is to differentiate between radix/non-radix
and use PAGE_SHARED for non-radix along with the KUAP functions when KUAP
is enabled. Hmm, I need to think about this some more, especially if it's
acceptable to temporarily map kernel text as PAGE_SHARED for patching. Do
you see any obvious problems on 8xx and hash/32 w/ using PAGE_SHARED?

I don't necessarily want to drop the local mm patching idea for non-radix
platforms since that means we would have to maintain two implementations.

> Christophe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ