linux-kernel - Re: [PATCH] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP

Message-ID: <20210817162239.GF1583@gate.crashing.org>
Date:   Tue, 17 Aug 2021 11:22:39 -0500
From:   Segher Boessenkool <segher@...nel.crashing.org>
To:     Christophe Leroy <christophe.leroy@...roup.eu>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>, userm57@...oo.com,
        fthain@...ux-m68k.org, linuxppc-dev@...ts.ozlabs.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP

On Tue, Aug 17, 2021 at 02:43:15PM +0000, Christophe Leroy wrote:
> Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
> removed the 'isync' instruction after adding/removing NX bit in user
> segments. The reasoning behind this change was that when setting the
> NX bit we don't mind it taking effect with delay as the kernel never
> executes text from userspace, and when clearing the NX bit this is
> to return to userspace and then the 'rfi' should synchronise the
> context.
> 
> However, it looks like on book3s/32 having a hash page table, at least
> on the G3 processor, we get an unexpected fault from userspace, then
> this is followed by something wrong in the verification of MSR_PR
> at end of another interrupt.
> 
> This is fixed by adding back the removed isync() following update
> of NX bit in user segment registers. Only do it for cores with a
> hash table, as 603 cores don't exhibit that problem and the two isync
> increase ./null_syscall selftest by 6 cycles on an MPC 832x.
> 
> First problem: unexpected PROTFAULT
> 
> 	[   62.896426] WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
> 	[   62.918111] Modules linked in:
> 	[   62.923350] CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40
> 	[   62.943476] NIP:  c001b5c8 LR: c001b6f8 CTR: 00000000
> 	[   62.954714] REGS: e2d09e40 TRAP: 0700   Not tainted  (5.13.0-pmac-00028-gb3c15b60339a)

That is not a protection fault.  What causes this?

A context-synchronizing instruction (CSI), such as isync, is required
both before and after mtsr.  It may happen to work on some cores without
one, but whether that is luck or something actually guarantees it is
anyone's guess :-/
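
To make the ordering concrete, here is a minimal user-space sketch of
"CSI before and after mtsr".  It is not kernel code: isync_stub(),
mtsr_stub(), record() and set_sr_synced() are hypothetical stand-ins
that only log the order in which the real instructions would be issued,
and the value passed in is arbitrary.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-ins: the real isync and mtsr are PowerPC
 * instructions.  These stubs only record the order in which they
 * would be issued, so the sequence can be inspected on any host.
 */
static char trace[64];

static void record(const char *op)
{
	/* Room for a separating space and the terminating NUL. */
	assert(strlen(trace) + strlen(op) + 2 <= sizeof(trace));
	if (trace[0] != '\0')
		strcat(trace, " ");
	strcat(trace, op);
}

static void isync_stub(void)
{
	record("isync");
}

static void mtsr_stub(unsigned int val, unsigned int sr)
{
	(void)val;	/* the value and register number are not modelled */
	(void)sr;
	record("mtsr");
}

/* Bracket a single segment register update with a CSI on both sides. */
static void set_sr_synced(unsigned int val, unsigned int sr)
{
	isync_stub();		/* context sync before mtsr */
	mtsr_stub(val, sr);
	isync_stub();		/* context sync after mtsr */
}
```

Calling set_sr_synced() once on an empty trace leaves "isync mtsr isync"
in the trace buffer.  Whether an interrupt entry or an rfi can stand in
for one of the two CSIs on a given path is a separate question that this
sketch does not model.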

> @@ -28,6 +30,8 @@ static inline void kuep_lock(void)
>  		return;
>  
>  	update_user_segments(mfsr(0) | SR_NX);
> +	if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
> +		isync();	/* Context sync required after mtsr() */
>  }

At the very least, this needs a comment explaining why you are not
doing this on systems without a hardware page table walk?


Segher