lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <b2967925-0a57-4a19-8657-3d9f4da83a20@app.fastmail.com>
Date:   Mon, 18 Sep 2023 20:28:02 -0700
From:   "Andy Lutomirski" <luto@...nel.org>
To:     "Brendan Jackman" <jackmanb@...gle.com>
Cc:     "Thomas Gleixner" <tglx@...utronix.de>,
        "Ingo Molnar" <mingo@...hat.com>, "Borislav Petkov" <bp@...en8.de>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        "Lai Jiangshan" <laijs@...ux.alibaba.com>, yosryahmed@...gle.com,
        reijiw@...gle.com, oweisse@...gle.com
Subject: Re: [PATCH RESEND] x86/entry: Don't write to CR3 when restoring to kernel CR3

On Thu, Aug 17, 2023, at 5:15 AM, Brendan Jackman wrote:
> From: Lai Jiangshan <laijs@...ux.alibaba.com>
>
> Skip resuming KERNEL pages since it is already KERNEL CR3
>
> Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
> Signed-off-by: Brendan Jackman <jackmanb@...gle.com>
> ---
>
> While staring at paranoid_exit I was confused about why we had this CR3
> write, avoiding it seems like a free optimisation. The original commit
> 21e94459110252 ("x86/mm: Optimize RESTORE_CR3") says "Most NMI/paranoid
> exceptions will not in fact change pagetables" but I didn't't understand
> what the "most" was referring to. I then discovered this patch on the
> mailing list, Andy said[1] that it looks correct so maybe now is the
> time to merge it?

I did?

Looking at the link, I think I was saying that the opposite patch (*always* flush) looked okay.

>
> Note there's another patch in [1] as well, the benefit of that one is
> not obvious to me though.
>
> We've tested an equivalent patch in our internal kernel.
>
> [1] https://lore.kernel.org/lkml/20200526043507.51977-3-laijs@linux.alibaba.com/
> -- >8 --
>  arch/x86/entry/calling.h | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index f6907627172b..b2458685d56e 100644
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -236,14 +236,13 @@ For 32-bit we have the following conventions - 
> kernel is built with
>  .macro RESTORE_CR3 scratch_reg:req save_reg:req
>  	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> 
> -	ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
> -
>  	/*
> -	 * KERNEL pages can always resume with NOFLUSH as we do
> -	 * explicit flushes.
> +	 * Skip resuming KERNEL pages since it is already KERNEL CR3.
>  	 */
>  	bt	$PTI_USER_PGTABLE_BIT, \save_reg
> -	jnc	.Lnoflush_\@
> +	jnc	.Lend_\@

I don't get it.  How do you know that the actual loaded CR3 is correct?

I'm willing to believe that there is some constraint in the way the kernel works such that every paranoid entry will, as part of its execution, switch CR3 to kernel *and leave it like that* *and that this will be the _same_ kernel CR3 that was saved*.  But I'm not really convinced that's true.  (Can we schedule in a paranoid entry?  Probably not.  What about the weird NMI paths?  What if we do something that switches to init mm?  Hmm -- doing that in a paranoid context is probably not a brilliant idea.)

Maybe it is true, and maybe a convincing argument could be made.  But that seems like a lot of thinking and fragility for an optimization that seems pretty minor.

--Andy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ