lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+i-1C2JjKRNJQhsaXk0kYfre40X61iC9807V+Zu-4iBKNF0Hw@mail.gmail.com>
Date:   Wed, 23 Aug 2023 11:42:56 -0700
From:   Brendan Jackman <jackmanb@...gle.com>
To:     luto@...nel.org, Peter Zijlstra <peterz@...radead.org>
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        linux-kernel@...r.kernel.org, laijs@...ux.alibaba.com,
        yosryahmed@...gle.com, reijiw@...gle.com, oweisse@...gle.com
Subject: Re: [PATCH RESEND] x86/entry: Don't write to CR3 when restoring to
 kernel CR3

Oops, just noticed I didn't put +Peter in the recipients.

On Thu, 17 Aug 2023 at 05:15, Brendan Jackman <jackmanb@...gle.com> wrote:
>
> From: Lai Jiangshan <laijs@...ux.alibaba.com>
>
> Skip resuming KERNEL pages since it is already KERNEL CR3
>
> Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
> Signed-off-by: Brendan Jackman <jackmanb@...gle.com>
> ---
>
> While staring at paranoid_exit I was confused about why we had this CR3
> write, avoiding it seems like a free optimisation. The original commit
> 21e94459110252 ("x86/mm: Optimize RESTORE_CR3") says "Most NMI/paranoid
> exceptions will not in fact change pagetables" but I didn't't understand
> what the "most" was referring to. I then discovered this patch on the
> mailing list, Andy said[1] that it looks correct so maybe now is the
> time to merge it?
>
> Note there's another patch in [1] as well, the benefit of that one is
> not obvious to me though.

Also expanding on this a bit: the main purpose of this code is to
switch back to the user address space after handling one of these
"paranoid" exceptions. When one of those exceptions interrupts kernel
mode, we didn't switch from user to kernel address space so the
restore isn't needed.

There's no other reason to change CR3 here; context switches don't
happen in these exceptions but even if they did we would restore CR3
from the returning context_switch path. In fact in that case doing it
in paranoid_exit would potentially use the wrong PCID if it had been
reallocated by choose_new_asid. (And on the other side of the coin if
the restore was needed, it would be needed under !KPTI too).

> We've tested an equivalent patch in our internal kernel.
>
> [1] https://lore.kernel.org/lkml/20200526043507.51977-3-laijs@linux.alibaba.com/
> -- >8 --
>  arch/x86/entry/calling.h | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index f6907627172b..b2458685d56e 100644
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -236,14 +236,13 @@ For 32-bit we have the following conventions - kernel is built with
>  .macro RESTORE_CR3 scratch_reg:req save_reg:req
>         ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
>
> -       ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
> -
>         /*
> -        * KERNEL pages can always resume with NOFLUSH as we do
> -        * explicit flushes.
> +        * Skip resuming KERNEL pages since it is already KERNEL CR3.
>          */
>         bt      $PTI_USER_PGTABLE_BIT, \save_reg
> -       jnc     .Lnoflush_\@
> +       jnc     .Lend_\@
> +
> +       ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
>
>         /*
>          * Check if there's a pending flush for the user ASID we're
> @@ -261,10 +260,6 @@ For 32-bit we have the following conventions - kernel is built with
>         SET_NOFLUSH_BIT \save_reg
>
>  .Lwrcr3_\@:
> -       /*
> -        * The CR3 write could be avoided when not changing its value,
> -        * but would require a CR3 read *and* a scratch register.
> -        */
>         movq    \save_reg, %cr3
>  .Lend_\@:
>  .endm
> --
> 2.41.0.694.ge786442a9b-goog
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ