Message-ID: <CALCETrWAALF7EgxHGs-rtZwk1Fxttr56QKXeB6QssXbyXDs+kA@mail.gmail.com>
Date: Thu, 24 Oct 2019 09:11:53 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Jan Beulich <jbeulich@...e.com>,
xen-devel <xen-devel@...ts.xenproject.org>,
X86 ML <x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>
Cc: Andy Lutomirski <luto@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/stackframe/32: repair 32-bit Xen PV
On Mon, Oct 7, 2019 at 3:41 AM Jan Beulich <jbeulich@...e.com> wrote:
>
> Once again RPL checks have been introduced which don't account for a
> 32-bit kernel living in ring 1 when running in a PV Xen domain. The
> case in FIXUP_FRAME has been preventing boot; adjust BUG_IF_WRONG_CR3
> as well just in case.
I'm okay with the generated code, but IMO the macro is too indirect
for something that's trivial.
>
> Fixes: 3c88c692c287 ("x86/stackframe/32: Provide consistent pt_regs")
> Signed-off-by: Jan Beulich <jbeulich@...e.com>
>
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -48,6 +48,17 @@
>
> #include "calling.h"
>
> +#ifndef CONFIG_XEN_PV
> +# define USER_SEGMENT_RPL_MASK SEGMENT_RPL_MASK
> +#else
> +/*
> + * When running paravirtualized on Xen the kernel runs in ring 1, and hence
> + * simple mask based tests (i.e. ones not comparing against USER_RPL) have to
> + * ignore bit 0. See also the C-level get_kernel_rpl().
> + */
How about:
/*
* When running on Xen PV, the RPL of the kernel's actual %cs is 1, not 0.
* If we need to distinguish between a %cs from kernel mode and a %cs from
* user mode, we can do test $2 instead of test $3.
*/
#define USER_SEGMENT_RPL_MASK 2
but...
> +# define USER_SEGMENT_RPL_MASK (SEGMENT_RPL_MASK & ~1)
> +#endif
> +
> .section .entry.text, "ax"
>
> /*
> @@ -172,7 +183,7 @@
> ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> .if \no_user_check == 0
> /* coming from usermode? */
> - testl $SEGMENT_RPL_MASK, PT_CS(%esp)
> + testl $USER_SEGMENT_RPL_MASK, PT_CS(%esp)
Shouldn't PT_CS(%esp) be 0 if we came from the kernel? I'm guessing
the actual bug is in whatever code put 1 in here in the first place.
In other words, I'm having trouble understanding why there is any
context in which some value would be 3 for user mode and 1 for kernel
mode. Obviously if we're manually IRETing to kernel mode, we need to
set CS to 1, but if we're filling in our own PT_CS, we should just
write 0.
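To make the question concrete, here is a minimal C sketch of the two tests
being compared. It is not kernel code: the helper names are hypothetical, the
selector values in the usage note are only illustrative, and the mask values
are assumed to match SEGMENT_RPL_MASK (3) and the patch's
(SEGMENT_RPL_MASK & ~1).

```c
/* Illustrative sketch only, not kernel code. */
#define SEGMENT_RPL_MASK       0x3                      /* both RPL bits */
#define USER_SEGMENT_RPL_MASK  (SEGMENT_RPL_MASK & ~1)  /* bit 1 only, i.e. 2 */

/* Hypothetical helper: extract the RPL (low two bits) of a selector. */
static unsigned int selector_rpl(unsigned int cs)
{
    return cs & SEGMENT_RPL_MASK;
}

/* Mirrors "testl $SEGMENT_RPL_MASK, PT_CS(%esp)": any nonzero RPL is "user". */
static int from_user_full_mask(unsigned int cs)
{
    return (cs & SEGMENT_RPL_MASK) != 0;
}

/* Mirrors "testl $USER_SEGMENT_RPL_MASK, PT_CS(%esp)": only bit 1 counts. */
static int from_user_bit1_only(unsigned int cs)
{
    return (cs & USER_SEGMENT_RPL_MASK) != 0;
}
```

With a saved CS whose RPL is 1 (say 0x61), from_user_full_mask() reports user
mode while from_user_bit1_only() does not. With RPL 0 or RPL 3 the two tests
agree, so the mask change only matters if something stores an RPL of 1 in
PT_CS.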
The supposedly offending commit ("x86/stackframe/32: Provide
consistent pt_regs") looks correct to me, so I suspect that the
problem is elsewhere. Or is it intentional that Xen PV's asm
(arch/x86/xen/whatever) sticks 1 into the CS field on the stack?
Also, why are we supporting 32-bit Linux PV guests at all? Can we
just delete this code instead?