Message-ID: <54FF4244.5080600@redhat.com>
Date: Tue, 10 Mar 2015 20:13:08 +0100
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Andy Lutomirski <luto@...capital.net>, x86@...nel.org,
linux-kernel@...r.kernel.org
CC: Borislav Petkov <bp@...en8.de>, Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH 3/3] x86_32: Document our abuse of ss1 and sp1
On 03/10/2015 07:06 PM, Andy Lutomirski wrote:
> This has confused me for a while. Now that I figured it out,
> document it.
Great!
> Signed-off-by: Andy Lutomirski <luto@...capital.net>
> ---
> arch/x86/include/asm/processor.h | 21 ++++++++++++++++++---
> 1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index fc6d8d0d8d53..b26208998b7c 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -209,9 +209,24 @@ struct x86_hw_tss {
> unsigned short back_link, __blh;
> unsigned long sp0;
> unsigned short ss0, __ss0h;
> - unsigned long sp1;
> - /* ss1 caches MSR_IA32_SYSENTER_CS: */
> - unsigned short ss1, __ss1h;
> +
> + /*
> + * We don't use ring 1, so sp1 and ss1 are convenient scratch
> + * spaces in the same cacheline as sp0. We use them to cache
> + * some MSR values to avoid unnecessary wrmsr instructions.
I don't see where exactly tss.ss1/sp1 are being used as a cache.
Grepping for "sp1" string, I found only this:
$ grep -r '[.>]e*sp1' .
./kernel/cpu/common.c: tss->x86_tss.sp1 = sizeof(struct tss_struct) + (unsigned long) tss;
./kernel/cpu/common.c: wrmsr(MSR_IA32_SYSENTER_ESP, tss->x86_tss.sp1, 0);
void enable_sep_cpu(void)
{
	int cpu = get_cpu();
	struct tss_struct *tss = &per_cpu(init_tss, cpu);
	...
	tss->x86_tss.ss1 = __KERNEL_CS;
	tss->x86_tss.sp1 = sizeof(struct tss_struct) + (unsigned long) tss;
	wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
	wrmsr(MSR_IA32_SYSENTER_ESP, tss->x86_tss.sp1, 0);
	wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long) ia32_sysenter_target, 0);
	put_cpu();
}
It's trivial to rewrite this wrmsr(MSR_IA32_SYSENTER_ESP)
without the detour through x86_tss.sp1.
Apart from this, x86_tss.sp1 appears unused, which leaves me confused.
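To make "trivial to rewrite" concrete, here is a minimal userspace sketch of the two variants. Everything prefixed mock_ or MOCK_ is a stand-in invented for illustration (the struct is cut down, wrmsr is faked with an array, and the selector value is a placeholder); only the shape of the two functions mirrors the quoted enable_sep_cpu(). The SYSENTER_EIP write is left out because ia32_sysenter_target has no userspace equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR indices for the SYSENTER registers. */
#define MSR_IA32_SYSENTER_CS	0x174
#define MSR_IA32_SYSENTER_ESP	0x175

/* Placeholder selector; the real __KERNEL_CS comes from asm/segment.h. */
#define MOCK_KERNEL_CS		0x60

/* Cut-down stand-in for struct x86_hw_tss (hypothetical, userspace only). */
struct mock_tss {
	uintptr_t sp0;
	unsigned short ss0, ss0_pad;
	uintptr_t sp1;
	unsigned short ss1, ss1_pad;
};

/* Fake MSR file: slot 0 is SYSENTER_CS, slot 1 is SYSENTER_ESP. */
static uintptr_t mock_msr[2];

/* Stand-in for wrmsr(msr, lo, hi): records lo, ignores hi. */
static void mock_wrmsr(unsigned int msr, uintptr_t lo, uintptr_t hi)
{
	assert(msr == MSR_IA32_SYSENTER_CS || msr == MSR_IA32_SYSENTER_ESP);
	(void)hi;
	mock_msr[msr - MSR_IA32_SYSENTER_CS] = lo;
}

/* Read back what mock_wrmsr() recorded. */
static uintptr_t mock_rdmsr(unsigned int msr)
{
	assert(msr == MSR_IA32_SYSENTER_CS || msr == MSR_IA32_SYSENTER_ESP);
	return mock_msr[msr - MSR_IA32_SYSENTER_CS];
}

/* Shape of the quoted code: the stack top takes a detour through sp1. */
static void enable_sep_via_sp1(struct mock_tss *tss)
{
	tss->ss1 = MOCK_KERNEL_CS;
	tss->sp1 = (uintptr_t)tss + sizeof(*tss);
	mock_wrmsr(MSR_IA32_SYSENTER_CS, MOCK_KERNEL_CS, 0);
	mock_wrmsr(MSR_IA32_SYSENTER_ESP, tss->sp1, 0);
}

/* Same MSR writes, with the stack top held in a local; sp1 is untouched. */
static void enable_sep_direct(struct mock_tss *tss)
{
	uintptr_t stack_top = (uintptr_t)tss + sizeof(*tss);

	tss->ss1 = MOCK_KERNEL_CS;
	mock_wrmsr(MSR_IA32_SYSENTER_CS, MOCK_KERNEL_CS, 0);
	mock_wrmsr(MSR_IA32_SYSENTER_ESP, stack_top, 0);
}
```

Both variants leave the same values in the fake MSRs. They are interchangeable only if nothing else reads sp1, which is what the grep output suggests but does not prove.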
.ss1 fares little better. Apart from one comparison against thread->sysenter_cs, it is only ever written:
$ grep -r '[.>]ss1' .
./include/asm/processor.h: if (unlikely(tss->x86_tss.ss1 != thread->sysenter_cs)) {
./include/asm/processor.h: tss->x86_tss.ss1 = thread->sysenter_cs;
./include/asm/processor.h: .ss1 = __KERNEL_CS, \
./kernel/cpu/common.c: tss->x86_tss.ss1 = __KERNEL_CS;
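The two processor.h hits read like a compare-then-write pattern: the field remembers the last value, and the write is skipped when nothing changed. A minimal userspace sketch of that pattern follows. The struct names, counting_wrmsr() and sync_sysenter_cs() are made up for illustration; I am assuming, without having checked the surrounding code, that the real assignment is paired with a wrmsr of MSR_IA32_SYSENTER_CS.

```c
#include <assert.h>

/* Architectural MSR index for IA32_SYSENTER_CS. */
#define MSR_IA32_SYSENTER_CS	0x174

/* Hypothetical cut-down structures; only the fields the pattern needs. */
struct cached_tss {
	unsigned short ss1;		/* last value written to the MSR */
};

struct mock_thread {
	unsigned short sysenter_cs;	/* value this thread wants in the MSR */
};

static unsigned int wrmsr_calls;	/* how many "expensive" writes happened */
static unsigned short fake_sysenter_cs;	/* what the fake MSR currently holds */

/* Stand-in for wrmsr(msr, lo, hi): records lo and counts the call. */
static void counting_wrmsr(unsigned int msr, unsigned long lo, unsigned long hi)
{
	assert(msr == MSR_IA32_SYSENTER_CS);
	(void)hi;
	fake_sysenter_cs = (unsigned short)lo;
	wrmsr_calls++;
}

/* Write the MSR only when the cached copy in ss1 is stale. */
static void sync_sysenter_cs(struct cached_tss *tss,
			     const struct mock_thread *thread)
{
	if (tss->ss1 != thread->sysenter_cs) {
		tss->ss1 = thread->sysenter_cs;
		counting_wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0);
	}
}
```

Calling sync_sysenter_cs() repeatedly with threads that share the same sysenter_cs performs no writes at all; only a change of value costs one. If that is the intent, ss1 is read in exactly one place, the comparison.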
> + *
> + * We use SYSENTER_ESP to find sp0 and for the NMI emergency
> + * stack,
We use what? SYSENTER_ESP is an MSR, right? We don't use it (the MSR)
to find anything... I don't understand what you are saying here.
> + * but we need to context switch it because we do
> + * horrible things to the kernel stack in vm86 mode.
> + *
> + * We use SYSENTER_CS to disable sysenter in vm86 mode to avoid
> + * corrupting the stack if we went through the sysenter path
> + * from vm86 mode.
> + */
I'm confused about how loading ss1/sp1 with anything can disable sysenter.
The SYSENTER insn does not use those fields.
What you _can_ do is make it impossible to enter ring 1
by leaving tss.ss1 invalid.
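To spell out the distinction: per the SYSENTER pseudocode in the Intel SDM, the instruction raises #GP(0) when bits 15:2 of the IA32_SYSENTER_CS MSR are zero, and it never looks at the TSS. A toy model of just that one check follows. The struct and function names are invented for illustration, and the other fault conditions are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU state: the MSR SYSENTER consults, and the TSS field it does not. */
struct cpu_model {
	uint64_t sysenter_cs_msr;	/* IA32_SYSENTER_CS */
	uint16_t tss_ss1;		/* tss.ss1, a plain memory field */
};

/*
 * Returns true when SYSENTER would raise #GP(0) because
 * IA32_SYSENTER_CS[15:2] is zero. tss_ss1 is deliberately never read.
 */
static bool sysenter_faults(const struct cpu_model *cpu)
{
	return (cpu->sysenter_cs_msr & 0xFFFC) == 0;
}

/* Small self-check showing that only the MSR matters in this model. */
static void sysenter_model_selfcheck(void)
{
	struct cpu_model cpu = { .sysenter_cs_msr = 0x60, .tss_ss1 = 0 };

	assert(!sysenter_faults(&cpu));	/* valid MSR, zero ss1: no fault */

	cpu.sysenter_cs_msr = 0;
	cpu.tss_ss1 = 0x60;
	assert(sysenter_faults(&cpu));	/* zero MSR, valid ss1: fault */
}
```

So in this model, writing zero to tss.ss1 alone changes nothing for SYSENTER; something has to write the MSR as well.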
> + unsigned long sp1; /* MSR_IA32_SYSENTER_ESP */
> + unsigned short ss1; /* MSR_IA32_SYSENTER_CS */
The comments on the right don't explain anything (to me, at least).
Sorry for sounding negative.