[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <35CA4584-4CFD-4E47-9825-6438D9ED4ECC@zytor.com>
Date: Mon, 18 Oct 2021 12:10:20 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Borislav Petkov <bp@...en8.de>,
Jane Malalane <jane.malalane@...rix.com>
CC: LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Pu Wen <puwen@...on.cn>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Cooper <andrew.cooper3@...rix.com>,
Yazen Ghannam <Yazen.Ghannam@....com>,
Brijesh Singh <brijesh.singh@....com>,
Huang Rui <ray.huang@....com>,
Andy Lutomirski <luto@...nel.org>,
Kim Phillips <kim.phillips@....com>, stable@...r.kernel.org
Subject: Re: [PATCH v2] x86/cpu: Fix migration safety with X86_BUG_NULL_SEL
AFAIK no Intel CPU has ever had that behavior, and always cleared the segments; I don't Intel has any plans of supporting such a CPUID bit (although I'd certainly be willing to take such a request back to the CPU teams on request.)
That being said, this sounds like an ideal use for the hypervisor CPU feature flag. Maybe we should consider a migration hypervisor flag too to explicitly tell the kernel not to rely on hardware probing that breaks migration in general.
Now, with a CPUID but being introduced, the right thing would be to use the CPUID bit as a feature instead of using a bug flag, and add whitelisting in the vendor-specific code as applicable.
On October 18, 2021 11:17:30 AM PDT, Borislav Petkov <bp@...en8.de> wrote:
>On Wed, Oct 13, 2021 at 03:22:30PM +0100, Jane Malalane wrote:
>> @@ -650,6 +651,27 @@ static void early_init_amd(struct cpuinfo_x86 *c)
>> if (c->x86_power & BIT(14))
>> set_cpu_cap(c, X86_FEATURE_RAPL);
>>
>> + /*
>> + * Zen1 and earlier CPUs don't clear segment base/limits when
>> + * loading a NULL selector. This has been designated
>> + * X86_BUG_NULL_SEG.
>> + *
>> + * Zen3 CPUs advertise Null Selector Clears Base in CPUID.
>> + * Zen2 CPUs also have this behaviour, but no CPUID bit.
>> + *
>> + * A hypervisor may sythesize the bit, but may also hide it
>> + * for migration safety, so we must not probe for model
>> + * specific behaviour when virtualised.
>> + */
>> + if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
>> + nscb = true;
>> +
>> + if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !nscb && c->x86 == 0x17)
>> + nscb = check_null_seg_clears_base(c);
>> +
>> + if (!nscb)
>> + set_cpu_bug(c, X86_BUG_NULL_SEG);
>> +
>> #ifdef CONFIG_X86_64
>> set_cpu_cap(c, X86_FEATURE_SYSCALL32);
>> #else
>
>Can we do something like this?
>
>It is carved out into a separate function which you can simply call from
>early_init_amd() and early_init_hygon().
>
>I guess you can put that function in arch/x86/kernel/cpu/common.c or so.
>
>Then, you should put the comments right over the code like I've done
>below so that one can follow what's going on with each particular check.
>
>I've also flipped the logic a bit and it is simpler this way.
>
>Totally untested of course.
>
>static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c)
>{
> /*
> * A hypervisor may sythesize the bit, but may also hide it
> * for migration safety, so do not probe for model-specific
> * behaviour when virtualised.
> */
> if (cpu_has(c, X86_FEATURE_HYPERVISOR))
> return;
>
> /* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */
> if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6))
> return;
>
> /* Zen2 CPUs also have this behaviour, but no CPUID bit. */
> if (c->x86 == 0x17 && check_null_seg_clears_base(c))
> return;
>
> /* All the remaining ones are affected */
> set_cpu_bug(c, X86_BUG_NULL_SEG);
>}
>
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 0f8885949e8c..2ca4afb97247 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -1395,7 +1395,7 @@ void __init early_cpu_init(void)
>> early_identify_cpu(&boot_cpu_data);
>> }
>>
>> -static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> +bool check_null_seg_clears_base(struct cpuinfo_x86 *c)
>> {
>> #ifdef CONFIG_X86_64
>> /*
>> @@ -1418,10 +1418,10 @@ static void detect_null_seg_behavior(struct cpuinfo_x86 *c)
>> wrmsrl(MSR_FS_BASE, 1);
>> loadsegment(fs, 0);
>> rdmsrl(MSR_FS_BASE, tmp);
>> - if (tmp != 0)
>> - set_cpu_bug(c, X86_BUG_NULL_SEG);
>> wrmsrl(MSR_FS_BASE, old_base);
>> + return tmp == 0;
>> #endif
>> + return true;
>> }
>>
>> static void generic_identify(struct cpuinfo_x86 *c)
>> @@ -1457,8 +1457,6 @@ static void generic_identify(struct cpuinfo_x86 *c)
>>
>> get_model_name(c); /* Default name */
>>
>> - detect_null_seg_behavior(c);
>> -
>> /*
>> * ESPFIX is a strange bug. All real CPUs have it. Paravirt
>> * systems that run Linux at CPL > 0 may or may not have the
>
>So this function is called on all x86 CPUs. Are you sure others besides
>AMD and Hygon do not have the same issue?
>
>IOW, I wouldn't remove that call here.
>
>But then this is the identify() phase in the boot process and you've
>moved it to early_identify() by putting it in the ->c_early_init()
>function pointer on AMD and Hygon.
>
>Is there any particular reasoning for this or can that detection remain
>in ->c_identify()?
>
>Because if this null seg behavior detection should happen on all
>CPUs - and I think it should, because, well, it has been that way
>until now - then the vendor specific identification minus what
>detect_null_seg_behavior() does should run first and then after
>->c_identify() is done, you should do something like:
>
> if (!cpu_has_bug(c, X86_BUG_NULL_SEG)) {
> if (!check_null_seg_clears_base(c))
> set_cpu_bug(c, X86_BUG_NULL_SEG);
> }
>
>so that it still takes place on all CPUs.
>
>I.e., you should split the detection.
>
>I hope I'm making sense ...
>
>Ah, btw, that @c parameter to detect_null_seg_behavior() is unused - you
>should remove it in a pre-patch.
>
>Thx.
>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Powered by blists - more mailing lists