linux-kernel - Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <87r0lry3bs.fsf@jcompost-mobl.amr.corp.intel.com>
Date:   Wed, 18 Oct 2023 15:26:47 -0700
From:   "Compostella, Jeremy" <jeremy.compostella@...el.com>
To:     Adam Dunlap <acdunlap@...gle.com>
Cc:     Ingo Molnar <mingo@...nel.org>, <kirill.shutemov@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        Felix Held <felix-coreboot@...ixheld.de>
Subject: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the
 correct value straight away, instead of a two-phase approach

Hi,

On both AMD and Intel platform, when memory encryption is enabled (TME
on Intel, SME or SVE on AMD), the number of physical address bits
should be lowered. Both AMD code (arch/x86/kernel/cpu/amd.c) and Intel
code (arch/x86/kernel/cpu/intel.c) support this.

I recently noticed though that Intel code is not lowering the number
of physical address bits as part of the early cpu initialization
(c_early_init) and this is leading to MTRRs sanity check failure in
generic_get_mtrr() with the following logs.

mtrr: your BIOS has configured an incorrect mask, fixing it.
mtrr: your BIOS has configured an incorrect mask, fixing it.
[...]

I have been working on fixing this following a similar approach to
what AMD code does: lower the number of physical address bits at early
initialization.
- AMD: early_init_amd() -> detect_tme()  -> c->x86_phys_bits -= [...]
- Intel: early_init_intel() -> early_detect_mem_encrypt() -> c->x86_phys_bits -= [...]

I posted the patch on the LKML (cf. <https://lore.kernel.org/lkml/65d26d679843e26fd5e6252a08391f87243a49c9.camel@intel.com/T/>)

It works just fine on v6.6-rc6. However, this morning Kirill brought
up commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct
value straight away, instead of a two-phase approach") on the tip
branch to my attention and I believe it should break the AMD early
flow and is breaking the patch I submitted on my local tests.

This commit moves the get_cpu_address_sizes() call after
the this_cpu->c_early_init() call.

@@ -1601,7 +1607,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
		 cpu_detect(c);
		 get_cpu_vendor(c);
		 get_cpu_cap(c);
- get_cpu_address_sizes(c);
  setup_force_cpu_cap(X86_FEATURE_CPUID);
  cpu_parse_early_param();

@@ -1617,6 +1622,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
		 setup_clear_cpu_cap(X86_FEATURE_CPUID);
	 }

+ get_cpu_address_sizes(c);
+ setup_force_cpu_cap(X86_FEATURE_ALWAYS);

  cpu_set_bug_bits(c);

In the light of commit fbf6449f84bf I am wondering what is the right
approach to fix the regression for AMD and then fix the MTRR check for
Intel. Should we introduce a new cpu_dev callback to read the number
of reserved bits and take it into account in get_cpu_address_sizes() ?

Regards,

-- 
*Jeremy*
/One Emacs to rule them all/