lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8137d98e-8825-415b-9282-1d2a115bb51a@linux.intel.com>
Date: Tue, 15 Jul 2025 10:53:06 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>,
 Nikolay Borisov <nik.borisov@...e.com>, Jianxiong Gao <jxgao@...gle.com>
Cc: "Borislav Petkov (AMD)" <bp@...en8.de>,
 Dave Hansen <dave.hansen@...ux.intel.com>,
 Dionna Glaze <dionnaglaze@...gle.com>, "H. Peter Anvin" <hpa@...or.com>,
 jgross@...e.com, "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Ingo Molnar <mingo@...hat.com>, pbonzini@...hat.com,
 Peter Gonda <pgonda@...gle.com>, Thomas Gleixner <tglx@...utronix.de>,
 Tom Lendacky <thomas.lendacky@....com>,
 Vitaly Kuznetsov <vkuznets@...hat.com>, x86@...nel.org,
 Rick Edgecombe <rick.p.edgecombe@...el.com>, kas@...nel.org,
 "Xu, Min M" <min.m.xu@...el.com>, Binbin Wu <binbin.wu@...ux.intel.com>
Subject: Re: [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX



On 7/14/2025 7:24 PM, Nikolay Borisov wrote:
>
>
> On 14.07.25 г. 12:06 ч., Binbin Wu wrote:
>>
>>
>> On 7/10/2025 12:54 AM, Jianxiong Gao wrote:
>>> I tested this patch on top of commit 8e690b817e38, however we are
>>> still experiencing the same failure.
>>>
>> I didn't reproduce the issue with QEMU.
>> After some comparison on how QEMU building the ACPI tables for HPET and TPM,
>>
>> - For HPET, the HPET range is added as Operation Region:
>>      aml_append(dev,
>>          aml_operation_region("HPTM", AML_SYSTEM_MEMORY, aml_int(HPET_BASE),
>>                               HPET_LEN));
>>
>> - For TPM, the range is added as 32-Bit Fixed Memory Range:
>>      if (TPM_IS_TIS_ISA(tpm_find())) {
>>          aml_append(crs, aml_memory32_fixed(TPM_TIS_ADDR_BASE,
>>                     TPM_TIS_ADDR_SIZE, AML_READ_WRITE));
>>      }
>>
>> So, in KVM, the code patch of TPM is different from the trace for HPET in the
>> patch https://lore.kernel.org/kvm/20250201005048.657470-3-seanjc@google.com/,
>> HPET will trigger the code path acpi_os_map_iomem(), but TPM doesn't.
>>
>> I tried to hack the code to map the region to WB first in tpm_tis driver to
>> trigger the error.
>> diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
>> index 9aa230a63616..62d303f88041 100644
>> --- a/drivers/char/tpm/tpm_tis.c
>> +++ b/drivers/char/tpm/tpm_tis.c
>> @@ -232,6 +232,7 @@ static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info)
>>          if (phy == NULL)
>>                  return -ENOMEM;
>>
>> +       ioremap_cache(tpm_info->res.start, resource_size(&tpm_info->res));
>>          phy->iobase = devm_ioremap_resource(dev, &tpm_info->res);
>>          if (IS_ERR(phy->iobase))
>>                  return PTR_ERR(phy->iobase);
>> Then I got the same error
>> [ 4.606075] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0
>> [ 4.607728] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12
>
>
> The thing is we don't really want to get into the if (pcm != new_pcm) { branch, because even if it succeeds there then the mapping will be wrong, because we want accesses to the TPM to be uncached since that's an iomem region, whereas this error shows that the new_pcm is WB.
>
> Also looking at memtype_reserve in it there is the following piece of code:
>
> if (x86_platform.is_untracked_pat_range(start, end)) {
>      7                 if (new_type)
>      6                         *new_type = _PAGE_CACHE_MODE_WB;
>      5                 return 0;
>      4         }
>
>
> So if is_untracked_pat_range returns true then the cache mode will always be WB.
So there are two different things per my understanding:
1. The patch set https://lore.kernel.org/all/20250201005048.657470-3-seanjc@google.com/
     applied to "Guest" kernel should be able to dismiss the error message.
     In __ioremap_caller(), when branch "pcm != new_pcm" is triggered, since
     is_new_memtype_allowed() returns true for TPM range, the error message
     shouldn NOT be printed.
     Otherwise, it really confuses me.

2. Setting the PAT to WB for iomem region of TPM is not desired the target type.
    Per my understanding, it should be safe to program the "iomem" to WB.
    - If the iomem region is emulated, it's OK since the accesses will be
      trapped.
    - If the iomem region is passed-through, the EPT will use UC and the
      effective memory type will be UC.


Another solution is using MTRR MSRs as the communication channel b/t guest BIOS
and guest kernel, probably need to do the following things:
For guest BIOS (i.e, OVMF), allow it to program the MTRR MSRs
- Revert 071d2cfab8 ("OvmfPkg/Sec: Skip setup MTRR early in TD-Guest")
- Revert 3a3b12cbda ("UefiCpuPkg/MtrrLib: MtrrLibIsMtrrSupported always return
   FALSE in TD-Guest")
- Make sure OVMF doesn't toggle CR0.CD, which will trigger #VE.
For guest linux kernel
- Revert 8e690b817e38 ("x86/kvm: Override default caching mode for SEV-SNP and
   TDX")
- Make sure guest kernel doesn't toggle CR0.CD, which will trigger #VE.

Sean, do you think this solution is the direction to go?

>
>
>>
>> And with Sean's patch set, the issue can be resolved.
>>
>> I guess google's VMM has built different ACPI table for TPM.
>> But according to my experiment, the issue should be able to be fixed by this
>> patch set, though I am not sure whether it will be the final solution or not.
>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ