Message-ID: <550a730d-07db-46d7-ac1a-b5b7a09042a6@linux.intel.com>
Date: Thu, 24 Jul 2025 11:16:09 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>,
 Nikolay Borisov <nik.borisov@...e.com>
Cc: Jianxiong Gao <jxgao@...gle.com>, "Borislav Petkov (AMD)" <bp@...en8.de>,
 Dave Hansen <dave.hansen@...ux.intel.com>,
 Dionna Glaze <dionnaglaze@...gle.com>, "H. Peter Anvin" <hpa@...or.com>,
 jgross@...e.com, "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Ingo Molnar <mingo@...hat.com>, pbonzini@...hat.com,
 Peter Gonda <pgonda@...gle.com>, Thomas Gleixner <tglx@...utronix.de>,
 Tom Lendacky <thomas.lendacky@....com>,
 Vitaly Kuznetsov <vkuznets@...hat.com>, x86@...nel.org,
 Rick Edgecombe <rick.p.edgecombe@...el.com>,
 Binbin Wu <binbin.wu@...ux.intel.com>
Subject: Re: [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX



On 7/23/2025 10:34 PM, Sean Christopherson wrote:
> On Mon, Jul 14, 2025, Nikolay Borisov wrote:
>> On 14.07.25 г. 12:06 ч., Binbin Wu wrote:
>>> On 7/10/2025 12:54 AM, Jianxiong Gao wrote:
>>>> I tested this patch on top of commit 8e690b817e38; however, we are
>>>> still experiencing the same failure.
>>>>
>>> I didn't reproduce the issue with QEMU.
>>> After comparing how QEMU builds the ACPI tables for HPET and
>>> TPM:
>>>
>>> - For HPET, the HPET range is added as Operation Region:
>>>       aml_append(dev,
>>>           aml_operation_region("HPTM", AML_SYSTEM_MEMORY,
>>> aml_int(HPET_BASE),
>>>                                HPET_LEN));
>>>
>>> - For TPM, the range is added as 32-Bit Fixed Memory Range:
>>>       if (TPM_IS_TIS_ISA(tpm_find())) {
>>>           aml_append(crs, aml_memory32_fixed(TPM_TIS_ADDR_BASE,
>>>                      TPM_TIS_ADDR_SIZE, AML_READ_WRITE));
>>>       }
>>>
>>> So, in KVM, the code path for TPM is different from the trace for HPET in
>>> the patch https://lore.kernel.org/kvm/20250201005048.657470-3-seanjc@google.com/:
>>> HPET triggers the code path through acpi_os_map_iomem(), but TPM doesn't.
> Argh, I was looking at the wrong TPM resource when poking through QEMU.  I peeked
> at TPM_PPI_ADDR_BASE, which gets an AML_SYSTEM_MEMORY entry, not TPM_TIS_ADDR_BASE.
>
> *sigh*
>
> Note, the HPET is also enumerated as a fixed resource:
IIUC, the key differences between aml_memory32_fixed and aml_operation_region are:
- aml_memory32_fixed is a resource descriptor that tells the OS about a device's
   address range for configuration or driver purposes.
   It is not meant for runtime access by ACPI code; it simply reports the memory
   region to the OS via ACPI resource methods like _CRS.
- aml_operation_region actually defines how AML code can access hardware or
   memory regions at runtime. Fields, which can be declared within an operation
   region, are physically mapped to addresses that the AML interpreter reads or
   writes during method execution.
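In ASL terms, the two constructs decompile to roughly the following (a hand-written sketch, not QEMU's exact iasl output; the addresses are QEMU's usual TPM_TIS_ADDR_BASE/SIZE and HPET_BASE/LEN values and should be treated as assumptions):

```asl
// Resource descriptor: reported to the OS via _CRS,
// never dereferenced by the AML interpreter itself
Name (_CRS, ResourceTemplate ()
{
    Memory32Fixed (ReadWrite, 0xFED40000, 0x00005000)  // TPM_TIS_ADDR_BASE/SIZE
})

// Operation region: fields declared here are read/written by AML
// methods at runtime, which forces the interpreter to map the range
OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x400)  // HPET_BASE/LEN
Field (HPTM, DWordAcc, Lock, Preserve)
{
    VEND,   32,
    PRD,    32
}
```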

For HPET, the _STA method accesses the "VEND" and "PRD" fields within the
region. So when HPET is probed in acpi_init(), _STA is called, and the region
must be mapped before it can be accessed, which triggers ioremap_cache().
(See the call chain below.)

>
>      crs = aml_resource_template();
>      aml_append(crs, aml_memory32_fixed(HPET_BASE, HPET_LEN, AML_READ_ONLY));
>      aml_append(dev, aml_name_decl("_CRS", crs));
>
> If I comment out the AML_SYSTEM_MEMORY entry for HPET, the kernel's auto-mapping
> does NOT kick in (the kernel complains about required resources being missing,
> but that's expected).  So I'm pretty sure it's the _lack_ of an AML_SYSTEM_MEMORY
> entry for TPM TIS in QEMU's ACPI tables that makes everything happy

Adding an AML_SYSTEM_MEMORY entry as an operation region alone is not enough;
an ACPI method also needs to access a field within the region during the ACPI
device probe.
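QEMU's HPET _STA follows exactly that pattern; a condensed ASL sketch (field names from the mail above, the method body simplified, and the sanity check hedged rather than QEMU's exact test):

```asl
Method (_STA, 0, NotSerialized)
{
    // Reading VEND dereferences the SystemMemory operation region,
    // so the AML interpreter must map it (acpi_os_map_memory() below)
    Local0 = VEND
    If ((Local0 == 0xFFFFFFFF))  // hypothetical "device absent" check
    {
        Return (Zero)
    }
    Return (0x0F)
}
```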

For HPET, I checked that the call chain is as follows:
acpi_init
   ...
     acpi_bus_get_status
       acpi_bus_get_status_handle
         acpi_evaluate_integer
           acpi_evaluate_object
             acpi_ns_evaluate
               [ACPI_TYPE_METHOD] --> *acpi_ps_execute_method*
                 acpi_ps_parse_aml
                   acpi_ps_parse_loop
                     walk_state->ascending_callback -> acpi_ds_exec_end_op
                       acpi_ds_evaluate_name_path
                         acpi_ex_resolve_to_value
                           acpi_ex_resolve_node_to_value
                             acpi_ex_read_data_from_field
                               acpi_ex_extract_from_field
                                 acpi_ex_field_datum_io
                                   *acpi_ex_access_region*
                                     acpi_ev_address_space_dispatch
                                        acpi_ex_system_memory_space_handler
                                         acpi_os_map_memory
                                           acpi_os_map_iomem
                                             ioremap_cache

>
> I can't for the life of me suss out exactly what Google's ACPI tables will look
> like.  I'll follow-up internally to try and get an answer on that front.

I guess Google has defined an ACPI method that accesses the region for TPM TIS
during ACPI device probe.

>
> In the meantime, can someone who has reproduced the real issue get backtraces to
> confirm or disprove that acpi_os_map_iomem() is trying to map the TPM TIS range
> as WB?  E.g. with something like so:

I added an AML_SYSTEM_MEMORY entry as an operation region to the ACPI table
and modified the _STA method to access that region for TPM TIS in QEMU; with
that change, the issue can be reproduced.
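Decompiled with iasl, the patched TPM device should look roughly like this sketch (constants expanded to QEMU's usual TPM_TIS_ADDR_BASE = 0xFED40000, TPM_TIS_ADDR_SIZE = 0x5000; treat the exact values as assumptions):

```asl
OperationRegion (TPMM, SystemMemory, 0xFED40000, 0x5000)
Field (TPMM, DWordAcc, Lock, Preserve)
{
    TEST,   32
}

Method (_STA, 0, NotSerialized)
{
    Local0 = TEST    // field read forces mapping of the TPM TIS range at probe
    Return (0x0F)
}
```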

diff --git a/hw/tpm/tpm_tis_isa.c b/hw/tpm/tpm_tis_isa.c
index 876cb02cb5..aca2b2993f 100644
--- a/hw/tpm/tpm_tis_isa.c
+++ b/hw/tpm/tpm_tis_isa.c
@@ -143,6 +143,9 @@ static void build_tpm_tis_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
      Aml *dev, *crs;
      TPMStateISA *isadev = TPM_TIS_ISA(adev);
      TPMIf *ti = TPM_IF(isadev);
+    Aml *field;
+    Aml *method;
+    Aml *test = aml_local(0);

      dev = aml_device("TPM");
      if (tpm_tis_isa_get_tpm_version(ti) == TPM_VERSION_2_0) {
@@ -152,7 +155,19 @@ static void build_tpm_tis_isa_aml(AcpiDevAmlIf *adev, Aml *scope)
          aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C31")));
      }
      aml_append(dev, aml_name_decl("_UID", aml_int(1)));
-    aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+
+    aml_append(dev, aml_operation_region("TPMM", AML_SYSTEM_MEMORY, aml_int(TPM_TIS_ADDR_BASE),
+                         TPM_TIS_ADDR_SIZE));
+
+    field = aml_field("TPMM", AML_DWORD_ACC, AML_LOCK, AML_PRESERVE);
+    aml_append(field, aml_named_field("TEST", 32));
+    aml_append(dev, field);
+
+    method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+    aml_append(method, aml_store(aml_name("TEST"), test));
+    aml_append(method, aml_return(aml_int(0xF)));
+    aml_append(dev, method);

>
> diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> index 2e7923844afe..6c3c40909ef9 100644
> --- a/arch/x86/mm/pat/memtype.c
> +++ b/arch/x86/mm/pat/memtype.c
> @@ -528,6 +528,9 @@ int memtype_reserve(u64 start, u64 end, enum page_cache_mode req_type,
>   
>          start = sanitize_phys(start);
>   
> +       WARN(start == 0xFED40000,
> +            "Mapping TPM TIS with req_type = %u\n", req_type);
> +
>          /*
>           * The end address passed into this function is exclusive, but
>           * sanitize_phys() expects an inclusive address.
> ---

