lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <68c30b33acf05_5addd100e0@dwillia2-mobl4.notmuch>
Date: Thu, 11 Sep 2025 10:47:31 -0700
From: <dan.j.williams@...el.com>
To: Aneesh Kumar K.V <aneesh.kumar@...nel.org>, Arto Merilainen
	<amerilainen@...dia.com>
CC: <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<aik@....com>, <lukas@...ner.de>, Samuel Ortiz <sameo@...osinc.com>, Xu Yilun
	<yilun.xu@...ux.intel.com>, Jason Gunthorpe <jgg@...pe.ca>, Suzuki K Poulose
	<Suzuki.Poulose@....com>, Steven Price <steven.price@....com>, "Catalin
 Marinas" <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>, Will
 Deacon <will@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
	<kvmarm@...ts.linux.dev>, <linux-coco@...ts.linux.dev>
Subject: Re: [RFC PATCH v1 34/38] coco: guest: arm64: Validate mmio range
 found in the interface report

Aneesh Kumar K.V wrote:
> Arto Merilainen <amerilainen@...dia.com> writes:
> 
> > On 31.7.2025 14.39, Arto Merilainen wrote:
> >> On 28.7.2025 16.52, Aneesh Kumar K.V (Arm) wrote:
> >> 
> >>> +    for (int i = 0; i < interface_report->mmio_range_count; i++, 
> >>> mmio_range++) {
> >>> +
> >>> +        /*FIXME!! units in 4K size*/
> >>> +        range_id = FIELD_GET(TSM_INTF_REPORT_MMIO_RANGE_ID, 
> >>> mmio_range->range_attributes);
> >>> +
> >>> +        /* no secure interrupts */
> >>> +        if (msix_tbl_bar != -1 && range_id == msix_tbl_bar) {
> >>> +            pr_info("Skipping misx table\n");
> >>> +            continue;
> >>> +        }
> >>> +
> >>> +        if (msix_pba_bar != -1 && range_id == msix_pba_bar) {
> >>> +            pr_info("Skipping misx pba\n");
> >>> +            continue;
> >>> +        }
> >>> +
> >> 
> >> 
> >> MSI-X and PBA can be placed to a BAR that has other registers as well. 
> >> While the PCIe specification recommends BAR-level isolation for MSI-X 
> >> structures, it is not mandated. It is enough to have sufficient 
> >> isolation within the BAR. Therefore, skipping the MSI-X and PBA BARs 
> >> altogether may leave registers unintentionally mapped via unprotected 
> >> IPA when they should have been mapped via protected IPA.
> >> 
> >> Instead of skipping the whole BAR, would it make sense to determine
> >> where the MSI-X related regions reside, and skip validation only from 
> >> these regions?
> >
> > I re-reviewed my suggestion, and what I proposed here seems wrong. 
> > However, I think there is a different more generic problem related to 
> > the MSI-X table, PBA and non-TEE ranges.
> >
> > If a BAR is sparse (e.g., it has TEE pages and the MSI-X table, PBA or 
> > non-TEE areas), the TDISP interface report may contain multiple ranges 
> > with the same range_id (/BAR id). In case a BAR contains some registers 
> > in low addresses, the MSI-X table and other registers after the MSI-X 
> > table, the interface report is expected to have two ranges for the same 
> > BAR with different "first 4k page" and "size" fields.
> >
> > This creates a tricky problem given that RSI_VDEV_VALIDATE_MAPPING 
> > requires both the ipa_base and pa_base which should correspond to the 
> > same location. In above scenario, the PA of the first range would 
> > correspond to the BAR base whereas the second range would correspond to 
> > a location residing after the MSI-X table.
> >
> > Assuming that the report contains obfuscated (but linear) physical 
> > addresses, it would be possible to create heuristics for this case. 
> > However, the fundamental problem is that none of the "first 4k page" 
> > fields in the ranges is guaranteed to correspond to the base of any BAR: 
> > Consider a case where the MSI-X table is in the beginning of a BAR and 
> > it is followed by a single TEE range. If the MSI-X is not locked, the 
> > "first 4k page" field will not correspond to the beginning of the BAR. 
> > If the realm naiviely reads the ipa_base using pci_resouce_n() and 
> > corresponding pa_base from the interface report, the addresses won't 
> > match and the validation will fail.
> >
> > It seems that interpreting the interface report cannot be done without 
> > knowledge of the device's register layout. Therefore, I don't think the 
> > ranges can be validated/remapped automatically without involving the 
> > device driver, but there should be APIs for reading the interface 
> > report, and for requesting making specific ranges protected.
> >
> 
> But we need to validate the interface report before accepting the device,
> and the device driver is only loaded after the device has been accepted.
> 
> Can we assume that only the MSI-X table and PBA ranges may be missing
> from the interface report, while all other non-secure regions are
> reported as NON-TEE ranges?
> 
> If so, we could retrieve the MSI-X guest real address details from
> config space and map the beginning of the BAR correctly.
> 
> Dan / Yilun — how is this handled in Intel TDX?
> 
> From what I can see, the AMD patches appear to encounter the same issue.

Same issue exists for TDX. In the near term this solidifies that the
PCI/TSM core should not be assumining anything with respect to marking
MMIO ranges as private, and leave that all the to low-level TSM driver.

...but then yes I expect we need to build some common infrastructure for
special casing MSIX.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ