[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <68c30b33acf05_5addd100e0@dwillia2-mobl4.notmuch>
Date: Thu, 11 Sep 2025 10:47:31 -0700
From: <dan.j.williams@...el.com>
To: Aneesh Kumar K.V <aneesh.kumar@...nel.org>, Arto Merilainen
<amerilainen@...dia.com>
CC: <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<aik@....com>, <lukas@...ner.de>, Samuel Ortiz <sameo@...osinc.com>, Xu Yilun
<yilun.xu@...ux.intel.com>, Jason Gunthorpe <jgg@...pe.ca>, Suzuki K Poulose
<Suzuki.Poulose@....com>, Steven Price <steven.price@....com>, "Catalin
Marinas" <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>, Will
Deacon <will@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
<kvmarm@...ts.linux.dev>, <linux-coco@...ts.linux.dev>
Subject: Re: [RFC PATCH v1 34/38] coco: guest: arm64: Validate mmio range
found in the interface report
Aneesh Kumar K.V wrote:
> Arto Merilainen <amerilainen@...dia.com> writes:
>
> > On 31.7.2025 14.39, Arto Merilainen wrote:
> >> On 28.7.2025 16.52, Aneesh Kumar K.V (Arm) wrote:
> >>
> >>> + for (int i = 0; i < interface_report->mmio_range_count; i++,
> >>> mmio_range++) {
> >>> +
> >>> + /*FIXME!! units in 4K size*/
> >>> + range_id = FIELD_GET(TSM_INTF_REPORT_MMIO_RANGE_ID,
> >>> mmio_range->range_attributes);
> >>> +
> >>> + /* no secure interrupts */
> >>> + if (msix_tbl_bar != -1 && range_id == msix_tbl_bar) {
> >>> + pr_info("Skipping misx table\n");
> >>> + continue;
> >>> + }
> >>> +
> >>> + if (msix_pba_bar != -1 && range_id == msix_pba_bar) {
> >>> + pr_info("Skipping misx pba\n");
> >>> + continue;
> >>> + }
> >>> +
> >>
> >>
> >> MSI-X and PBA can be placed to a BAR that has other registers as well.
> >> While the PCIe specification recommends BAR-level isolation for MSI-X
> >> structures, it is not mandated. It is enough to have sufficient
> >> isolation within the BAR. Therefore, skipping the MSI-X and PBA BARs
> >> altogether may leave registers unintentionally mapped via unprotected
> >> IPA when they should have been mapped via protected IPA.
> >>
> >> Instead of skipping the whole BAR, would it make sense to determine
> >> where the MSI-X related regions reside, and skip validation only from
> >> these regions?
> >
> > I re-reviewed my suggestion, and what I proposed here seems wrong.
> > However, I think there is a different more generic problem related to
> > the MSI-X table, PBA and non-TEE ranges.
> >
> > If a BAR is sparse (e.g., it has TEE pages and the MSI-X table, PBA or
> > non-TEE areas), the TDISP interface report may contain multiple ranges
> > with the same range_id (/BAR id). In case a BAR contains some registers
> > in low addresses, the MSI-X table and other registers after the MSI-X
> > table, the interface report is expected to have two ranges for the same
> > BAR with different "first 4k page" and "size" fields.
> >
> > This creates a tricky problem given that RSI_VDEV_VALIDATE_MAPPING
> > requires both the ipa_base and pa_base which should correspond to the
> > same location. In above scenario, the PA of the first range would
> > correspond to the BAR base whereas the second range would correspond to
> > a location residing after the MSI-X table.
> >
> > Assuming that the report contains obfuscated (but linear) physical
> > addresses, it would be possible to create heuristics for this case.
> > However, the fundamental problem is that none of the "first 4k page"
> > fields in the ranges is guaranteed to correspond to the base of any BAR:
> > Consider a case where the MSI-X table is in the beginning of a BAR and
> > it is followed by a single TEE range. If the MSI-X is not locked, the
> > "first 4k page" field will not correspond to the beginning of the BAR.
> > If the realm naiviely reads the ipa_base using pci_resouce_n() and
> > corresponding pa_base from the interface report, the addresses won't
> > match and the validation will fail.
> >
> > It seems that interpreting the interface report cannot be done without
> > knowledge of the device's register layout. Therefore, I don't think the
> > ranges can be validated/remapped automatically without involving the
> > device driver, but there should be APIs for reading the interface
> > report, and for requesting making specific ranges protected.
> >
>
> But we need to validate the interface report before accepting the device,
> and the device driver is only loaded after the device has been accepted.
>
> Can we assume that only the MSI-X table and PBA ranges may be missing
> from the interface report, while all other non-secure regions are
> reported as NON-TEE ranges?
>
> If so, we could retrieve the MSI-X guest real address details from
> config space and map the beginning of the BAR correctly.
>
> Dan / Yilun — how is this handled in Intel TDX?
>
> From what I can see, the AMD patches appear to encounter the same issue.
Same issue exists for TDX. In the near term this solidifies that the
PCI/TSM core should not be assumining anything with respect to marking
MMIO ranges as private, and leave that all the to low-level TSM driver.
...but then yes I expect we need to build some common infrastructure for
special casing MSIX.
Powered by blists - more mailing lists