Message-ID: <69727e7ded712_3095100ab@dwillia2-mobl4.notmuch>
Date: Thu, 22 Jan 2026 11:46:05 -0800
From: <dan.j.williams@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>, <dan.j.williams@...el.com>
CC: Jonathan Cameron <jonathan.cameron@...wei.com>, "Tian, Kevin"
<kevin.tian@...el.com>, Nicolin Chen <nicolinc@...dia.com>, "will@...nel.org"
<will@...nel.org>, "robin.murphy@....com" <robin.murphy@....com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "joro@...tes.org"
<joro@...tes.org>, "praan@...gle.com" <praan@...gle.com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"miko.lenczewski@....com" <miko.lenczewski@....com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, <linux-cxl@...r.kernel.org>
Subject: Re: [PATCH RFCv1 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices

Jason Gunthorpe wrote:
> On Wed, Jan 21, 2026 at 09:44:32PM -0800, dan.j.williams@...el.com wrote:
> > Jason Gunthorpe wrote:
> > > On Wed, Jan 21, 2026 at 10:03:07AM +0000, Jonathan Cameron wrote:
> > > > On Wed, 21 Jan 2026 08:01:36 +0000
> > > > "Tian, Kevin" <kevin.tian@...el.com> wrote:
> > > >
> > > > > +Dan. I recalled an offline discussion in which he raised concern on
> > > > > having the kernel blindly enable ATS for cxl.cache device instead of
> > > > > creating a knob for admin to configure from userspace (in case
> > > > > security is viewed more important than functionality, upon allowing
> > > > > DMA to read data out of CPU caches)...
> > > > >
> > > >
> > > > +CC Linux-cxl
> > >
> > > A cxl.cache device supporting ATS will automatically enable ATS today
> > > if the kernel option to enable translation is set.
> > >
> > > Even if the device is marked untrusted by the PCI layer (eg an
> > > external port).
> > >
> > > Yes this is effectively a security issue, but it is not really a CXL
> > > specific problem.
> >
> > My contention is that it is a worse or at least different problem in the
> > CXL case because now you have a new toolkit in an attack that wants to
> > exfiltrate data from CPU caches.
>
> ?? I don't see CXL as meaningfully different than PCI in terms of what
> data can be accessed with Translated requests. If the IOMMU doesn't
> block Translated requests the whole system is open. CXL doesn't make
> it more open.

Right, the game is mostly over in the current case, but CXL.cache still
deserves to be treated carefully. Consider a world where we do have limitations
against requests to HPAs that were never translated for the device. In that
scenario the device can help side channel the contents of HPAs it does not
otherwise have access to by messing with aliased lines it does have access to.

At a minimum CXL.cache is not improving the security story, so no time like the
present to put a policy mechanism in place that improves upon the PCI untrusted
flag.

> > "We have a less than perfect legacy way (PCI untrusted flag) to nod at
> > ATS security problems. Let us ignore even that for a new class of
> > devices that advertise they can trigger all the old security problems
> > plus new ones."
>
> Ah, I missed that we are already force disabling ATS in this untrusted
> case, so we should ensure that continues to be the case here
> too. Nicolin does it need a change?
>
> > I do not immediately see what is wrong with requiring userspace policy
> > opt-in. That naturally gets replaced by installing the device's
> > certificate (for native PCI CMA), authenticating the device with the
> > TSM (for PCI IDE), or obviated by secure-ATS if that arrives.
>
> I think that goes back to the discussion about not loading drivers
> before validating the device.
>
> It would also make a lot of sense to leave the IOMMU blocking until the
> driver is loaded for these secure situations. The blocking translation
> should block ATS too.
>
> Then the flow you are describing will work well:
>
> 1) At pre-boot the IOMMU will block all DMA including Translated.
> 2) The OS activates the IOMMU driver and keeps blocking.
> 3) Instead of immediately binding a default domain the IOMMU core
>    leaves the translation blocking.
> 4) The OS defers loading the driver to userspace.
> 5) Userspace measures the device and "accepts" it by loading the
>    driver
> 6) IOMMU core attaches a non-blocking default domain and activates ATS

That works for me. Give the paranoid the ability to have a point where they can
be assured that the shields were not lowered prematurely.
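
To make the ordering in the six steps above concrete, here is a toy model of
the flow. It is only a sketch under stated assumptions: Device, Domain,
userspace_accept(), bind_driver() and dma_allowed() are hypothetical names
made up for illustration, not kernel or IOMMU core interfaces, and the model
assumes the PCI untrusted flag keeps ATS off even after acceptance.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Domain(Enum):
    BLOCKING = auto()  # all DMA rejected, Translated requests included
    DEFAULT = auto()   # normal translation once a driver is bound


@dataclass
class Device:
    # Hypothetical model of a device, not a kernel structure.
    name: str
    untrusted: bool = False           # stands in for the PCI untrusted flag
    accepted: bool = False            # set by userspace after measurement
    domain: Domain = Domain.BLOCKING  # steps 1-3: blocking is the default
    ats_enabled: bool = False
    driver_bound: bool = False


def userspace_accept(dev: Device, measurement_ok: bool) -> None:
    """Step 5: userspace measures the device and records its verdict."""
    dev.accepted = measurement_ok


def bind_driver(dev: Device) -> bool:
    """Steps 4 and 6: bind only after acceptance, then leave blocking."""
    if not dev.accepted:
        return False  # keep blocking: no driver, no default domain, no ATS
    dev.driver_bound = True
    dev.domain = Domain.DEFAULT
    # Assumption: an untrusted device never gets ATS, accepted or not.
    dev.ats_enabled = not dev.untrusted
    return True


def dma_allowed(dev: Device, translated: bool) -> bool:
    """Blocking rejects everything; Translated requests also need ATS on."""
    if dev.domain is Domain.BLOCKING:
        return False
    return dev.ats_enabled if translated else True
```

The point of the model is that there is exactly one place (bind_driver) where
the blocking domain is replaced, and it cannot be reached without the
userspace verdict, which is the "shields were not lowered prematurely"
property.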