[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aPgT1u1YO3C3YozC@lstrano-desk.jf.intel.com>
Date: Tue, 21 Oct 2025 16:14:30 -0700
From: Matthew Brost <matthew.brost@...el.com>
To: Jason Gunthorpe <jgg@...pe.ca>
CC: Michał Winiarski <michal.winiarski@...el.com>, "Alex
Williamson" <alex.williamson@...hat.com>, Lucas De Marchi
<lucas.demarchi@...el.com>, Thomas Hellström
<thomas.hellstrom@...ux.intel.com>, Rodrigo Vivi <rodrigo.vivi@...el.com>,
Yishai Hadas <yishaih@...dia.com>, Kevin Tian <kevin.tian@...el.com>,
"Shameer Kolothum" <shameerali.kolothum.thodi@...wei.com>,
<intel-xe@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>,
<kvm@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>, Michal Wajdeczko
<michal.wajdeczko@...el.com>, Jani Nikula <jani.nikula@...ux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>, Tvrtko Ursulin
<tursulin@...ulin.net>, David Airlie <airlied@...il.com>, Simona Vetter
<simona@...ll.ch>, Lukasz Laguna <lukasz.laguna@...el.com>
Subject: Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for
Intel graphics
On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > + /*
> > + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > + * selectively block p2p DMA transfers.
> > + * The device is not processing new workload requests when the VF is stopped, and both
> > + * memory and MMIO communication channels are transferred to destination (where processing
> > + * will be resumed).
> > + */
> > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
>
> This comment is not right, RUNNING_P2P means the device can still
> receive P2P activity on it's BAR. Eg a GPU will still allow read/write
> to its framebuffer.
>
> But it is not initiating any new transactions.
>
> > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > +{
> > + struct xe_vfio_pci_core_device *xe_vdev =
> > + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > +
> > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > + return;
> > +
> > + /* vfid starts from 1 for xe */
> > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> > + xe_vdev->pf = pdev->physfn;
>
> No, this has to use pci_iov_get_pf_drvdata, and this driver should
> never have a naked pf pointer flowing around.
>
> The entire exported interface is wrongly formed:
>
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
>
> None of these should be taking in a naked pci_dev, it should all work
> on whatever type the drvdata is.
This seems entirely backwards. Why would the Xe module export its driver
structure to the VFIO module? That opens up potential vectors for
abuse—for example, the VFIO module accessing internal Xe device
structures. In my opinion, it's much cleaner to keep interfaces between
modules as opaque / generic as possible.
Matt
>
> And this gross thing needs to go away too:
>
> > + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> > + xe_vfio_pci_migration_init(core_vdev);
>
> Jason
Powered by blists - more mailing lists