Message-ID: <b948ba7d-74f0-30e5-c4d2-4a4d83866d37@suse.com>
Date: Wed, 5 Jul 2023 06:46:54 +0200
From: Juergen Gross <jgross@...e.com>
To: Oleksandr Tyshchenko <olekstysh@...il.com>,
Roger Pau Monné <roger.pau@...rix.com>,
Stefano Stabellini <sstabellini@...nel.org>,
Marek Marczykowski-Górecki
<marmarek@...isiblethingslab.com>
Cc: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@...m.com>,
Petr Pavlu <petr.pavlu@...e.com>,
"xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
vikram.garhwal@....com
Subject: Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
On 04.07.23 19:14, Oleksandr Tyshchenko wrote:
>
>
> On Tue, Jul 4, 2023 at 5:49 PM Roger Pau Monné <roger.pau@...rix.com> wrote:
>
> Hello all.
>
> [sorry for the possible format issues]
>
>
> On Tue, Jul 04, 2023 at 01:43:46PM +0200, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > FWIW, I have run into this issue some time ago too. I run Xen on top of
> > KVM and then pass through some of the virtio devices (network one
> > specifically) into a (PV) guest. So, I hit both cases, the dom0 one and
> > domU one. As a temporary workaround I needed to disable
> > CONFIG_XEN_VIRTIO completely (just disabling
> > CONFIG_XEN_VIRTIO_FORCE_GRANT was not enough to fix it).
> > With that context in place, the actual response below.
> >
> > On Tue, Jul 04, 2023 at 12:39:40PM +0200, Juergen Gross wrote:
> > > On 04.07.23 09:48, Roger Pau Monné wrote:
> > > > On Thu, Jun 29, 2023 at 03:44:04PM -0700, Stefano Stabellini wrote:
> > > > > On Thu, 29 Jun 2023, Oleksandr Tyshchenko wrote:
> > > > > > On 29.06.23 04:00, Stefano Stabellini wrote:
> > > > > > > > I think we need to add a second way? It could be anything that can help
> > > > > > > > us distinguish between a non-grants-capable virtio backend and a
> > > > > > > > grants-capable virtio backend, such as:
> > > > > > > - a string on xenstore
> > > > > > > - a xen param
> > > > > > > - a special PCI configuration register value
> > > > > > > - something in the ACPI tables
> > > > > > > - the QEMU machine type
> > > > > >
> > > > > >
> > > > > > > Yes, I remember there was a discussion regarding that. The point is to
> > > > > > > choose a solution to be functional for both PV and HVM *and* to be able
> > > > > > > to support a hotplug. IIRC, the xenstore could be a possible candidate.
> > > > >
> > > > > xenstore would be among the easiest to make work. The only downside is
> > > > > the dependency on xenstore which otherwise virtio+grants doesn't have.
> > > >
> > > > I would avoid introducing a dependency on xenstore, if nothing else we
> > > > know it's a performance bottleneck.
> > > >
> > > > We would also need to map the virtio device topology into xenstore, so
> > > > that we can pass different options for each device.
> > >
> > > > This aspect (different options) is important. How do you want to pass virtio
> > > > device configuration parameters from dom0 to the virtio backend domain? You
> > > > probably need something like Xenstore (a virtio based alternative like virtiofs
> > > > would work, too) for that purpose.
> > >
> > > Mapping the topology should be rather easy via the PCI-Id, e.g.:
> > >
> > > /local/domain/42/device/virtio/0000:00:1c.0/backend
> >
> > While I agree this would probably be the simplest to implement, I don't
> > like introducing xenstore dependency into virtio frontend either.
> > Toolstack -> backend communication is probably easier to solve, as it's
> > much more flexible (could use qemu cmdline, QMP, other similar
> > mechanisms for non-qemu backends etc).
>
> I also think features should be exposed uniformly for devices, it's at
> least weird to have certain features exposed in the PCI config space
> while other features exposed in xenstore.
>
> For virtio-mmio this might get a bit confusing, are we going to add
> xenstore entries based on the position of the device config mmio
> region?
>
> I think on Arm PCI enumeration is not (usually?) done by the firmware,
> at which point the SBDF expected by the tools/backend might be
> different than the value assigned by the guest OS.
>
> I think there are two slightly different issues, one is how to pass
> information to virtio backends, I think doing this initially based on
> xenstore is not that bad, because it's an internal detail of the
> backend implementation. However passing information to virtio
> frontends using xenstore is IMO a bad idea, there's already a way to
> negotiate features between virtio frontends and backends, and Xen
> should just expand and use that.
>
>
>
> On Arm with device-tree we have special bindings whose purpose is to tell us
> whether we need to use grants for virtio, and the backend domid for a particular
> device. Here on x86, we don't have a device tree, so cannot (easily?) reuse this
> logic.
>
> I have just recollected one idea suggested by Stefano some time ago [1]. The
> context of discussion was about what to do when device-tree and ACPI cannot be
> reused (or something like that). The idea won't cover virtio-mmio, but I have
> heard that virtio-mmio usage with x86 Xen is a rather unusual case.
>
> I will paste the text below for convenience.
>
> **********
>
> Part 1 (intro):
>
> We could reuse a PCI config space register to expose the backend id.
> However this solution requires a backend change (QEMU) to expose the
> backend id via an emulated register for each emulated device.
>
> To avoid having to introduce a special config space register in all
> emulated PCI devices (virtio-net, virtio-block, etc) I wonder if we
> could add a special PCI config space register at the emulated PCI Root
> Complex level.
>
> Basically the workflow would be as follows:
>
> - Linux recognizes the PCI Root Complex as a Xen PCI Root Complex
> - Linux writes to special PCI config space register of the Xen PCI Root
> Complex the PCI device id (basically the BDF)
> - The Xen PCI Root Complex emulated by Xen answers by writing back to
> the same location the backend id (domid of the backend)
> - Linux reads back the same PCI config space register of the Xen PCI
> Root Complex and learns the relevant domid
>
> Part 2 (clarification):
>
> I think using a special config space register in the root complex would
> not be terrible in terms of guest changes because it is easy to
> introduce a new root complex driver in Linux and other OSes. The root
> complex would still be ECAM compatible so the regular ECAM driver would
> still work. A new driver would only be necessary if you want to be able
> to access the special config space register.
>
>
> **********
> What do you think about it? Are there any pitfalls, etc? This also requires
> system changes, but at least without virtio spec changes.
>
> [1]
> https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2210061747590.3690179@ubuntu-linux-20-04-desktop/
Sounds like a good idea. One PCI root per backend domain would be needed,
but that should be possible.
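
To make the quoted write-then-read workflow concrete, here is a rough
user-space simulation of it. This is only a sketch under stated assumptions,
not a proposal for the actual register layout: the class and function names,
the single-register model, and the INVALID_DOMID sentinel value are all
invented for illustration, and the BDF and domid values used are arbitrary.

```python
from typing import Dict, Optional

# Assumed sentinel meaning "no backend domid known for this device".
# Neither the thread nor any spec defines this value; it is hypothetical.
INVALID_DOMID = 0xFFFF


def encode_bdf(bus: int, dev: int, fn: int) -> int:
    """Pack bus/device/function into the usual 16-bit BDF layout."""
    if not (0 <= bus <= 0xFF and 0 <= dev <= 0x1F and 0 <= fn <= 0x7):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (dev << 3) | fn


class XenRootComplexSim:
    """Hypothetical stand-in for the root complex emulated by Xen."""

    def __init__(self, backends: Dict[int, int]) -> None:
        self._backends = dict(backends)  # BDF -> backend domid
        self._reg = INVALID_DOMID        # the one special register

    def config_write(self, bdf: int) -> None:
        # The guest writes a BDF; the emulation answers by storing the
        # backend domid for that device in the same register.
        self._reg = self._backends.get(bdf, INVALID_DOMID)

    def config_read(self) -> int:
        # The guest reads the register back to learn the domid.
        return self._reg


def lookup_backend_domid(rc: XenRootComplexSim,
                         bus: int, dev: int, fn: int) -> Optional[int]:
    """Guest side: write the BDF, read back the domid (None if unknown)."""
    rc.config_write(encode_bdf(bus, dev, fn))
    domid = rc.config_read()
    return None if domid == INVALID_DOMID else domid


# Example: device 00:1c.0 is served by (arbitrary) backend domain 7.
rc = XenRootComplexSim({encode_bdf(0x00, 0x1C, 0): 7})
domid_known = lookup_backend_domid(rc, 0x00, 0x1C, 0)
domid_unknown = lookup_backend_domid(rc, 0x00, 0x1D, 0)
```

A real implementation would also have to serialize the write/read pair
against concurrent lookups, which the sketch ignores.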
Juergen