Message-ID: <20251002170506.GA3299207@nvidia.com>
Date: Thu, 2 Oct 2025 14:05:06 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: John Hubbard <jhubbard@...dia.com>,
Alexandre Courbot <acourbot@...dia.com>,
Joel Fernandes <joelagnelf@...dia.com>,
Timur Tabi <ttabi@...dia.com>, Alistair Popple <apopple@...dia.com>,
Zhi Wang <zhiw@...dia.com>, Surath Mitra <smitra@...dia.com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Alex Williamson <alex.williamson@...hat.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <lossin@...nel.org>,
Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
nouveau@...ts.freedesktop.org, linux-pci@...r.kernel.org,
rust-for-linux@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/2] rust: pci: skip probing VFs if driver doesn't
support VFs
On Thu, Oct 02, 2025 at 06:05:28PM +0200, Danilo Krummrich wrote:
> On Thu Oct 2, 2025 at 5:23 PM CEST, Jason Gunthorpe wrote:
> > This is not what I've been told, the VF driver has significant
> > programming model differences in the NVIDIA model, and supports
> > different commands.
>
> Ok, that means there are some more fundamental differences between the host PF
> and the "VM PF" code that we have to deal with.
That was my understanding.
> But that doesn't necessarily require that the VF parts of the host have to be in
> nova-core as well, i.e. with the information we have we can differentiate
> between PF, VF and PF in the VM (indicated by a device register).
I'm not entirely sure what you mean by this..
The driver that operates the function in "vGPU" mode, as indicated by
the register, has to be in nova-core, since there is only one device ID.
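To illustrate the point, here is a rough sketch of why the decision ends up inside one driver: with a single device ID, the role can only be chosen after the driver has bound, by looking at the device. Everything named here (the `Role` enum, `classify`, `VGPU_MODE_BIT` and its bit position, the precedence of the checks) is hypothetical and not taken from nova-core or from the hardware documentation.

```rust
/// Hypothetical roles a single-device-ID driver would have to tell apart.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Role {
    /// Physical function on the host.
    HostPf,
    /// SR-IOV virtual function as seen on the host.
    HostVf,
    /// Function that the device register reports as running in "vGPU" mode.
    VgpuMode,
}

/// Assumed bit position; the real register layout is not described here.
pub const VGPU_MODE_BIT: u32 = 1 << 0;

/// Pick a role from what the driver can observe after binding.
///
/// `is_virtfn` stands in for the PCI core's "this is a VF" flag and
/// `mode_reg` for the value read from the (assumed) mode register.
/// Checking the register first is an assumption of this sketch.
pub fn classify(is_virtfn: bool, mode_reg: u32) -> Role {
    if mode_reg & VGPU_MODE_BIT != 0 {
        Role::VgpuMode
    } else if is_virtfn {
        Role::HostVf
    } else {
        Role::HostPf
    }
}
```

Since the PCI ID table cannot distinguish these cases, a branch of this kind would have to live in whichever driver owns the device ID.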
> > If you look at the VFIO driver RFC it basically does no mediation, it
> > isn't intercepting MMIO - the guest sees the BARs directly. Most of
> > the code is "profiling" from what I can tell. Some config space
> > meddling.
>
> Sure, there is no mediation in that sense, but it needs quite some setup
> regardless, no?
>
> I thought there is a significant amount of semantics that is different between
> booting the PF and the VF on the host.
I think it would be good to have Zhi clarify more of this, but from
what I understand there are at least three activities commingled together:
1) Boot the PF in "vGPU" mode so it can enable SRIOV
2) Enable SRIOV and profile VFs to allocate HW resources to them
3) VFIO variant driver to convert the VF into a "VM PF" with whatever
mediation and enhancement needed
From a broad perspective we in the kernel have put #2 outside VFIO
because all of that is actually run through the PF and doesn't use the
VF at all.
#3 is the vfio driver and I would like it if vfio drivers restricted
themselves to the mediation, live migration and things like that
which are directly related to VFIO..
> Also, the idea was to use a layered approach, i.e. let nova-core
> serve as an abstraction layer, where the DRM and VFIO parts can be
> layered on top of.
Yes, I think everyone is good with some version of this..
A big question in my mind is where you put #2, and what uAPI it
provides. It has to layer on top of nova-core because it has to use
the PF to do profiling.
I'm not a fan of the vfio based sysfs as uAPI for #2, for reasons
touched on in this thread.
NIC drivers are using fwctl and devlink for profiling, managed by the
PF driver. I think I'd want to hear reasons why those interfaces
cannot be used here.
Jason