[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <DEFKSDBXEQS0.3C79SP3HEG1OY@kernel.org>
Date: Sun, 23 Nov 2025 11:26:52 +1300
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Leon Romanovsky" <leon@...nel.org>
Cc: "Jason Gunthorpe" <jgg@...pe.ca>, "Peter Colberg" <pcolberg@...hat.com>,
"Bjorn Helgaas" <bhelgaas@...gle.com>,
Krzysztof Wilczyński <kwilczynski@...nel.org>, "Miguel
Ojeda" <ojeda@...nel.org>, "Alex Gaynor" <alex.gaynor@...il.com>, "Boqun
Feng" <boqun.feng@...il.com>, "Gary Guo" <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>, "Benno Lossin"
<lossin@...nel.org>, "Andreas Hindborg" <a.hindborg@...nel.org>, "Alice
Ryhl" <aliceryhl@...gle.com>, "Trevor Gross" <tmgross@...ch.edu>, "Abdiel
Janulgue" <abdiel.janulgue@...il.com>, "Daniel Almeida"
<daniel.almeida@...labora.com>, "Robin Murphy" <robin.murphy@....com>,
"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>, "Dave Ertman"
<david.m.ertman@...el.com>, "Ira Weiny" <ira.weiny@...el.com>,
<linux-pci@...r.kernel.org>, <rust-for-linux@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, "Alexandre Courbot" <acourbot@...dia.com>,
"Alistair Popple" <apopple@...dia.com>, "Joel Fernandes"
<joelagnelf@...dia.com>, "John Hubbard" <jhubbard@...dia.com>, "Zhi Wang"
<zhiw@...dia.com>
Subject: Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF
device
On Sun Nov 23, 2025 at 7:57 AM NZDT, Leon Romanovsky wrote:
> On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
>> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
>> > So far I haven't heard a convincing reason for not providing this guarantee. The
>> > only reason not to guarantee this I have heard is that some PF drivers only
>> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
>> > no strong reason to do so.
>>
>> I know some people have work flows around this. I think they are
>> wrong. When we "fixed" mlx5 a while back there was some pushback and
>> some weird things did stop working.
>
> I don't think that this is correct. The main use case for keeping VFs,
> while unloading PF driver is to allow reuse this PF for VFIO too. It is
> very natural and common flow for "real" SR-IOV devices which present all
> PF/VFs as truly separate devices.
That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
VFIO"? What do you do with the PF after driver unload instead? Load another
driver? If so, why separate ones?
>> So while I support this goal, I don't know if enough people will
>> agree..
>
> I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
> talked to get rationale for this new behaviour.
Not sure what you mean, Bjorn is CC'd on the whole series and hence also this
conversation.
Otherwise, my proposal to guarantee that the PF is bound as long as the VF(s)
are bound stems from the corresponding beneficial lifecycle implications for the
driver model (some more on this below).
>> > With this, the above code will be correct and a driver can use the generic
>> > infrastructure to:
>> >
>> > 1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
>> > 2) Grab the driver's device private data from the returned Device<Bound>
>> >
>> > Note that 2) (i.e. accessing the driver's device private data with
>> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
>> > only implemented for Device<Bound>) and that the returned type actually matches
>> > the type of the object that has been stored.
>>
>> This is what the core helper does, with the twist that it "validates"
>> the PF driver is working right by checking its driver binding..
>>
>> > I suggest to have a look at [2] for an example with how this turns out with the
>> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
>> > driver calling into its parent, except that the auxiliary bus already guarantees
>> > that the parent is bound when the child is bound.
>>
>> Aux bus is a little different because it should be used in a way where
>> there are stronger guarantees about what the parent is. Ie the aux bus
>> names should be unique and limited to the type of parent.
>
> Right, aux bus is completely unrelated to real physical PCI bus.
Of course the auxiliary and the PCI bus are fundamentally different busses,
*but* they also have commonalities: the driver lifecycle requirements drivers
have to deal with are the same.
For instance, if you call from an auxiliary driver into the parent driver to let
the parent driver do some work on behalf, the auxiliary driver has to guarantee
that the parent device is guranteed to remain bound for the duration of the
call, otherwise the parent driver can't call dev_get_drvdata() without risking a
UAF, since a concurrent unbind might free the parent driver's device private
data.
The same is true for PCI when a VF driver calls into the PF driver to do some
work on behalf. The VF driver has to make sure that the PF driver remains bound
for the whole duration of the call for the exact same reasons.
I.e. it is a common driver lifecycle pattern for interactions between drivers in
general.
As for Rust specifically, we can actually model such driver lifecycle patterns
with the type system and with that do a lot of the driver lifecycle checks (that
drivers love to ignore :) even at compile time, such that your code doesn't even
compile when you violate the driver lifecycle rules.
For the auxiliary bus that's already the case and I'd love to get to this point
with PF <-> VF interactions (in Rust) as well.
I hope this helps.
- Danilo
Powered by blists - more mailing lists