[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UfAoGXQp9C0uL124GZfdhY6vvpk3NmCDqCpLET9dzAdRg@mail.gmail.com>
Date: Fri, 15 Jan 2021 20:32:19 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Leon Romanovsky <leon@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Saeed Mahameed <saeedm@...dia.com>,
Jakub Kicinski <kuba@...nel.org>,
linux-pci <linux-pci@...r.kernel.org>,
linux-rdma@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
Don Dutile <ddutile@...hat.com>
Subject: Re: [PATCH mlx5-next v1 2/5] PCI: Add SR-IOV sysfs entry to read
number of MSI-X vectors
On Fri, Jan 15, 2021 at 6:06 AM Jason Gunthorpe <jgg@...dia.com> wrote:
>
> On Thu, Jan 14, 2021 at 05:56:20PM -0800, Alexander Duyck wrote:
>
> > That said, it only works at the driver level. So if the firmware is
> > the one that is having to do this it also occured to me that if this
> > update happened on FLR that would probably be preferred.
>
> FLR is not free, I'd prefer not to require it just for some
> philosophical reason.
It wasn't so much a philosophical thing as the fact that it can sort
of take the place as a reload. Essentially with an FLR we are
rewriting the configuration so if the driver were involved it would be
a good time to pull in the MSI-X table update. However looking over
the mlx5 code I don't see any handling of FLR in there so I am
assuming that is handled by the firmware.
> > Since the mlx5 already supports devlink I don't see any reason why the
> > driver couldn't be extended to also support the devlink resource
> > interface and apply it to interrupts.
>
> So you are OK with the PF changing the VF as long as it is devlink not
> sysfs? Seems rather arbitary?
It is about the setup of things. The sysfs existing in the VF is kind
of ugly since it is a child device calling up to the parent and
telling it how it is supposed to be configured. I'm sure in theory we
could probably even have the VF request something like that itself
through some sort of mailbox and cut out the middle-man but that would
be even uglier.
If you take a look at the usage of pci_physfn it is usually in spots
where the PF is being looked for in order to find the policy that is
supposed to be applied to the VF. This is one of the first few cases
where it is being used to set the policy for the VF.
> Leon knows best, but if I recall devlink becomes wonky when the VF
> driver doesn't provide a devlink instance. How does it do reload of a
> VF then?
In my mind it was the PF driver providing a devlink instance for the
VF if a driver isn't loaded. Then if the mlx5 driver was loaded on the
VF you would replace that instance with one supported by the VF itself
in order to coordinate with the VF driver. That way if the mlx5 driver
is loaded on the VF you can still change the settings instead of being
blocked by your own driver.
As far as a reload the non-driver loaded case would probably look a
lot like how things are handled now with the taking of the device
lock, verifying no driver is loaded, notifying the firmware, and
releasing the lock when it is complete. If the mlx5 driver is loaded
on the VF it could be a more complete setup that would probably look
more like your standard driver reinit.
> I think you end up with essentially the same logic as presented here
> with sysfs.
It is similar, however the big difference is how the control is setup.
With the VF sysfs file running things it feels sort of like the tail
wagging the dog. You are having to go through and verify that this is
a VF, that the PF is present, that the PF supports this operation and
so on. If the PF is in charge of managing the configuration it should
be the one registering the interfaces, not the VF. That is my view on
this anyway as I feel it simplifies this quite a bit as the interface
won't be there if it isn't supported.:
> > > It is possible for vfio to fake the MSI-X capability and limit what a
> > > user can access, but I don't think that's what is being done here.
> >
> > Yeah, I am assuming that is what is being done here.
>
> Just to be really clear, that assumption is wrong
I misspoke and meant to agree with Alex's comment. If you are saying I
was wrong, then yes, I was wrong. I meant that I was assuming you were
resizing the actual table in the MMIO region where the MSI-X table and
PBA bits are present.
Powered by blists - more mailing lists