[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKOOJTwWUCe+6qkderKY7ojfHWDxkMQyQTR6uYRFNiZJ8zzYbw@mail.gmail.com>
Date: Mon, 25 Jan 2021 13:23:04 -0800
From: Edwin Peer <edwin.peer@...adcom.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Parav Pandit <parav@...dia.com>, Saeed Mahameed <saeed@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
netdev <netdev@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Alexander Duyck <alexander.duyck@...il.com>,
Sridhar Samudrala <sridhar.samudrala@...el.com>,
David Ahern <dsahern@...nel.org>,
Kiran Patil <kiran.patil@...el.com>,
Jacob Keller <jacob.e.keller@...el.com>,
"Ertman, David M" <david.m.ertman@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [pull request][net-next V10 00/14] Add mlx5 subfunction support
On Mon, Jan 25, 2021 at 12:41 PM Jason Gunthorpe <jgg@...dia.com> wrote:
> > That's an implementation decision. Nothing mandates that the state has
> > to physically exist in the same structure, only that reads and writes
> > are appropriately responded to.
>
> Yes, PCI does mandate this, you can't store the data on the other side
> of the PCI link, and if you can't cross the PCI link that only leaves
> on die/package memory resources.
Going off device was not what I was suggesting at all. I meant the
data doesn't necessarily need to be stored in the same physical
layout.
Take the config space for example. Many fields are read-only,
constant, always zero (for non-legacy) or reserved. These could be
generated by firmware in response to requests without ever being
backed by physical memory. Similarly, writes that imply a state change
can simply make that state change in whatever internal representation
is convenient for the device. One need only make sure the read back of
that state is appropriately reverse translated from your internal
representation. Similarly, if you're not exposing a bunch of optional
capabilities (the SFs don't), then you don't need the full config
space either, simply render the zeroes in response to the reads where
you have nothing to say.
That's not to say all implementations would be capable of this, only
that it is an implementation choice.
> > Right, but presumably it still needs to be at least a page. And,
> > nothing says your device's VF BAR protocol can't be equally simple.
>
> Having VFs that are not self-contained would require significant
> changing of current infrastructure, if we are going to change things
> then let's fix everything instead of some half measure.
I don't understand what you mean by self-contained. If your device
only needs a doorbell write to trigger a DMA, no reason your VF BAR
needs to expose more. In practice, there will be some kind of
configuration channel too, but this doesn't necessarily need a lot of
room either (you don't have to represent configuration as a bulky
register file exposing every conceivable option, it could be a mailbox
with a command protocol).
> The actual complexity inside the kernel is small and the user
> experience to manage them through devlink is dramatically better than
> SRIOV. I think it is a win even if there isn't any HW savings.
I'm not sure I agree with respect to user experience. Users are
familiar with SR-IOV. Now you impose a complementary model for
accomplishing the same goal (without solving all the problems, as per
the previous discussion, so we'll need to reinvent it again later).
Adds to confusion.
It's not easier for vendors either. Now we need to get users onto new
drivers to exploit it, with all the distribution lags that entails
(where existing drivers would work for SR-IOV). Some vendors will
support it, some won't, further adding to user confusion.
Regards,
Edwin Peer
Download attachment "smime.p7s" of type "application/pkcs7-signature" (4160 bytes)
Powered by blists - more mailing lists