Message-ID: <20190313091731.76129ece@cakuba.attlocal.net>
Date: Wed, 13 Mar 2019 09:17:31 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Jiri Pirko <jiri@...nulli.us>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports

On Wed, 13 Mar 2019 07:07:01 +0100, Jiri Pirko wrote:
> Tue, Mar 12, 2019 at 09:56:28PM CET, jakub.kicinski@...ronome.com wrote:
> >On Tue, 12 Mar 2019 15:02:39 +0100, Jiri Pirko wrote:
> >> Tue, Mar 12, 2019 at 03:10:54AM CET, wrote:
> >> >On Mon, 11 Mar 2019 09:52:04 +0100, Jiri Pirko wrote:
> >> >> Fri, Mar 08, 2019 at 08:09:43PM CET, wrote:
> >> >> >If the switchport is in the hypervisor then only the hypervisor can
> >> >> >control switching/forwarding, correct?
> >> >>
> >> >> Correct.
> >> >>
> >> >> >The primary use case for partitioning within a VM (of a VF) would be
> >> >> >containers (and DPDK)?
> >> >>
> >> >> Makes sense.
> >> >>
> >> >> >SR-IOV makes things harder. Splitting a PF is reasonably easy to grasp.
> >> >> >What I'm trying to get a sense of is how we would control an SR-IOV
> >> >> >environment as a whole.
> >> >>
> >> >> You mean orchestration?
> >> >
> >> >Right, orchestration.
> >> >
> >> >To be clear on where I'm going with this - if we want to allow VFs
> >> >to partition themselves then they have to control what is effectively
> >> >a "nested" switch. A per-VF set of rules which would then get
> >>
> >> Wait. If you allow making VF subports (I believe that is what you meant
> >> by VFs partitioning themselves), that does not mean they will have a
> >> separate nested switch. They would still belong under the same one.
> >
> >But that existing switch is administered by the hypervisor, how would
> >the VF owners install forwarding rules in a switch they don't control?
>
> They won't.

Argh. So how is forwarding configured if there are no rules? Are you
going to assume it's switching on MACs? We're supposed to offload
software constructs. If it's a software port it needs to be explicitly
switched. If it's not explicitly switched - we already have macvlan
offload.

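
To illustrate the distinction between explicit and MAC-based switching,
here is a toy sketch in Python. Every class, method and port name in it is
made up for illustration; it is not devlink, switchdev or any driver API,
just a minimal model of the two behaviours.

```python
from dataclasses import dataclass, field


@dataclass
class ExplicitSwitch:
    """Toy model: a port forwards only once a rule has been installed for it."""
    rules: dict = field(default_factory=dict)  # ingress port -> egress port

    def add_rule(self, in_port, out_port):
        # Hypothetical helper: whoever administers the switch calls this.
        self.rules[in_port] = out_port

    def forward(self, in_port, dst_mac):
        # No rule for the ingress port means no forwarding decision (None).
        return self.rules.get(in_port)


@dataclass
class MacSwitch:
    """Toy model: forwarding is implied by destination MAC, no rules needed."""
    mac_table: dict = field(default_factory=dict)  # destination MAC -> port

    def forward(self, in_port, dst_mac):
        # The ingress port plays no role; only the destination MAC is looked up.
        return self.mac_table.get(dst_mac)
```

In the explicit model a newly created subport forwards nothing until
somebody with access to the switch installs a rule for it, which is why it
matters who that somebody is. In the MAC-based model no rule is needed,
but then the result is much closer to macvlan offload than to a switched
software port.
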
> >> >"flattened" into the main eswitch rule set. If I was to choose I'd
> >> >really rather have this "flattening" be done on the (Linux) hypervisor
> >> >and not in the vendor driver and firmware.
> >>
> >> Agreed. Driver should provide one big switch. User should configure it.
> >
> >Cool, when you say user - is it the tenant or the provider?
>
> Whoever gets access to the instance.
>
> >> >I'd much rather have the VM make a "give me another NIC" orchestration
> >> >call via some high level REST API than devlink. This makes the
> >> >configuration strictly high level to low level:
> >> >
> >> > VM -> cloud net REST API -> cloud agent -> devlink/Linux -> FW -> HW
> >> >
> >> >Without round trips via firmware.
> >>
> >> Okay. So the "devlink/Linux -> FW" part is going to happen on baremetal.
> >> Makes sense.
> >>
> >> >This allows for easy policy enforcement, common code to be maintained
> >> >in user space, in high level languages (no 0.5M LoC drivers and 10M LoC
> >> >firmware for every driver). It can also be used with software paths
> >> >like VirtIO.
> >>
> >> Agreed.
> >>
> >> >Modelling and debugging a nested switch would be a nightmare. What
> >> >follows is that we probably shouldn't deal with partitioning of VFs,
> >> >but rather only partition via the PF devlink instance, and reassign
> >> >the partitions to VMs.
> >>
> >> Agreed. That must be a misunderstanding, I never suggested nested
> >> switches.
> >
> >Cool, yes, I was making sure we weren't going in that direction :)
>
> Okay.
>
> >> >> I originally planned to implement sriov orchestration api in devlink too.
> >> >
> >> >Interesting, would you mind elaborating?
> >>
> >> I have to think about it. But something like this:
> >> [...]
> >
> >I see, thanks for the examples, they make things clear!
>
> Okay. I will put together some documentation including this. I have some
> patches that implement some of the stuff. Your patchset also does some
> of that (considering you adjust a thing or two). Let's make this right.

Yeah, I feel like I'm again getting further from clarity on what you're
trying to achieve.