[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4bb2edcf-e2f3-d396-cc1c-e36b427bdd17@intel.com>
Date: Wed, 30 May 2018 20:35:06 -0700
From: "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc: Michael Chan <michael.chan@...adcom.com>,
David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>,
Or Gerlitz <gerlitz.or@...il.com>
Subject: Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF
minimum and maximum queues.
On 5/30/2018 3:53 PM, Jakub Kicinski wrote:
> On Wed, 30 May 2018 14:23:06 -0700, Samudrala, Sridhar wrote:
>> On 5/29/2018 11:33 PM, Jakub Kicinski wrote:
>>> On Tue, 29 May 2018 23:08:11 -0700, Michael Chan wrote:
>>>> On Tue, May 29, 2018 at 10:56 PM, Jakub Kicinski wrote:
>>>>> On Tue, 29 May 2018 20:19:54 -0700, Michael Chan wrote:
>>>>>> On Tue, May 29, 2018 at 1:46 PM, Samudrala, Sridhar wrote:
>>>>>>> Isn't ndo_set_vf_xxx() considered a legacy interface and not planned to be
>>>>>>> extended?
>>>>> +1 it's painful to see this feature being added to the legacy
>>>>> API :( Another duplicated configuration knob.
>>>>>
>>>>>> I didn't know about that.
>>>>>>
>>>>>>> Shouldn't we enable this via ethtool on the port representor netdev?
>>>>>> We discussed about this. ethtool on the VF representor will only work
>>>>>> in switchdev mode and also will not support min/max values.
>>>>> Ethtool channel API may be overdue a rewrite in devlink anyway, but I
>>>>> feel like implementing switchdev mode and rewriting features in devlink
>>>>> may be too much to ask.
>>>> Totally agreed. And switchdev mode doesn't seem to be that widely
>>>> used at the moment. Do you have other suggestions besides NDO?
>>> At some points you (Broadcom) were working whole bunch of devlink
>>> configuration options for the PCIe side of the ASIC. The number of
>>> queues relates to things like number of allocated MSI-X vectors, which
>>> if memory serves me was in your devlink patch set. In an ideal world
>>> we would try to keep all those in one place :)
>>>
>>> For PCIe config there is always the question of what can be configured
>>> at runtime, and what requires a HW reset. Therefore that devlink API
>>> which could configure current as well as persistent device settings was
>>> quite nice. I'm not sure if reallocating queues would ever require
>>> PCIe block reset but maybe... Certainly it seems the notion of min
>>> queues would make more sense in PCIe configuration devlink API than
>>> ethtool channel API to me as well.
>>>
>>> Queues are in the grey area between netdev and non-netdev constructs.
>>> They make sense both from PCIe resource allocation perspective (i.e.
>>> devlink PCIe settings) and netdev perspective (ethtool) because they
>>> feed into things like qdisc offloads, maybe per-queue stats etc.
>>>
>>> So yes... IMHO it would be nice to add this to a devlink SR-IOV config
>>> API and/or switchdev representors. But neither of those are really an
>>> option for you today so IDK :)
>> One reason why 'switchdev' mode is not yet widely used or enabled by default
>> could be due to the requirement to program the flow rules only via slow path.
> Do you mean the fallback traffic requirement?
Yes.
>
>> Would it make sense to relax this requirement and support a mode where port
>> representors are created and let the PF driver implement a default policy that
>> adds flow rules for all the VFs to enable connectivity and let the user
>> add/modify the rules via port representors?
> I definitely share your concerns, stopping a major HW vendor from using
> this new and preferred mode is not helping us make progress.
>
> The problem is that if we allow this diversion, i.e. driver to implement
> some special policy, or pre-populate a bridge in a configuration that
> suits the HW we may condition users to expect that as the standard Linux
> behaviour. And we will be stuck with it forever even tho your next gen
> HW (ice?) may support correct behaviour.
Yes. ice can support slowpath behavior as required to support OVS offload.
However, i was just wondering if we should have an option to allow switchdev
without slowpath so that the user can use alternate mechanisms to program
the flow rules instead of having to use OVS.
>
> We should perhaps separate switchdev mode from TC flower/OvS offloads.
> Is your objective to implement OvS offload or just switchdev mode?
>
> For OvS without proper fallback behaviour you may struggle.
>
> Switchdev mode could be within your reach even without changing the
> default rules. What if you spawned all port netdevs (I dislike the
> term representor, sorry, it's confusing people) in down state and then
> refuse to bring them up unless user instantiated a bridge that would
> behave in a way that your HW can support? If ports are down you won't
> have fallback traffic so no problem to solve.
If we want to use port netdev's admin state to control the link state of the
VFs then this will not work.
We need to only disable TX/RX but admin state and link state need to be
supported on the port netdevs.
Powered by blists - more mailing lists