Message-ID: <be2954f2-e09c-d2ef-c84a-67b8e6fc3967@intel.com>
Date: Tue, 15 Nov 2022 19:59:06 -0600
From: "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
To: Leon Romanovsky <leon@...nel.org>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>
CC: <netdev@...r.kernel.org>, <davem@...emloft.net>, <kuba@...nel.org>,
<pabeni@...hat.com>, <edumazet@...gle.com>,
<intel-wired-lan@...ts.osuosl.org>, <jiri@...dia.com>,
<anthony.l.nguyen@...el.com>, <alexandr.lobakin@...el.com>,
<wojciech.drewek@...el.com>, <lukasz.czapnik@...el.com>,
<shiraz.saleem@...el.com>, <jesse.brandeburg@...el.com>,
<mustafa.ismail@...el.com>, <przemyslaw.kitszel@...el.com>,
<piotr.raczynski@...el.com>, <jacob.e.keller@...el.com>,
<david.m.ertman@...el.com>, <leszek.kaliszczuk@...el.com>
Subject: Re: [PATCH net-next 00/13] resource management using devlink reload
On 11/15/2022 11:57 AM, Leon Romanovsky wrote:
> On Tue, Nov 15, 2022 at 03:02:40PM +0100, Michal Swiatkowski wrote:
>> On Tue, Nov 15, 2022 at 02:12:12PM +0200, Leon Romanovsky wrote:
>>> On Tue, Nov 15, 2022 at 11:16:58AM +0100, Michal Swiatkowski wrote:
>>>> On Tue, Nov 15, 2022 at 11:32:14AM +0200, Leon Romanovsky wrote:
>>>>> On Tue, Nov 15, 2022 at 10:04:49AM +0100, Michal Swiatkowski wrote:
>>>>>> On Tue, Nov 15, 2022 at 10:11:10AM +0200, Leon Romanovsky wrote:
>>>>>>> On Tue, Nov 15, 2022 at 08:12:52AM +0100, Michal Swiatkowski wrote:
>>>>>>>> On Mon, Nov 14, 2022 at 07:07:54PM +0200, Leon Romanovsky wrote:
>>>>>>>>> On Mon, Nov 14, 2022 at 09:31:11AM -0600, Samudrala, Sridhar wrote:
>>>>>>>>>> On 11/14/2022 7:23 AM, Leon Romanovsky wrote:
>>>>>>>>>>> On Mon, Nov 14, 2022 at 01:57:42PM +0100, Michal Swiatkowski wrote:
>>>>>>>>>>>> Currently the default value for number of PF vectors is number of CPUs.
>>>>>>>>>>>> Because of that there are cases when all vectors are used for PF
>>>>>>>>>>>> and user can't create more VFs. It is hard to set default number of
>>>>>>>>>>>> CPUs right for all different use cases. Instead allow user to choose
>>>>>>>>>>>> how many vectors should be used for various features. After implementing
>>>>>>>>>>>> subdevices this mechanism will be also used to set number of vectors
>>>>>>>>>>>> for subfunctions.
>>>>>>>>>>>>
>>>>>>>>>>>> The idea is to set vectors for eth or VFs using devlink resource API.
>>>>>>>>>>>> New value of vectors will be used after devlink reinit. Example
>>>>>>>>>>>> commands:
>>>>>>>>>>>> $ sudo devlink resource set pci/0000:31:00.0 path msix/msix_eth size 16
>>>>>>>>>>>> $ sudo devlink dev reload pci/0000:31:00.0
>>>>>>>>>>>> After reload driver will work with 16 vectors used for eth instead of
>>>>>>>>>>>> num_cpus.
>>>>>>>>>>> By saying "vectors", are you referring to MSI-X vectors?
>>>>>>>>>>> If yes, you have specific interface for that.
>>>>>>>>>>> https://lore.kernel.org/linux-pci/20210314124256.70253-1-leon@kernel.org/
>>>>>>>>>> This patch series is exposing a resources API to split the device level MSI-X vectors
>>>>>>>>>> across the different functions supported by the device (PF, RDMA, SR-IOV VFs and
>>>>>>>>>> in future subfunctions). Today this is all hidden in a policy implemented within
>>>>>>>>>> the PF driver.
>>>>>>>>> Maybe we are talking about different VFs, but if you refer to PCI VFs,
>>>>>>>>> the amount of MSI-X comes from PCI config space for that specific VF.
>>>>>>>>>
>>>>>>>>> You shouldn't set any value through netdev as it will cause a
>>>>>>>>> difference in output between lspci (which doesn't require any driver)
>>>>>>>>> and your newly set number.
>>>>>>>> If I understand correctly, lspci shows the MSI-X number for individual
>>>>>>>> VF. Value set via devlink is the total number of MSI-X that can be used
>>>>>>>> when creating VFs.
>>>>>>> Yes and no, lspci shows how many MSI-X vectors exist from the HW point of
>>>>>>> view. Driver can use less than that. It is exactly as your proposed
>>>>>>> devlink interface.
>>>>>>>
>>>>>>>
>>>>>> Ok, I have to take a closer look at it. So, are you saying that we should
>>>>>> drop this devlink solution and use the sysfs interface for VFs, or are you
>>>>>> fine with having both? What about MSI-X allocation for subfunctions?
>>>>> You should drop for VFs and PFs and keep it for SFs only.
>>>>>
>>>> I understand that MSI-X for VFs can be set via the sysfs interface, but what
>>>> about PFs?
>>> PFs are even trickier than VFs, as you are changing that number
>>> while driver is bound. This makes me wonder what will be lspci output,
>>> as you will need to show right number before driver starts to load.
>>>
>>> You need to present right value if user decided to unbind driver from PF too.
>>>
>> In case of ice driver lspci -vs shows:
>> Capabilities: [70] MSI-X: Enable+ Count=1024 Masked
>>
>> so all vectors that hw supports (PFs, VFs, misc, etc). Because of that
>> total number of MSI-X in the devlink example from cover letter is 1024.
>>
>> I see that mellanox shows:
>> Capabilities: [9c] MSI-X: Enable+ Count=64 Masked
>>
>> I assume that 64 is in this case MSI-X only for this one PF (it makes
>> sense).
> Yes and PF MSI-X count can be changed through FW configuration tool, as
> we need to write new value when the driver is unbound and we need it to
> be persistent. Users are expecting to see "stable" number any time they
> reboot the server. It is not the case for VFs, as they are explicitly
> created after reboots and start "fresh" after every boot.
>
> So we set large enough but not too large value as a default for PFs.
> If you find sane model of how to change it through kernel, you can count
> on our support.
I guess one main difference is that in the case of ice, the PF driver manages
resources for all its associated functions, not the FW. So the MSI-X count
reported for the PF shows the total vectors (PF netdev, VFs, RDMA, SFs). VFs talk
to the PF over a mailbox to get their MSI-X vector information.
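
To illustrate the model, here is a rough sketch of how a single device-level
budget could be split across features by the PF driver. This is not ice driver
code. The 1024 total is the Count that lspci reports above and msix_eth with
size 16 comes from the cover letter example; the other resource names and
sizes (msix_rdma, msix_vf, msix_misc) are made up for illustration.

```python
# Hypothetical model of a PF-managed MSI-X budget; not taken from the ice driver.
TOTAL_MSIX = 1024  # Count reported by lspci for the ice PF in this thread


def split_vectors(total, requests):
    """Return a per-feature allocation if the requests fit in the budget.

    `requests` maps a resource name to the number of vectors requested.
    Raises ValueError when the sum of requests exceeds `total`.
    """
    used = sum(requests.values())
    if used > total:
        raise ValueError(f"requested {used} vectors, only {total} available")
    alloc = dict(requests)
    # Whatever is left over stays unassigned until a later resize.
    alloc["unassigned"] = total - used
    return alloc


# msix_eth=16 mirrors the cover letter; the remaining entries are assumptions.
alloc = split_vectors(
    TOTAL_MSIX,
    {"msix_eth": 16, "msix_rdma": 64, "msix_vf": 512, "msix_misc": 4},
)
```

The point is only that the per-feature sizes are carved out of one device-wide
total by the PF driver, so a request that does not fit has to be rejected there.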