[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2bb6e96c-a48d-62ea-90a3-ec978536372f@gmail.com>
Date: Mon, 23 Jul 2018 18:59:06 -0500
From: "Alex G." <mr.nuke.me@...il.com>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>,
Tal Gilboa <talgi@...lanox.com>
Cc: "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"keith.busch@...el.com" <keith.busch@...el.com>,
"alex_gagniuc@...lteam.com" <alex_gagniuc@...lteam.com>,
"austin_bolen@...l.com" <austin_bolen@...l.com>,
"shyam_iyer@...l.com" <shyam_iyer@...l.com>,
"jeffrey.t.kirsher@...el.com" <jeffrey.t.kirsher@...el.com>,
"ariel.elior@...ium.com" <ariel.elior@...ium.com>,
"michael.chan@...adcom.com" <michael.chan@...adcom.com>,
"ganeshgr@...lsio.com" <ganeshgr@...lsio.com>,
Tariq Toukan <tariqt@...lanox.com>,
"airlied@...il.com" <airlied@...il.com>,
"alexander.deucher@....com" <alexander.deucher@....com>,
"mike.marciniszyn@...el.com" <mike.marciniszyn@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5] PCI: Check for PCIe downtraining conditions
On 07/23/2018 05:14 PM, Jakub Kicinski wrote:
> On Tue, 24 Jul 2018 00:52:22 +0300, Tal Gilboa wrote:
>> On 7/24/2018 12:01 AM, Jakub Kicinski wrote:
>>> On Mon, 23 Jul 2018 15:03:38 -0500, Alexandru Gagniuc wrote:
>>>> PCIe downtraining happens when both the device and PCIe port are
>>>> capable of a larger bus width or higher speed than negotiated.
>>>> Downtraining might be indicative of other problems in the system, and
>>>> identifying this from userspace is neither intuitive, nor
>>>> straightforward.
>>>>
>>>> The easiest way to detect this is with pcie_print_link_status(),
>>>> since the bottleneck is usually the link that is downtrained. It's not
>>>> a perfect solution, but it works extremely well in most cases.
>>>>
>>>> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@...il.com>
>>>> ---
>>>>
>>>> For the sake of review, I've created a __pcie_print_link_status() which
>>>> takes a 'verbose' argument. If we agree want to go this route, and update
>>>> the users of pcie_print_link_status(), I can split this up in two patches.
>>>> I prefer just printing this information in the core functions, and letting
>>>> drivers not have to worry about this. Though there seems to be strong for
>>>> not going that route, so here it goes:
>>>
>>> FWIW the networking drivers print PCIe BW because sometimes the network
>>> bandwidth is simply over-provisioned on multi port cards, e.g. 80Gbps
>>> card on a x8 link.
>>>
>>> Sorry to bike shed, but currently the networking cards print the info
>>> during probe. Would it make sense to move your message closer to probe
>>> time? Rather than when device is added. If driver structure is
>>> available, we could also consider adding a boolean to struct pci_driver
>>> to indicate if driver wants the verbose message? This way we avoid
>>> duplicated prints.
>>>
>>> I have no objection to current patch, it LGTM. Just a thought.
>>
>> I don't see the reason for having two functions. What's the problem with
>> adding the verbose argument to the original function?
>
> IMHO it's reasonable to keep the default parameter to what 90% of users
> want by a means on a wrapper. The non-verbose output is provided by
> the core already for all devices.
>
> What do you think of my proposal above Tal? That would make the extra
> wrapper unnecessary since the verbose parameter would be part of the
> driver structure, and it would avoid the duplicated output.
I see how it might make sense to add another member to the driver
struct, but is it worth the extra learning curve? It seems to be
something with the potential to confuse new driver developers, and
having a very marginal benefit.
Although, if that's what people want...
Alex
Powered by blists - more mailing lists