Message-ID: <ZV00ushZBMpkUb02@x130>
Date: Tue, 21 Nov 2023 14:52:42 -0800
From: Saeed Mahameed <saeed@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Saeed Mahameed <saeedm@...dia.com>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jason Gunthorpe <jgg@...dia.com>,
Leon Romanovsky <leonro@...dia.com>,
Jiri Pirko <jiri@...dia.com>, Leonid Bloch <lbloch@...dia.com>,
Itay Avraham <itayavr@...dia.com>,
linux-kernel@...r.kernel.org, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH V3 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl
On 21 Nov 14:10, Jakub Kicinski wrote:
>On Tue, 21 Nov 2023 13:04:06 -0800 Saeed Mahameed wrote:
>> On 21 Nov 12:44, Jakub Kicinski wrote:
>>> On Mon, 20 Nov 2023 23:06:19 -0800 Saeed Mahameed wrote:
>>>> high frequency diagnostic counters
>>>
>>> So is it a debug driver or not a debug driver?
>>
>> High frequency _diagnostic_ counters are a very useful tool for
>> debugging a high performance chip. So yes this is for diagnostics/debug.
>
>You keep saying debugging but if it's expected to run on all servers in
>the fleet _monitoring_ performance, then it's a very different thing.
>To me it certainly moves this driver from "debug thing loaded when
>things fail" to the "always loaded in production" category.
Exactly, only when things fail or the user wants to debug something.
For your example, you can monitor network performance via standard netdev
tools; once you start experiencing hiccups, you can use this driver and
the corresponding tools to quickly grab HW debug information, which is
useful to further root-cause and analyze the network hiccups.
Again, this is only one use case; the driver is intended to provide any
debug information, not only diagnostic counters or monitoring tools.
The goal of this driver is not just the one use case you have in mind.