Message-ID: <8a7ecc3f-32c8-42fb-b814-9bb12d53e29b@nvidia.com>
Date: Thu, 30 Oct 2025 16:48:54 +0200
From: Gal Pressman <gal@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>, Matthew W Carlis <mattc@...estorage.com>
Cc: adailey@...estorage.com, ashishk@...estorage.com, mbloch@...dia.com,
msaggi@...estorage.com, netdev@...r.kernel.org, saeedm@...dia.com,
tariqt@...dia.com
Subject: Re: [PATCH 1/1] net/mlx5: query_mcia_reg fail logging at debug severity

On 30/10/2025 1:33, Jakub Kicinski wrote:
> On Wed, 29 Oct 2025 10:49:24 -0600 Matthew W Carlis wrote:
>> On Wed, 29 Oct 2025, Gal Pressman wrote:
>>> Allow me to split the discussion to two questions:
>>> 1. Is this an error?
>>> 2. Should it be logged?
>>>
>>> Do we agree that the answer to #1 is yes?
>>>
>>> For #2, I think it should, but we can probably improve the situation
>>> with extack instead of a print.
>>
>> I think its an 'expected error' if the module is not present. I agree.
>>
>> For 2 I think if the user runs "ethtool -m" on a port with no module,
>> they received an error message stating something along the lines of
>> "module not present" and the kernel didn't have any log messages about
>> it that would be near to 'the best' solution.
>
> I assume you mean error message specifically from the CLI or whatever
> API the user is exercising? If so I agree.
>
> The system logs are for fatal / unexpected conditions. AFAIU returning
> -EIO is the _expected_ way to find out that module is not plugged in.
> If there's a better API I suppose we can make ethtool call it first
> to avoid the error.

There are other cases which will return -EIO, but do not necessarily
mean that the module is disconnected: unresponsive module, i2c error,
disabled module, unsupported module, etc.
We cannot differentiate between them without the status print.
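
To illustrate the point about -EIO, here is a minimal userspace sketch. The
enum values and function names are hypothetical and chosen only for this
example; they are not the identifiers or status codes used by the mlx5 driver
or the firmware. It shows several distinct failure causes collapsing into one
return code, while a separate status string keeps them apart.

```c
#include <errno.h>

/* Hypothetical module states, for illustration only. These are NOT the
 * status values defined by the mlx5 driver or the device firmware. */
enum module_status {
	MOD_OK = 0,
	MOD_NOT_PRESENT,
	MOD_UNRESPONSIVE,
	MOD_I2C_ERROR,
	MOD_DISABLED,
	MOD_UNSUPPORTED,
};

/* Every failure collapses into the same errno, so a caller cannot tell
 * the cases apart from the return value alone. */
static int module_status_to_errno(enum module_status st)
{
	return st == MOD_OK ? 0 : -EIO;
}

/* The status string is the only place where the cause survives. */
static const char *module_status_str(enum module_status st)
{
	switch (st) {
	case MOD_OK:           return "ok";
	case MOD_NOT_PRESENT:  return "module not present";
	case MOD_UNRESPONSIVE: return "module unresponsive";
	case MOD_I2C_ERROR:    return "i2c error";
	case MOD_DISABLED:     return "module disabled";
	case MOD_UNSUPPORTED:  return "module unsupported";
	}
	return "unknown status";
}
```

Whether such a string should go to the kernel log or back to the caller (for
example via extack) is the open question in this thread; the sketch takes no
position on that.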

Changing the log level makes things more difficult, as most production
servers will not enable the debug print, and the logs would be harder to
analyze.

I asked this before: maybe the automatic tools that keep querying the
module do so because of the success return code, and that will be
resolved soon?