Message-ID: <e7fef7fb-df43-4a62-be71-311f9b65fe94@intel.com>
Date: Tue, 22 Apr 2025 15:21:16 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Edward Cree <ecree.xilinx@...il.com>
CC: <netdev@...r.kernel.org>, Jiri Pirko <jiri@...nulli.us>, Jakub Kicinski
<kuba@...nel.org>
Subject: Re: [RFG] sfc: nvlog and devlink health
On 4/16/25 12:24, Edward Cree wrote:
> On 15/04/2025 17:41, Jiri Pirko wrote:
>> Tue, Apr 15, 2025 at 04:51:39PM +0200, ecree.xilinx@...il.com wrote:
>>> DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR is no use here, because it only
>>> clears the kernel-saved copy; it doesn't call any driver method.
>>
>> Can't it be extended to actually call an optional driver method?
>> That would sound fine to me and will solve your problem.
>
> Would that be "diagnose"/"dump clear" or "dump"/"dump clear"?
> The former is weird, are you sure it's not a misuse of the API to
> have "dump clear" clear something that's not a dump? I feel like
> extending the devlink core to support a semantic mismatch /
> layering violation might raise a few eyebrows.
> The latter just doesn't work as (afaict) calling dump twice
> without an intervening clear won't get updated output, and users
> might want to read again without erasing.
>
I guess it is common for HW/FW to have a buffer for errors/events/logs
that could either be cyclical or just stop data collection when full.
We have a similar thing for the fw health reporter in the ice driver
(E810); we simply also collect/display the data from before the driver
even probed (seems valuable).
So, when to clean?
a) clearing the FW log at the point of a user-triggered "clear" command
will easily open up a window to lose events coming after the snapshot
was taken (likely a minor issue, as it is typical to care the most about
(some of) the first events only);
b) let the driver clean (send the "clean" command to FW) on the event
and optionally do the same in .probe;
c) newer firmware could implement auto-clean-on-send,
making the b) above a no-op there (but it would be a good fallback for
old fw)
Just requiring more actions from the user seems too much for a problem
that seems to be solvable by the driver.