Message-ID: <e3acvyonpwd6eejk6ka2vmkorggtnohc6vfagzix5xkx4jru6o@kf3q3hvasgtx>
Date: Tue, 15 Apr 2025 18:41:45 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Edward Cree <ecree.xilinx@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: [RFG] sfc: nvlog and devlink health

Tue, Apr 15, 2025 at 04:51:39PM +0200, ecree.xilinx@...il.com wrote:
>Solarflare NICs have a flash partition to which the MCPU logs various
> errors, warnings, and other diagnostic info. We want to expose this
> 'nvlog' data, and the best fit we've found so far is devlink health.
>Reading it is simple enough — plan is to have a reporter whose diagnose
> method reads the partition and returns the contents (could potentially
> use dump instead but the extra layer of triggering and saving seems
> unnecessary).
>The problem is how to clear it (since it fills up after comparatively
> few boots, so when debugging field issues you'll usually need to clear
> it first and then reproduce the issue).
> DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR is no use here, because it only
> clears the kernel-saved copy; it doesn't call any driver method.

Can't it be extended to actually call an optional driver method?
That sounds fine to me and would solve your problem.

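To illustrate the shape of such an extension, here is a minimal userspace sketch, not kernel code. Every name in it (struct reporter, struct reporter_ops, the dump_clear hook, the toy_nvlog driver) is hypothetical and chosen only for illustration; none of it is the existing devlink API. It assumes the hook is optional, so reporters that do not set it keep today's behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names are the real devlink API. */
struct reporter;

struct reporter_ops {
	const char *name;
	/* Hypothetical optional hook: erase the device-side log. May be NULL. */
	int (*dump_clear)(struct reporter *r);
};

struct reporter {
	const struct reporter_ops *ops;
	bool have_saved_dump;	/* stands in for the kernel-saved dump copy */
	void *priv;		/* driver private data */
};

/* Model of a DUMP_CLEAR handler with the proposed extension. */
static int reporter_dump_clear(struct reporter *r)
{
	assert(r != NULL && r->ops != NULL);

	/* Existing behaviour: drop the saved copy. */
	r->have_saved_dump = false;

	/* Proposed extension: call into the driver only if it opted in. */
	if (r->ops->dump_clear)
		return r->ops->dump_clear(r);
	return 0;
}

/* Toy driver state: number of entries currently in the flash log. */
struct toy_nvlog {
	int entries;
};

/* Toy driver hook: pretend to erase the flash partition. */
static int toy_nvlog_clear(struct reporter *r)
{
	struct toy_nvlog *log = r->priv;

	log->entries = 0;
	return 0;
}

/* A reporter that opts in to the hook. */
static const struct reporter_ops toy_nvlog_ops = {
	.name = "nvlog",
	.dump_clear = toy_nvlog_clear,
};

/* A reporter that leaves the hook NULL and is unaffected. */
static const struct reporter_ops legacy_ops = {
	.name = "legacy",
};
```

With something like this, a single 'nvlog' reporter could serve both purposes: diagnose reads the partition, and the clear command erases it. Whether the saved copy should still be dropped when the driver hook fails is a design choice this sketch does not try to settle.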
>The code we've developed internally, that I'm now preparing to submit
> upstream, handles this by having *two* reporters, 'nvlog' and
> 'nvlog-clear'; both read the flash in their diagnose method but
> nvlog-clear additionally clears it afterwards. It works, but it
> doesn't feel very clean.
>Is this approach acceptable? Is there a better way?
>
>-ed
>