[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190102144629.525d7dc1@cakuba.hsd1.ca.comcast.net>
Date:   Wed, 2 Jan 2019 14:46:29 -0800
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Eran Ben Elisha <eranbe@...lanox.com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jiri Pirko <jiri@...lanox.com>,
        Moshe Shemesh <moshe@...lanox.com>,
        Aya Levin <ayal@...lanox.com>, Tal Alon <talal@...lanox.com>,
        Ariel Almog <ariela@...lanox.com>
Subject: Re: [PATCH RFC net-next 00/19] Devlink health reporting and
 recovery system
On Tue, 1 Jan 2019 09:58:30 +0000, Eran Ben Elisha wrote:
> On 1/1/2019 3:47 AM, Jakub Kicinski wrote:
> > The addition of "objdump" and its marshalling is a bit disappointing.
> > It seemed to me when region snapshots were added that they would serve
> > this exact purpose.  Taking a region snapshot or "core dump", if you
> > will, after something went south.  Now it's confusing to have two
> > mechanisms serving the same purpose.  
> 
> The motivation here was that the driver can provide reporters to its 
> sub-modules, such that each reporter will be able to provide all needed 
> info and recover methods to face run time errors.
> 
> The implementation of the objdump function is in the hands of the 
> reporter developer, and he can dump whatever he thinks it is needed.
> Keep in mind that a driver can have many reporters (TX, RX, FW, command 
> interface, etc). For most of the reporters, there is important 
> information in the driver which can not be fetched with region snapshot 
> (intended for memory fetching only).
> For SW info, driver shall build the info and send it interpreted (unlike 
> all dumps / region available mechanisms)
> I have in plans to extend the TX reporter to have objdump method in 
> which I will pass many SQ SW attributes that can be very handy in a 
> debug session.
My feeling is that instead of duplicating this infrastructure we should
try to grow region snapshots beyond the "HW memory dumper".  In a
debugging session you may want to have dumps as well as read the state
live.  Region snapshot API was built with this promise in mind.  The
information in reporter dump is not easily available other than when
the dump happens, which is not great in a debugging session.  Driver
will have to expose it via debugfs/region dump/ethtool dump or some
such for live debug.
> > It's unclear from quick reading of the code how if the buffers get
> > timestamped.  Can you report more than one?  
> 
> The timestamp which devlink health reports on, is the timestamp in which 
> it got the buffers filled by the driver. Every dump/diagnose has one ts.
> Per reporter, it is possible to store up to one dump. only clear command 
> can remove it and makes the reporter ready to fetch a new objdump.
Region snapshots support collecting multiple snapshots IIRC, no?
> > About the marshalling, I'm not sure it belongs in the kernel.  There is
> > precedent for adding interpretation of FW blobs in user space (ethtool).
> > IMHO it's a more scalable approach, if we want to avoid having 100 kLoC
> > drivers.  Amount of code you add to print the simple example from last
> > patch is not inspiring confidence.  
> 
> The idea was to provide the developer the ability to create "tree-like" 
> of information, it is needed when you want to describe complex objects. 
> This caused a longer coding, but I don't think we are even close to the 
> scale you are talking about.
> We can remove the tree flexibility, and move to array format, it will 
> make the code of storing data by the driver to be shorter, but we will 
> lose the ability to have it in tree-like format.
To be clear I slightly oppose the marshalling in the first place.  It
may be better to just dump the data as is, and have user space know
what the interpretation is.
> And again, regarding ethtool, it is a tool for dumping HW/FW, this could 
> have been an argument for the region snapshot, not for the devlink health...
I've seen drivers dumping ring state and other SW info via ethtool.
All debugging APIs end up "mixed source" in my experience.
> > And on the bike shedding side :) -> I think you should steer clear of
> > calling this objdump, as it has very little to do with the
> > functionality of well-known objdump tool.  Its not even clear what the
> > object the name is referring to is.  
> Let's agree on concept, we can change name to dump. Reporter->dump is 
> very clear when you know what the reporter is.
Thanks!
> > Long story short the overlap with region snapshots makes it unclear
> > what purpose either serves, and IMHO we should avoid the marshalling by
> > teaching user space how to interpret snapshots.  Preferably we only
> > have one dump mechanism, and user space can be taught the interpretation
> > once.  
> Forcing SW reporters to use region snapshot mechanism sounds like a bad 
> idea.
I'm not super excited about reusing region API for SW info.  But I like
it more than having multitude of debug APIs for drivers to implement
with largely overlapping functionality and semantics.
> To summarize, In order to have the health system working properly, it 
> must have a way to objdump/dump itself and provide it to the developer 
> for debug. Removing the objdump part will make it useless for run-time 
> debug.
> 
> I think this is a powerful tool and we shall not ask the user level 
> scripts to fetch info from many other partial tools. It shall all be 
> focused in one place (recover, diagnose, objdump, statistics).
I don't think having reporter API refer to snapshot IDs is "many other
partial tools" if that's the suggestion.  Seems like you want to focus
all the reporter stuff in one place, and I want to focus the debug APIs
a little :)
Powered by blists - more mailing lists
 
