[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ymw8jBoK3Vx8A/uq@nanopsycho>
Date: Fri, 29 Apr 2022 21:29:16 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Ido Schimmel <idosch@...sch.org>, Ido Schimmel <idosch@...dia.com>,
netdev@...r.kernel.org, davem@...emloft.net, pabeni@...hat.com,
jiri@...dia.com, petrm@...dia.com, dsahern@...il.com,
andrew@...n.ch, mlxsw@...dia.com
Subject: Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices
and info
Fri, Apr 29, 2022 at 08:45:35PM CEST, kuba@...nel.org wrote:
>On Fri, 29 Apr 2022 13:51:33 +0200 Jiri Pirko wrote:
>> >Of the three API levels (SDK, automation, human) I think automation
>> >is the only one that's interesting to us in Linux. SDK interfaces are
>> >necessarily too low level as they expose too much of internal details
>> >to standardize. Humans are good with dealing with uncertainty and
>> >diverse so there's no a good benchmark.
>> >
>> >The benchmark for automation is - can a machine use this API across
>> >different vendors to reliably achieve its goals. For FW info/flashing
>> >the goal is keeping the FW versions up to date. This is documented:
>> >
>> >https://www.kernel.org/doc/html/latest/networking/devlink/devlink-flash.html#firmware-version-management
>> >
>> >What would the pseudo code look like with "line cards" in the picture?
>> >Apply RFC1925 truth 12.
>>
>> Something like this:
>>
>> $lc_count = array_size(devlink-lc-info[$handle])
>>
>> for ($lcnum = 0; $lcnum < $lc_count; lcnum++):
>> $dev_count = array_size(devlink-lc-info[$handle][$lcnum])
>>
>> for ($devnum = 0; $devnum < $dev_count; $devnum++):
>
>Here goes the iteration I complained about in my previous message.
>Tracking FW versions makes most sense at the level of a product (as
>in the unit of HW one can purchase from the system vendor). That
>integrates well with system tracking HW in the fleet. Product in your
>case will be a line card or populated chassis, I believe.
Well, most probably, you are right. In theory, there migth de 2 types of
devices/gearboxes on a single line card. I admit I find it unlikely now.
>
>> # Get unique HW design identifier (gearbox id)
>> $hw_id = devlink-lc-info[$handle][$lcnum][$devnum]['fw.psid']
>
>1) you can't use 'fw.psid' in generic logic, it's a Melvidia's invention
>2) looking at your cover letter there's no fw.psid for the device
> reported, the automation will not work, Q.E.D.
We can make is a "symlink" to fw.hw_id or whatever. But that is not the
point here. For ASIC FW, we currently have also fw.psid.
>
>> # Find out which FW flash we want to use for this device
>> $want_flash_vers = some-db-backed.lookup($hw_id, 'flash')
>>
>> # Update flash if necessary
>> if $want_flash_vers != devlink-lc-info[$handle][$lcnum][$devnum]['fw']:
>> $file = some-db-backed.download($hw_id, 'flash')
>> $component = devlink-lc[$handle][$lcnum][$devnum]['component']
>> devlink-dev-flash($handle, $component, $file)
>>
>> devlink-reload()
>>
>> Clear indexes, not squashed somewhere in middle of string.
>>
>> >I thought you said your gearboxes all the the same FW?
>> >Theoretically, yes. Theoretically, I can also have nested "line cards".
>>
>> Well, yeah. I was under impresion that possibility of having multiple
>> devices on the same LC is not close to 0. But I get your point.
>>
>> Let's try to figure out he iface as you want it:
>> We will have devlink dev info extended to be able to provide info
>> about the LC/gearbox. Let's work with same prefix "lcX." for all
>> info related to line card X.
>>
>> First problem is, who is going to enforce this prefix. It is driver's
>> responsibility to provide the string which is put out. The solution
>> would be to have a helper similar to devlink_info_version_*_put()
>> called devlink_info_lc_version_*_put() that would accept lc pointer and
>> add the prefix. Does it make sense to you?
>>
>> We need 3 things:
>> 1) current version of gearbox FW
>> That is easy, we have it - "versions"
>> (DEVLINK_ATTR_INFO_VERSION_* nested attrs). We can have multiple
>> nests that would carry the versions for individiual line cards.
>> Example:
>> versions:
>> fixed:
>> hw.revision 0
>> lc2.hw.revision a
>> lc8.hw.revision b
>> running:
>> ini.version 4
>> lc2.gearbox.fw.version 1.1.3
>> lc8.gearbox.fw.version 1.2.3
>> 2) HW id (as you have it in your pseudocode), it is PSID in case of
>> mlxsw. We already have PSID for ASIC:
>> ....
>> This should be also easy, as this is exposed as "fixed version" in a
>> same way as previous example.
>> Example:
>> versions:
>> fixed:
>> lc2.gearbox.fw.psid XXX
>> lc8.gearbox.fw.psid YYY
>> 3) Component name
>> This one is a bit tricky. It is not a version, so put it under
>> "versions" does not make much sense.
>> Also, there are more than one. Looking at how versions are done,
>> multiple nests of attr type DEVLINK_ATTR_INFO_VERSION_* are put to
>> the message. So we can introduce new attr DEVLINK_ATTR_INFO_FLASH_COMPONENT
>> and put one per linecard/gearbox.
>> Here arain we need some kind of helper to driver to call to put this
>> into msg: devlink_info_lc_flash_component_put()
>> Example:
>> pci/0000:01:00.0:
>> driver mlxsw_spectrum3
>> flash_components:
>> lc2.gearbox.fw
>> lc8.gearbox.fw
>>
>> Then the flashing of the gearbox will be done by:
>> # devlink dev flash pci/0000:01:00.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2 component lc2.gearbox.fw
>
>The main question to me is whether users will want to flash the entire
>device, or update line cards individually.
I think it makes sense to update them individually. The versions are
also reported individually. What's the benefit of not doing that. Also,
how would you name the "group" component. Sounds odd to me.
>
>What's inside mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2? Doesn't
>sound like it's FW just for a single gearbox?
>
>> >Really? So we're going to be designing kernel APIs so that each message
>> >contains complete information and can't contain references now?
>>
>> Can you give me an exapmple of devlink or any other iproute2 app cmd
>> that actually does something similar to this?
>
>Off the top of my head - doesn't ip has some caches for name resolution
>etc.? Either way current code in iproute2.git is hardly written on the
>stone tablets.
Not sure, that is why I thought you might have an example. Anyway, I
don't think it is important. I think that all 3 values exposed I
described above can be just in devlink dev info.
Powered by blists - more mailing lists