netdev - Re: [PATCH v5 net-next 2/8] sfc: add devlink info support for ef100

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y+NQ3tuK6hnDmvah@nanopsycho>
Date:   Wed, 8 Feb 2023 08:35:58 +0100
From:   Jiri Pirko <jiri@...nulli.us>
To:     "Lucero Palau, Alejandro" <alejandro.lucero-palau@....com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-net-drivers (AMD-Xilinx)" <linux-net-drivers@....com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "habetsm.xilinx@...il.com" <habetsm.xilinx@...il.com>,
        "ecree.xilinx@...il.com" <ecree.xilinx@...il.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "corbet@....net" <corbet@....net>,
        "jiri@...dia.com" <jiri@...dia.com>
Subject: Re: [PATCH v5 net-next 2/8] sfc: add devlink info support for ef100

Tue, Feb 07, 2023 at 06:24:05PM CET, alejandro.lucero-palau@....com wrote:
>
>On 2/7/23 15:10, Lucero Palau, Alejandro wrote:
>> On 2/7/23 14:58, Jiri Pirko wrote:
>>> Tue, Feb 07, 2023 at 03:42:45PM CET, alejandro.lucero-palau@....com wrote:
>>>> On 2/2/23 11:58, Jiri Pirko wrote:
>>>>> Thu, Feb 02, 2023 at 12:14:17PM CET, alejandro.lucero-palau@....com wrote:
>>>>>> From: Alejandro Lucero <alejandro.lucero-palau@....com>
>>>>>>
>>>>>> Support for devlink info command.
>>>>> You are quite brief for couple hundred line patch. Care to shed some
>>>>> more details for the reader? Also, use imperative mood (applies to the
>>>>> rest of the pathes)
>>>>>
>>>>> [...]
>>>>>
>>>> OK. I'll be more talkative and imperative here.
>>>>
>>>>>> +static int efx_devlink_info_get(struct devlink *devlink,
>>>>>> +				struct devlink_info_req *req,
>>>>>> +				struct netlink_ext_ack *extack)
>>>>>> +{
>>>>>> +	struct efx_devlink *devlink_private = devlink_priv(devlink);
>>>>>> +	struct efx_nic *efx = devlink_private->efx;
>>>>>> +	char msg[NETLINK_MAX_FMTMSG_LEN];
>>>>>> +	int errors_reported = 0;
>>>>>> +	int rc;
>>>>>> +
>>>>>> +	/* Several different MCDI commands are used. We report first error
>>>>>> +	 * through extack along with total number of errors. Specific error
>>>>>> +	 * information via system messages.
>>>>>> +	 */
>>>>>> +	rc = efx_devlink_info_board_cfg(efx, req);
>>>>>> +	if (rc) {
>>>>>> +		sprintf(msg, "Getting board info failed");
>>>>>> +		errors_reported++;
>>>>>> +	}
>>>>>> +	rc = efx_devlink_info_stored_versions(efx, req);
>>>>>> +	if (rc) {
>>>>>> +		if (!errors_reported)
>>>>>> +			sprintf(msg, "Getting stored versions failed");
>>>>>> +		errors_reported += rc;
>>>>>> +	}
>>>>>> +	rc = efx_devlink_info_running_versions(efx, req);
>>>>>> +	if (rc) {
>>>>>> +		if (!errors_reported)
>>>>>> +			sprintf(msg, "Getting board info failed");
>>>>>> +		errors_reported++;
>>>>> Under which circumstances any of the errors above happen? Is it a common
>>>>> thing? Or is it result of some fatal event?
>>>> They are not common at all. If any of those happen, it is a bad sign,
>>>> and it is more than likely there are more than one because something is
>>>> not working properly. That is the reason I only report first error found
>>>> plus the total number of errors detected.
>>>>
>>>>
>>>>> You treat it like it is quite common, which seems very odd to me.
>>>>> If they are rare, just return error right away to the caller.
>>>> Well, that is done now. And as I say, I'm not reporting all but just the
>>>> first one, mainly because the buffer limitation with NETLINK_MAX_FMTMSG_LEN.
>>>>
>>>> If errors trigger, a more complete information will appear in system
>>>> messages, so that is the reason with:
>>>>
>>>> +               NL_SET_ERR_MSG_FMT(extack,
>>>> +                                  "%s. %d total errors. Check system messages",
>>>> +                                  msg, errors_reported);
>>>>
>>>> I guess you are concerned with the extack report being overwhelmed, but
>>>> I do not think that is the case.
>>> No, I'm wondering why you just don't put error message into exack and
>>> return -ESOMEERROR right away.
>> Well, I thought the idea was to give more information to user space
>> about the problem.
>>
>> Previous patchsets were not reporting any error nor error information
>> through extack. Now we have both.
>
>
>Just trying to make more sense of this.
>
>Because that limit with NETLINK_MAX_FMTMSG_LEN, what I think is big 
>enough, some control needs to be taken about what to report. It could be 
>just to write the buffer with the last error and report that last one 

Wait. My point is: fail on the first error returning the error to
info_get() caller. Just that. No accumulation of anything.


>only, with no need of keeping total errors count. But I felt once we 
>handle any error, reporting that extra info about the total errors 
>detected should not be a problem at all, even if it is an unlikely 
>situation.
>
>BTW, I said we were reporting both, the error and the extack error 
>message, but I've realized the function was not returning any error but 
>always 0, so I'll fix that.
>
>
>>>>>> +	}
>>>>>> +
>>>>>> +	if (errors_reported)
>>>>>> +		NL_SET_ERR_MSG_FMT(extack,
>>>>>> +				   "%s. %d total errors. Check system messages",
>>>>>> +				   msg, errors_reported);
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> static const struct devlink_ops sfc_devlink_ops = {
>>>>>> +	.info_get			= efx_devlink_info_get,
>>>>>> };
>>>>> [...]