Date: Tue, 4 Jun 2024 17:07:39 +0200
From: Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To: Johan Hovold <johan@...nel.org>
Cc: Johan Hovold <johan+linaro@...nel.org>, Vinod Koul <vkoul@...nel.org>,
 Bard Liao <yung-chuan.liao@...ux.intel.com>,
 Sanyog Kale <sanyog.r.kale@...el.com>, alsa-devel@...a-project.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/4] soundwire: bus: clean up probe warnings

>>>>> @@ -123,7 +123,7 @@ static int sdw_drv_probe(struct device *dev)
>>>>>  	/* init the dynamic sysfs attributes we need */
>>>>>  	ret = sdw_slave_sysfs_dpn_init(slave);
>>>>>  	if (ret < 0)
>>>>> -		dev_warn(dev, "Slave sysfs init failed:%d\n", ret);
>>>>> +		dev_warn(dev, "failed to initialise sysfs: %d\n", ret);
>>>>>  
>>>>>  	/*
>>>>>  	 * Check for valid clk_stop_timeout, use DisCo worst case value of
>>>>> @@ -147,7 +147,7 @@ static int sdw_drv_probe(struct device *dev)
>>>>>  	if (drv->ops && drv->ops->update_status) {
>>>>>  		ret = drv->ops->update_status(slave, slave->status);
>>>>>  		if (ret < 0)
>>>>> -			dev_warn(dev, "%s: update_status failed with status %d\n", __func__, ret);
>>>>> +			dev_warn(dev, "failed to update status: %d\n", ret);
>>>>
>>>> the __func__ does help IMHO, 'failed to update status' is way too general...
>>>
>>> Error messages printed with dev_warn will include the device and driver
>>> names so this message will be quite specific still.
>>
>> The goal isn't to be 'quite specific' but rather 'completely
>> straightforward'. Everyone can look up a function name in a xref tool and
>> quickly find out what happened. Doing 'git grep' on message logs isn't
>> great really, and over time logs tend to be copy-pasted. Just look at
>> the number of patches where we had to revisit the dev_err logs to make
>> them really unique/useful.
> 
> Error messages should be self-contained and give users some idea of what
> went wrong and not leak implementation details like function names (and
> be greppable, which "%s:" is not).

"Failed to update status" doesn't sound terribly self-contained to me.

It's actually a great example of making the logs less clear with good
intentions. How many people know that the SoundWire bus exposes an
'update_status' callback, and that callback can be invoked from two
completely different places (probe or on device attachment)?

/* Ensure driver knows that peripheral unattached */
ret = sdw_update_slave_status(slave, status[i]);
if (ret < 0)
	dev_warn(&slave->dev, "Update Slave status failed:%d\n", ret);

You absolutely want to know which of these two cases failed, but with
your changes they now look nearly identical except for the order of
words. One would be 'failed to update status' and the other 'update
status failed'.
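
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. demo_warn() is a hypothetical stand-in for dev_warn() that
writes "<device>: <message>" into a buffer, and the demo_* function names
and the "sdw-demo" device name used below are assumptions made for
illustration only. The first pair uses the two message strings discussed
above; the second pair keeps the original __func__ prefix.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for dev_warn(): formats "<device>: <message>"
 * into 'out' instead of writing to the kernel log.
 */
#define demo_warn(out, len, dev, fmt, ...) \
	snprintf(out, len, "%s: " fmt, dev, __VA_ARGS__)

/* Probe path, using the message proposed in the patch. */
static void demo_probe_plain(char *out, size_t len, const char *dev, int ret)
{
	demo_warn(out, len, dev, "failed to update status: %d\n", ret);
}

/* Attachment path, using the existing message quoted above. */
static void demo_handle_status_plain(char *out, size_t len, const char *dev, int ret)
{
	demo_warn(out, len, dev, "Update Slave status failed:%d\n", ret);
}

/* Probe path with the __func__ prefix: the call site names itself. */
static void demo_probe(char *out, size_t len, const char *dev, int ret)
{
	demo_warn(out, len, dev, "%s: update_status failed with status %d\n",
		  __func__, ret);
}

/* Attachment path with the same format string and the __func__ prefix. */
static void demo_handle_status(char *out, size_t len, const char *dev, int ret)
{
	demo_warn(out, len, dev, "%s: update_status failed with status %d\n",
		  __func__, ret);
}
```

Called with the same device and error code, the two plain variants differ
only in wording and word order, while the __func__ variants share one
format string and still tell you which path failed.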

What is much better is to know WHEN this failure happens, then folks
looking at logs to fix a problem don't need to worry about precise
wording or word order.

It's a constant battle to get meaningful messages that are useful for
validation/integration folks, and my take is that it's a
windmill-fighting endeavor. The function name is actually more useful,
it's not an implementation detail, it's what you're looking for when
reverse-engineering problematic sequences from a series of CI logs.

>>>> Replacing 'with status' by ":" is fine, but do we really care about 10
>>>> chars in a log?
>>>
>>> It's not primarily about the numbers of characters but about consistency.
>>
>> I am advocating for inclusion of __func__ everywhere...It's simpler for
>> remote support and bug chasing.

I meant everywhere in SoundWire. Other subsystems may have different
views and different observability tools, that's fine.

> A quick grep seems to suggest you're in a small minority here with some
> 5k of 65k dev_err() including __func__.
> 
> [ And there's only 55 out of 750 dev_err() like that in
> drivers/soundwire, which is inconsistent at best. ]

As you mentioned yourself, the asynchronous nature of the SoundWire
probe/attachment/interrupts makes it difficult to reverse-engineer, and
we want to err on the side of MORE information.

Also, not all dev_err() are equal: most are part of paranoid checks and
never trigger. The sysfs log above is an example, we've never seen it happen.

That's different to changes that impact probe and interrupts which will
fail at some point on new platforms. It's not an academic statement,
I've spent most of my day chasing two such issues.
