Date:   Thu, 1 Apr 2021 17:05:01 -0500
From:   Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     Vinod Koul <vkoul@...nel.org>,
        Bard Liao <yung-chuan.liao@...ux.intel.com>,
        alsa-devel@...a-project.org, linux-kernel@...r.kernel.org,
        hui.wang@...onical.com, sanyog.r.kale@...el.com,
        rander.wang@...ux.intel.com, bard.liao@...el.com
Subject: Re: [PATCH 1/2] soundwire: add macro to selectively change error
 levels



On 4/1/21 3:56 PM, Greg KH wrote:
> On Thu, Apr 01, 2021 at 01:43:53PM -0500, Pierre-Louis Bossart wrote:
>>
>>>>> My bigger issue with this is that this macro is crazy.  Why do you need
>>>>> debugging here at all for this type of thing?  That's what ftrace is
>>>>> for, do not sprinkle code with "we got this return value from here!" all
>>>>> over the place like what this does.
>>>>
>>>> We are not sprinkling the code all over the place with any new logs, they
>>>> exist already in the SoundWire code and this patch helps filter them out.
>>>> See e.g. patch 2/2
>>>>
>>>> -			dev_err(&slave->dev,
>>>> -				"Clk Stop type =%d failed: %d\n", type, ret);
>>>> +			sdw_dev_dbg_or_err(&slave->dev, ret != -ENODATA,
>>>> +					   "Clk Stop mode %d type =%d failed: %d\n",
>>>> +					   mode, type, ret);
>>>
>>> You just added a debug log for no reason.
>>
>> The number of logs is lower when dynamic debug is not enabled, and equal
>> when it is. There's no addition.
>>
>> The previous behavior was unconditional dev_err that everyone sees.
>>
>> Now it's dev_err ONLY when the code is NOT -ENODATA, and dev_dbg otherwise,
>> meaning it will ONLY be seen IF dynamic debug is enabled for
>> drivers/soundwire/bus.c
>>
>> Allow me to use another example from patch2:
>>
>> -		if (ret == -ENODATA)
>> -			dev_dbg(bus->dev,
>> -				"ClockStopNow Broadcast msg ignored %d", ret);
>> -		else
>> -			dev_err(bus->dev,
>> -				"ClockStopNow Broadcast msg failed %d", ret);
>> +		sdw_dev_dbg_or_err(bus->dev, ret != -ENODATA,
>> +				   "ClockStopNow Broadcast msg failed %d\n", ret);
>>
>> There's no new log, is there?
> 
> No, but that is not what you showed above which was just an error
> message being replaced with both a debug and an error message.

Either a debug or an error message, not both.
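
To make the either/or behavior concrete, here is a minimal userspace sketch. It is not the macro from patch 1/2: every demo_* name below is hypothetical, demo_err()/demo_dbg() are stand-ins for dev_err()/dev_dbg(), and the demo_debug_enabled flag only approximates what dynamic debug does for drivers/soundwire/bus.c.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for dev_err()/dev_dbg(); not the kernel API. */
static bool demo_debug_enabled;	/* assumption: plays the role of dynamic debug */
static int demo_err_count;	/* number of error-level messages emitted */
static int demo_dbg_count;	/* number of debug-level messages emitted */

static void demo_log(const char *level, const char *fmt, ...)
{
	va_list ap;

	fprintf(stderr, "[%s] ", level);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Error level: always emitted. */
#define demo_err(fmt, ...) \
	do { \
		demo_err_count++; \
		demo_log("err", fmt, __VA_ARGS__); \
	} while (0)

/* Debug level: emitted only when debugging is switched on. */
#define demo_dbg(fmt, ...) \
	do { \
		if (demo_debug_enabled) { \
			demo_dbg_count++; \
			demo_log("dbg", fmt, __VA_ARGS__); \
		} \
	} while (0)

/* Exactly one branch runs per call: error if is_err, debug otherwise. */
#define demo_dbg_or_err(is_err, fmt, ...) \
	do { \
		if (is_err) \
			demo_err(fmt, __VA_ARGS__); \
		else \
			demo_dbg(fmt, __VA_ARGS__); \
	} while (0)

/* Mirrors the ClockStopNow call site quoted above. */
static void demo_report(int ret)
{
	demo_dbg_or_err(ret != -ENODATA,
			"ClockStopNow Broadcast msg failed %d\n", ret);
}
```

With demo_debug_enabled left false, demo_report(-ENODATA) emits nothing, while demo_report(-EIO) emits one error-level line. Setting the flag to true makes the -ENODATA case emit one debug-level line instead. In no case does a single call produce both.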

> Just drop the debug messages, they are pointless, right?

That's the primary debug tool used with our friends at Red Hat and 
Canonical, and that includes remote debug where we don't have access to 
the platforms. We also have quite a few Bugzilla or GitHub reports from 
community users who can provide the logs of alsa-info and dmesg, but 
that's about it. Those debug messages are what we get as feedback and 
test reports, so we absolutely need them to be 'to the point'.

Maybe to reassure you on the scope of the changes I am suggesting here, 
there is a total of *13* occurrences of dev_dbg() in the SoundWire bus 
code, and they were added in very specific branches where something goes 
boink to help folks like Bard and me figure out what sequence led to the 
problem. I think it's the same on Qualcomm platforms.

In these examples related to the clock stop/restart, a message will be 
generated during pm_runtime suspend/resume sequences and only when 
unexpected behavior is detected, so the total bandwidth used by these 
messages is minimal. It has to be that way: we are currently debugging 
cases where we see those odd behaviors after thousands of suspend/resume 
cycles, and the last thing we want is to be swamped with "pointless" 
messages. It's not at all like we are reporting "hello, I have this 
error code", it's rather "this error code should not happen in this 
sequence". In 99% of the cases, the error code is actually not very 
useful; it's where the error occurs that is priceless for debug.


