Date:   Tue, 5 Dec 2017 08:48:03 -0600
From:   Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To:     Vinod Koul <vinod.koul@...el.com>
Cc:     ALSA <alsa-devel@...a-project.org>,
        Charles Keepax <ckeepax@...nsource.cirrus.com>,
        Takashi <tiwai@...e.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        plai@...eaurora.org, LKML <linux-kernel@...r.kernel.org>,
        Sagar Dharia <sdharia@...eaurora.org>, patches.audio@...el.com,
        Mark <broonie@...nel.org>, srinivas.kandagatla@...aro.org,
        Sudheer Papothi <spapothi@...eaurora.org>, alan@...ux.intel.com
Subject: Re: [alsa-devel] [PATCH v4 06/15] soundwire: Add IO transfer

On 12/5/17 7:43 AM, Pierre-Louis Bossart wrote:
> On 12/5/17 12:31 AM, Vinod Koul wrote:
>> On Sun, Dec 03, 2017 at 09:01:41PM -0600, Pierre-Louis Bossart wrote:
>>> On 12/3/17 11:04 AM, Vinod Koul wrote:
>>>> On Fri, Dec 01, 2017 at 05:27:31PM -0600, Pierre-Louis Bossart wrote:
>>
>> Sorry looks like I missed replying to this one earlier.
>>
>>>>>> +static inline int find_response_code(enum sdw_command_response resp)
>>>>>> +{
>>>>>> +    switch (resp) {
>>>>>> +    case SDW_CMD_OK:
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    case SDW_CMD_IGNORED:
>>>>>> +        return -ENODATA;
>>>>>> +
>>>>>> +    case SDW_CMD_TIMEOUT:
>>>>>> +        return -ETIMEDOUT;
>>>>>> +
>>>>>> +    default:
>>>>>> +        return -EIO;
>>>>>
>>>>> the 'default' case will handle both SDW_CMD_FAIL (which is a bus event
>>>>> usually due to bus clash or parity issues) and SDW_CMD_FAIL_OTHER
>>>>> (which is an imp-def IP event).
>>>>>
>>>>> Do they really belong in the same basket? From a debug perspective
>>>>> there is quite a bit of information lost.
>>>>
>>>> at higher level the error handling is same. the information is not
>>>> lost as it is expected that you would log it at error source.
>>>
>>> I don't understand this. It's certainly not the same for me if you
>>> detect an electric problem or if the IP is in the weeds. Logging at the
>>> source is fine but this filtering prevents higher levels from doing
>>> anything different.
>>
>> The point is higher levels like here can't do much other than bail out
>> and complain.
>>
>> Can you point out what would be different behaviour in each of these
>> cases?
>>
>>>>>> +static inline int do_transfer(struct sdw_bus *bus, struct sdw_msg *msg)
>>>>>> +{
>>>>>> +    int retry = bus->prop.err_threshold;
>>>>>> +    enum sdw_command_response resp;
>>>>>> +    int ret = 0, i;
>>>>>> +
>>>>>> +    for (i = 0; i <= retry; i++) {
>>>>>> +        resp = bus->ops->xfer_msg(bus, msg);
>>>>>> +        ret = find_response_code(resp);
>>>>>> +
>>>>>> +        /* if cmd is ok or ignored return */
>>>>>> +        if (ret == 0 || ret == -ENODATA)
>>>>>
>>>>> Can you document why you don't retry on a CMD_IGNORED? I know there
>>>>> was a reason, I just can't remember it.
>>>>
>>>> CMD_IGNORED can be okay on broadcast. User of this API can retry all
>>>> they want!
>>>
>>> So you retry if this is a CMD_FAILED but let higher levels retry for
>>> CMD_IGNORED, sorry I don't see the logic.
>>
>> Yes that is right.
>>
>> If I am doing a broadcast read, let's say for Device Id registers, why
>> in the world would I want to retry? CMD_IGNORED is a valid response and
>> required to stop enumeration cycle in that case.
>>
>> But if I am not expecting a CMD_IGNORED response, I can very well go
>> ahead and retry from caller. The context is with caller and they can
>> choose to do appropriate handling.
>>
>> And I have clarified this a couple of times to you already, not sure how
>> many more times I would have to do that.
> 
> Until you clarify what you are doing.
> There is ONE case where IGNORED is a valid answer (reading the Prepare
> not finished bits), and it should not only be documented but analyzed in
> more detail.

Correction: I meant reading the SCP_DevID registers from Device0. The
prepare bits should never return a CMD_IGNORED.

> For a write an IGNORED is never OK.
> 
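
To make the IGNORED cases above concrete, here is a minimal sketch of what
caller-side handling could look like. It is only an illustration: the
xfer_kind enum and interpret_xfer_result() are hypothetical names that do
not exist in the patch, and the only facts taken from the thread are that
IGNORED maps to -ENODATA, that it is a valid answer when reading
SCP_DevID from Device0, and that it is never OK for a write.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: the kind of access the caller issued (not in the patch). */
enum xfer_kind {
	XFER_ENUM_READ,	/* read of SCP_DevID registers from Device0 */
	XFER_READ,	/* any other read */
	XFER_WRITE,	/* any write */
};

/*
 * Hypothetical helper: interpret the errno-style result of a transfer
 * using context that only the caller has.
 * Sets *enum_done when an ignored enumeration read ends the enumeration.
 */
static int interpret_xfer_result(enum xfer_kind kind, int ret, bool *enum_done)
{
	*enum_done = false;

	if (ret == -ENODATA && kind == XFER_ENUM_READ) {
		/* Nothing answered on Device0: enumeration is over, no error */
		*enum_done = true;
		return 0;
	}

	/* 0 stays 0; anything else, including an ignored write, is an error */
	return ret;
}
```

With this split, an ignored write comes back as -ENODATA and the caller
has to treat it as a failure, while an ignored Device0 read simply ends
the enumeration loop.
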
>>
>>>>> Now that I think of it, the retry on TIMEOUT makes no sense to me.
>>>>> The retry was intended for bus-level issues, where maybe a single bit
>>>>> error causes an issue without consequences, but the TIMEOUT is a
>>>>> completely different beast, it's the master IP that doesn't answer
>>>>> really, a completely different case.
>>>>
>>>> well in those cases where you have blue wires, it actually helps :)
>>>
>>> Blue wires are not supposed to change electrical behavior. TIMEOUT is
>>> only an internal SOC level issue, so no I don't get how this helps.
>>>
>>> You have a retry count that is provided in the BIOS/firmware through
>>> disco properties and it's meant for bus errors. You are abusing the
>>> definitions. A command failed is supposed to be detected at the frame
>>> rate, which is typically 20us. A timeout is likely a 100s of ms value,
>>> so if you retry on top it's going to lock up the bus.
>>
>> The world is not perfect! A guy debugging setups needs all the help. I do
>> not see any reason not to retry. Bus is anyway locked up while a
>> transfer is ongoing (we serialize transfers).
>>
>> Now if you feel this should be abhorred, I can change this for timeout.
> 
> This TIMEOUT thing is your own definition, it's not part of the spec, so 
> I don't see how it can be lumped together with spec-related parts.
> 
> It's fine to keep a retry but please document what the expectations are 
> for the TIMEOUT case.
> 
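
One possible way to document the TIMEOUT expectations is to give it its
own retry budget, separate from the err_threshold that comes from the
disco properties. The sketch below is an assumption, not what the patch
does: struct demo_bus, its timeout_retries and calls fields, and the two
stub transfer functions are hypothetical stand-ins so the loop can be
compiled outside the kernel. Only the enum values and the errno mapping
are copied from the patch.

```c
#include <errno.h>

/* Response codes as declared in the patch under review. */
enum sdw_command_response {
	SDW_CMD_OK = 0,
	SDW_CMD_IGNORED = 1,
	SDW_CMD_FAIL = 2,
	SDW_CMD_TIMEOUT = 4,
	SDW_CMD_FAIL_OTHER = 8,
};

/* Hypothetical stand-in for struct sdw_bus, reduced to what the loop needs. */
struct demo_bus {
	int err_threshold;	/* retry budget for bus-level failures */
	int timeout_retries;	/* hypothetical: separate budget for TIMEOUT */
	int calls;		/* number of transfer attempts made so far */
	enum sdw_command_response (*xfer)(struct demo_bus *bus);
};

/* Same mapping as find_response_code() in the patch. */
static int demo_response_code(enum sdw_command_response resp)
{
	switch (resp) {
	case SDW_CMD_OK:
		return 0;
	case SDW_CMD_IGNORED:
		return -ENODATA;
	case SDW_CMD_TIMEOUT:
		return -ETIMEDOUT;
	default:
		return -EIO;
	}
}

/* Retry failures up to err_threshold times, timeouts up to timeout_retries. */
static int demo_do_transfer(struct demo_bus *bus)
{
	int fail_left = bus->err_threshold;
	int timeout_left = bus->timeout_retries;
	int ret;

	for (;;) {
		ret = demo_response_code(bus->xfer(bus));
		bus->calls++;

		/* OK and IGNORED go straight back to the caller */
		if (ret == 0 || ret == -ENODATA)
			return ret;

		if (ret == -ETIMEDOUT) {
			/* A timeout is slow, so it draws on its own budget */
			if (timeout_left-- <= 0)
				return ret;
		} else if (fail_left-- <= 0) {
			return ret;
		}
	}
}

/* Hypothetical stubs standing in for bus->ops->xfer_msg(). */
static enum sdw_command_response demo_always_timeout(struct demo_bus *bus)
{
	(void)bus;
	return SDW_CMD_TIMEOUT;
}

static enum sdw_command_response demo_always_fail(struct demo_bus *bus)
{
	(void)bus;
	return SDW_CMD_FAIL;
}
```

With err_threshold = 3 and timeout_retries = 0, a bus that always fails
is tried four times, matching the `i <= retry` loop in the patch, while a
bus that always times out is tried once.
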
>>
>>>>>> +enum sdw_command_response {
>>>>>> +    SDW_CMD_OK = 0,
>>>>>> +    SDW_CMD_IGNORED = 1,
>>>>>> +    SDW_CMD_FAIL = 2,
>>>>>> +    SDW_CMD_TIMEOUT = 4,
>>>>>> +    SDW_CMD_FAIL_OTHER = 8,
>>>>>
>>>>> Humm, I can't recall if/why this is a mask? does it need to be?
>>>>
>>>> mask, not following!
>>>>
>>>> Taking a wild guess that you are asking about last error, which is
>>>> for SW errors like malloc fail etc...
>>>
>>> no, I was asking why this is declared as if it was used for a
>>> bitmask, why not 0,1,2,3,4?
>>
>> Oh okay, I think it was something to do with bits for errors, but I
>> don't see it helping so I can change it either way...
> 
> Unless you use bit-wise operators and combined responses there is no 
> reason to keep the current definitions.
> 
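
For reference, a sketch of what the plain 0,1,2,3,4 numbering could look
like. The enum and function names here are hypothetical (chosen so they
do not clash with the patch), and the errno mapping is simply copied from
find_response_code() as posted. Since the values are only ever compared
in a switch, never OR-ed together, the renumbering changes nothing for
callers.

```c
#include <errno.h>

/* Hypothetical renumbering: sequential values instead of bit positions. */
enum sdw_response_seq {
	SEQ_CMD_OK,		/* 0 */
	SEQ_CMD_IGNORED,	/* 1 */
	SEQ_CMD_FAIL,		/* 2 */
	SEQ_CMD_TIMEOUT,	/* 3 */
	SEQ_CMD_FAIL_OTHER,	/* 4 */
};

/* Same mapping as the patch; only the enum values changed. */
static int seq_response_code(enum sdw_response_seq resp)
{
	switch (resp) {
	case SEQ_CMD_OK:
		return 0;
	case SEQ_CMD_IGNORED:
		return -ENODATA;
	case SEQ_CMD_TIMEOUT:
		return -ETIMEDOUT;
	default:
		/* SEQ_CMD_FAIL and SEQ_CMD_FAIL_OTHER both end up here */
		return -EIO;
	}
}
```
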