[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6a475ed4-8ca5-0422-e416-2a32f1afa8ef@linux.intel.com>
Date: Tue, 5 Dec 2017 08:48:03 -0600
From: Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
To: Vinod Koul <vinod.koul@...el.com>
Cc: ALSA <alsa-devel@...a-project.org>,
Charles Keepax <ckeepax@...nsource.cirrus.com>,
Takashi <tiwai@...e.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
plai@...eaurora.org, LKML <linux-kernel@...r.kernel.org>,
Sagar Dharia <sdharia@...eaurora.org>, patches.audio@...el.com,
Mark <broonie@...nel.org>, srinivas.kandagatla@...aro.org,
Sudheer Papothi <spapothi@...eaurora.org>, alan@...ux.intel.com
Subject: Re: [alsa-devel] [PATCH v4 06/15] soundwire: Add IO transfer
On 12/5/17 7:43 AM, Pierre-Louis Bossart wrote:
> On 12/5/17 12:31 AM, Vinod Koul wrote:
>> On Sun, Dec 03, 2017 at 09:01:41PM -0600, Pierre-Louis Bossart wrote:
>>> On 12/3/17 11:04 AM, Vinod Koul wrote:
>>>> On Fri, Dec 01, 2017 at 05:27:31PM -0600, Pierre-Louis Bossart wrote:
>>
>> Sorry looks like I missed replying to this one earlier.
>>
>>>>>> +static inline int find_response_code(enum sdw_command_response resp)
>>>>>> +{
>>>>>> + switch (resp) {
>>>>>> + case SDW_CMD_OK:
>>>>>> + return 0;
>>>>>> +
>>>>>> + case SDW_CMD_IGNORED:
>>>>>> + return -ENODATA;
>>>>>> +
>>>>>> + case SDW_CMD_TIMEOUT:
>>>>>> + return -ETIMEDOUT;
>>>>>> +
>>>>>> + default:
>>>>>> + return -EIO;
>>>>>
>>>>> the 'default' case will handle both SDW_CMD_FAIL (which is a bus event
>>>>> usually due to bus clash or parity issues) and SDW_CMD_FAIL_OTHER
>>>>> (which is
>>>>> an imp-def IP event).
>>>>>
>>>>> Do they really belong in the same basket? From a debug perspective
>>>>> there is
>>>>> quite a bit of information lost.
>>>>
>>>> at higher level the error handling is same. the information is not
>>>> lost as
>>>> it is expected that you would log it at error source.
>>>
>>> I don't understand this. It's certainly not the same for me if you
>>> detect an
>>> electric problem or if the IP is in the weeds. Logging at the source
>>> is fine
>>> but this filtering prevents higher levels from doing anything different.
>>
>> The point is higher levels like here cant do much than bail out and
>> complain.
>>
>> Can you point out what would be different behaviour in each of these
>> cases?
>>
>>>>>> +static inline int do_transfer(struct sdw_bus *bus, struct sdw_msg
>>>>>> *msg)
>>>>>> +{
>>>>>> + int retry = bus->prop.err_threshold;
>>>>>> + enum sdw_command_response resp;
>>>>>> + int ret = 0, i;
>>>>>> +
>>>>>> + for (i = 0; i <= retry; i++) {
>>>>>> + resp = bus->ops->xfer_msg(bus, msg);
>>>>>> + ret = find_response_code(resp);
>>>>>> +
>>>>>> + /* if cmd is ok or ignored return */
>>>>>> + if (ret == 0 || ret == -ENODATA)
>>>>>
>>>>> Can you document why you don't retry on a CMD_IGNORED? I know there
>>>>> was a
>>>>> reason, I just can't remember it.
>>>>
>>>> CMD_IGNORED can be okay on broadcast. User of this API can retry all
>>>> they
>>>> want!
>>>
>>> So you retry if this is a CMD_FAILED but let higher levels retry for
>>> CMD_IGNORED, sorry I don't see the logic.
>>
>> Yes that is right.
>>
>> If I am doing a broadcast read, lets say for Device Id registers, why
>> in the
>> world would I want to retry? CMD_IGNORED is a valid response and
>> required to
>> stop enumeration cycle in that case.
>>
>> But if I am not expecting a CMD_IGNORED response, I can very well go
>> ahead
>> and retry from caller. The context is with caller and they can choose
>> to do
>> appropriate handling.
>>
>> And I have clarified this couple of times to you already, not sure how
>> many
>> more times I would have to do that.
>
> Until you clarify what you are doing.
> There is ONE case where IGNORED is a valid answer (reading the Prepare
> not finished bits), and it should not only be documented but analyzed in
> more details.
I meant Read SCP_DevID registers from Device0... prepare bits should
never return a CMD_IGNORED
> For a write an IGNORED is never OK.
>
>>
>>>>> Now that I think of it, the retry on TIMEOUT makes no sense to me.
>>>>> The retry
>>>>> was intended for bus-level issues, where maybe a single bit error
>>>>> causes an
>>>>> issue without consequences, but the TIMEOUT is a completely
>>>>> different beast,
>>>>> it's the master IP that doesn't answer really, a completely
>>>>> different case.
>>>>
>>>> well in those cases where you have blue wires, it actually helps :)
>>>
>>> Blue wires are not supposed to change electrical behavior. TIMEOUT is
>>> only
>>> an internal SOC level issue, so no I don't get how this helps.
>>>
>>> You have a retry count that is provided in the BIOS/firmware through
>>> disco
>>> properties and it's meant to bus errors. You are abusing the
>>> definitions. A
>>> command failed is supposed to be detected at the frame rate, which is
>>> typically 20us. a timeout is likely a 100s of ms value, so if you
>>> retry on
>>> top it's going to lock up the bus.
>>
>> The world is not perfect! A guy debugging setups needs all the help. I do
>> not see any reason for not to retry. Bus is anyway locked up while a
>> transfer is ongoing (we serialize transfers).
>>
>> Now if you feel this should be abhorred, I can change this for timeout.
>
> This TIMEOUT thing is your own definition, it's not part of the spec, so
> I don't see how it can be lumped together with spec-related parts.
>
> It's fine to keep a retry but please document what the expectations are
> for the TIMEOUT case.
>
>>
>>>>>> +enum sdw_command_response {
>>>>>> + SDW_CMD_OK = 0,
>>>>>> + SDW_CMD_IGNORED = 1,
>>>>>> + SDW_CMD_FAIL = 2,
>>>>>> + SDW_CMD_TIMEOUT = 4,
>>>>>> + SDW_CMD_FAIL_OTHER = 8,
>>>>>
>>>>> Humm, I can't recall if/why this is a mask? does it need to be?
>>>>
>>>> mask, not following!
>>>>
>>>> Taking a wild guess that you are asking about last error, which is
>>>> for SW
>>>> errors like malloc fail etc...
>>>
>>> no, I was asking why this is declared as if it was used for a
>>> bitmask, why
>>> not 0,1,2,3,4?
>>
>> Oh okay, I think it was something to do with bits for errors, but don
>> see it
>> helping so I can change it either way...
>
> Unless you use bit-wise operators and combined responses there is no
> reason to keep the current definitions.
>
> _______________________________________________
> Alsa-devel mailing list
> Alsa-devel@...a-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Powered by blists - more mailing lists