linux-kernel - Re: [PATCH v4 2/8] scpi: Add alternative legacy structures, functions and macros

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24e6ef19-b88d-a5b8-f514-d39a7b5725e8@baylibre.com>
Date:   Wed, 19 Oct 2016 12:28:33 +0200
From:   Neil Armstrong <narmstrong@...libre.com>
To:     Sudeep Holla <sudeep.holla@....com>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc:     linux-amlogic@...ts.infradead.org, khilman@...libre.com,
        heiko@...ech.de, wxt@...k-chips.com, frank.wang@...k-chips.com
Subject: Re: [PATCH v4 2/8] scpi: Add alternative legacy structures, functions
 and macros

On 10/17/2016 01:16 PM, Sudeep Holla wrote:
> 
> 
> On 17/10/16 09:25, Neil Armstrong wrote:
>> On 10/10/2016 04:36 PM, Sudeep Holla wrote:
>>> Hi Neil,
>>>
>>> Sorry, I could not reply to your response on v3. Anyways I will review v4.
>>>
>>> On 05/10/16 08:33, Neil Armstrong wrote:
>>>> This patch adds support for the Legacy SCPI protocol in early JUNO versions and
>>>> shipped Amlogic ARMv8 based SoCs. Some Rockchip SoC are also known to use this
>>>> version of protocol with extended vendor commands
>>>> .
>>>> In order to support the legacy SCPI protocol variant, add back the structures
>>>> and macros that varies against the final specification.
>>>> Then add indirection table for legacy commands.
>>>> Finally Add bitmap field for channel selection since the Legacy protocol mandates to
>>>> send a selected subset of the commands on the high priority channel instead of the
>>>> low priority channel.
>>>>
>>>> The message sending path differs from the final SCPI procotocol because the
>>>> Amlogic SCP firmware always reply 1 instead of a special value containing the command
>>>> byte and replied rx data length.
>>>> For this reason commands queuing cannot be used and we assume the reply command is
>>>> the head of the rx_pending list since we ensure sequential command sending with a
>>>> separate dedicated mutex.
>>>>
>>>> Signed-off-by: Neil Armstrong <narmstrong@...libre.com>
>>>> ---
>>>>  drivers/firmware/arm_scpi.c | 221 +++++++++++++++++++++++++++++++++++++++-----
>>>>  1 file changed, 199 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
>>>> index 498afa0..6244eb1 100644
>>>> --- a/drivers/firmware/arm_scpi.c
>>>> +++ b/drivers/firmware/arm_scpi.c
>>>
>>> [...]
>>>
>>>> @@ -307,21 +398,46 @@ static void scpi_process_cmd(struct scpi_chan *ch, u32 cmd)
>>>>          return;
>>>>      }
>>>>
>>>> -    list_for_each_entry(t, &ch->rx_pending, node)
>>>> -        if (CMD_XTRACT_UNIQ(t->cmd) == CMD_XTRACT_UNIQ(cmd)) {
>>>> -            list_del(&t->node);
>>>> -            match = t;
>>>> -            break;
>>>> -        }
>>>> +    /* Command type is not replied by the SCP Firmware in legacy Mode
>>>> +     * We should consider that command is the head of pending RX commands
>>>> +     * if the list is not empty. In TX only mode, the list would be empty.
>>>> +     */
>>>> +    if (scpi_info->is_legacy) {
>>>> +        match = list_first_entry(&ch->rx_pending, struct scpi_xfer,
>>>> +                     node);
>>>> +        list_del(&match->node);
>>>> +    } else {
>>>> +        list_for_each_entry(t, &ch->rx_pending, node)
>>>> +            if (CMD_XTRACT_UNIQ(t->cmd) == CMD_XTRACT_UNIQ(cmd)) {
>>>> +                list_del(&t->node);
>>>> +                match = t;
>>>> +                break;
>>>> +            }
>>>> +    }
>>>>      /* check if wait_for_completion is in progress or timed-out */
>>>>      if (match && !completion_done(&match->done)) {
>>>> -        struct scpi_shared_mem *mem = ch->rx_payload;
>>>> -        unsigned int len = min(match->rx_len, CMD_SIZE(cmd));
>>>> +        unsigned int len;
>>>> +
>>>> +        if (scpi_info->is_legacy) {
>>>> +            struct legacy_scpi_shared_mem *mem = ch->rx_payload;
>>>> +
>>>> +            /* RX Length is not replied by the lagcy Firmware */
> 
> Typo above legacy
> 
>>>> +            len = match->rx_len;
>>>> +
>>>> +            match->status = le32_to_cpu(mem->status);
>>>> +            memcpy_fromio(match->rx_buf, mem->payload, len);
>>>
>>> The above 2 seems common to both, no ?
>>
>> No, the shared_mem structure differs.
>>
> 
> Yes I see that, I was just referring the last 2 statements.
> 
>>>
>>>> +        } else {
>>>> +            struct scpi_shared_mem *mem = ch->rx_payload;
>>>> +
>>>> +            len = min(match->rx_len, CMD_SIZE(cmd));
>>>> +
>>>> +            match->status = le32_to_cpu(mem->status);
>>>> +            memcpy_fromio(match->rx_buf, mem->payload, len);
> 
> and the above 2 can be moved out of the conditions, no ?
> 
> if (scpi_info->is_legacy) {
>     struct legacy_scpi_shared_mem *mem = ch->rx_payload;
>     len = match->rx_len;
> } else {
>     struct scpi_shared_mem *mem = ch->rx_payload;
>     len = min(match->rx_len, CMD_SIZE(cmd));
> }
> match->status = le32_to_cpu(mem->status);
> memcpy_fromio(match->rx_buf, mem->payload, len);
> 
> should work.

Well, we will have "error: ‘mem’ undeclared (first use in this function)" since mem is not declared outside the if/else.

I don't see good solutions even with an union.

> 
> [...]
> 
>>>
>>>> +    else
>>>> +        cmd = le32_to_cpu(mem->command);
>>>>
>>>>      scpi_process_cmd(ch, cmd);
>>>>  }
>>>> @@ -343,17 +464,26 @@ static void scpi_tx_prepare(struct mbox_client *c, void *msg)
>>>>      struct scpi_chan *ch = container_of(c, struct scpi_chan, cl);
>>>>      struct scpi_shared_mem *mem = (struct scpi_shared_mem *)ch->tx_payload;
>>>>
>>>> -    if (t->tx_buf)
>>>> -        memcpy_toio(mem->payload, t->tx_buf, t->tx_len);
>>>> +    if (t->tx_buf) {
>>>> +        if (scpi_info->is_legacy)
>>>> +            memcpy_toio(ch->tx_payload, t->tx_buf, t->tx_len);
>>>> +        else
>>>> +            memcpy_toio(mem->payload, t->tx_buf, t->tx_len);
>>>> +    }
>>>> +
>>>>      if (t->rx_buf) {
>>>>          if (!(++ch->token))
>>>>              ++ch->token;
>>>>          ADD_SCPI_TOKEN(t->cmd, ch->token);
>>>> +        if (scpi_info->is_legacy)
>>>> +            t->slot = t->cmd;
>>>
>>> I thought passing token was not an issue from your previous response,
>>> but you are overriding it here, why ?
>>
>> Indeed, I can leave it, but it's useless since it won't serve to
>> distinguish multiple similar commands.
>>
> 
> OK, I don't see any point in such micro optimization, so please retain it.
> 

I misread my code, I leaved the token passing, but I copy back the cmd to the slot which is used by the MHU.
If I remove the "t->slot = t->cmd;", the token won't be passed to the FW.

> [...]
> 
>>>> +    /* Since we cannot distinguish the original command in the
>>>> +     * MHU reply stat value from a Legacy SCP firmware, ensure
>>>> +     * sequential command sending to the firmware.
>>>> +     */
>>>
>>> OK this comment now questions the existence of this extra lock.
>>> The mailbox will always send the commands in the sequential order.
>>> It's only firmware that can re-order the response. Since that can't
>>> happen in you case, I really don't see the need for this.
>>>
>>> Please explain the race you would see without this locking. Yes I
>>> understand that only one command is supposed to be sent to firmware at a
>>> time. Suppose you allow more callers here, all will wait on the
>>> completion flags and the first in the list gets unblocked right ?
>>> I am just trying to understand if there's real need for this extra
>>> lock when we already have that from the list.
>>
>> In my current tests I have huge kernel hang when having multiple callers,
>> I must find out where this issue comes from...
> 
> Yes IMO, you should understand the root cause of this issue. There may
> be issue with the existing driver itself. But just adding a lock just to
> avoid the hang without understanding it is wrong.
> 
>> In any case, we have an issue about the command sequencing. If we
>> push a tx-only command and then right after a tx-rx command, the
>> mailbox callback from the first command won't be able to distinguish
>> which command is handled !
> 
> Hmm, how exactly ? I won't expect scpi_handle_remote_msg to becalled in
> that case.
> 
>> In this case, the rx_pending list will not be empty, some garbage
>> will be returned to the second command handler and the real data from
>> the second command handling will be lost thinking it's a tx-only
>> command.
>>
> 
> Yes as I said why is scpi_handle_remote_msg called for tx only command.
> And more over we don't have any tx only command in the driver, I am
> still unable to understand the issue you are facing. Are you sure you
> have tx-only command in the failure/hang case ?
> 
>>
>> We have two choices here : - Also push the tx-only commands to the
>> rx_pending list, and also wait for their completion
> 
> See above, I need to know details on this tx-only command. In fact, they
> may not be tx-only as SCP is sending some response back, may just status.
> 
>> - Add an extra lock
>>
> 
> Not this for sure.
> 
>> What is your preferred scheme ?
>>
> 
> Option 1 if it legitimate case. I mean we may be misunderstanding the
> definition of tx-only command.
> 
> 
>>>> +    if (scpi_info->is_legacy)
>>>> +        mutex_lock(&scpi_chan->legacy_lock);
>>>> +
>>>>      ret = mbox_send_message(scpi_chan->chan, msg);
>>>>      if (ret < 0 || !rx_buf)
>>>>          goto out;
>>>> @@ -421,9 +567,13 @@ static int scpi_send_message(unsigned int offset, void *tx_buf,
>>>>          /* first status word */
>>>>          ret = msg->status;
>>>>  out:
>>>> -    if (ret < 0 && rx_buf) /* remove entry from the list if timed-out */
>>>> +    if (ret < 0 && rx_buf)
>>>> +        /* remove entry from the list if timed-out */
>>>>          scpi_process_cmd(scpi_chan, msg->cmd);
>>>>
>>>> +    if (scpi_info->is_legacy)
>>>> +        mutex_unlock(&scpi_chan->legacy_lock);
>>>> +
>>>>      put_scpi_xfer(msg, scpi_chan);
>>>>      /* SCPI error codes > 0, translate them to Linux scale*/
>>>>      return ret > 0 ? scpi_to_linux_errno(ret) : ret;
>>>
> 
> [...]
> 
>>
>> I will fix the issues, but I need your advice for the locking scheme. I really want this
>> to be merged and be able to go forward !
>>
> 
> Yes I agree and I have no major concern with the series now except the
> locking.
>