lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 27 Jun 2024 18:49:02 +0200
From: Fabrice Gasnier <fabrice.gasnier@...s.st.com>
To: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
CC: Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
        Greg Kroah-Hartman
	<gregkh@...uxfoundation.org>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>,
        Alexandre Torgue <alexandre.torgue@...s.st.com>,
        Neil Armstrong
	<neil.armstrong@...aro.org>,
        <linux-usb@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        Nikita Travkin <nikita@...n.ru>,
        <linux-stm32@...md-mailman.stormreply.com>,
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [Linux-stm32] [PATCH v2 6/7] usb: typec: ucsi: extract common
 code for command handling

On 6/27/24 17:58, Dmitry Baryshkov wrote:
> On Thu, 27 Jun 2024 at 18:51, Fabrice Gasnier
> <fabrice.gasnier@...s.st.com> wrote:
>>
>> On 6/25/24 18:49, Dmitry Baryshkov wrote:
>>> On Tue, Jun 25, 2024 at 05:24:54PM GMT, Fabrice Gasnier wrote:
>>>> On 6/21/24 00:55, Dmitry Baryshkov wrote:
>>>>> Extract common functions to handle command sending and to handle events
>>>>> from UCSI. This ensures that all UCSI glue drivers handle the ACKs in
>>>>> the same way.
>>>>>
>>>>> The CCG driver used DEV_CMD_PENDING both for internal
>>>>> firmware-related commands and for UCSI control handling. Leave the
>>>>> former use case intact.
>>>>>
>>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
>>>>> ---
>>>>>  drivers/usb/typec/ucsi/ucsi.c           | 43 +++++++++++++++++++++++++++
>>>>>  drivers/usb/typec/ucsi/ucsi.h           |  7 +++++
>>>>>  drivers/usb/typec/ucsi/ucsi_acpi.c      | 46 ++---------------------------
>>>>>  drivers/usb/typec/ucsi/ucsi_ccg.c       | 21 ++-----------
>>>>>  drivers/usb/typec/ucsi/ucsi_glink.c     | 47 ++---------------------------
>>>>>  drivers/usb/typec/ucsi/ucsi_stm32g0.c   | 44 ++--------------------------
>>>>>  drivers/usb/typec/ucsi/ucsi_yoga_c630.c | 52 ++-------------------------------
>>>>>  7 files changed, 62 insertions(+), 198 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
>>>>> index 4ba22323dbf9..691ee0c4ef87 100644
>>>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>>>> @@ -36,6 +36,48 @@
>>>>>   */
>>>>>  #define UCSI_SWAP_TIMEOUT_MS       5000
>>>>>
>>>>> +void ucsi_notify_common(struct ucsi *ucsi, u32 cci)
>>>>> +{
>>>>> +   if (UCSI_CCI_CONNECTOR(cci))
>>>>> +           ucsi_connector_change(ucsi, UCSI_CCI_CONNECTOR(cci));
>>>>> +
>>>>> +   if (cci & UCSI_CCI_ACK_COMPLETE &&
>>>>> +       test_bit(ACK_PENDING, &ucsi->flags))
>>>>> +           complete(&ucsi->complete);
>>>>> +
>>>>> +   if (cci & UCSI_CCI_COMMAND_COMPLETE &&
>>>>> +       test_bit(COMMAND_PENDING, &ucsi->flags))
>>>>> +           complete(&ucsi->complete);
>>>>
>>>> Hi Dmitry,
>>>>
>>>> I've recently faced some race with ucsi_stm32g0 driver, and have sent a
>>>> fix for it [1], as you've noticed in the cover letter.
>>>>
>>>> To fix that, I've used test_and_clear_bit() in above two cases, instead
>>>> of test_bit().
>>>
>>> Could you possible describe, why do you need test_and_clear_bit()
>>> instead of just test_bit()? The bits are cleared at the end of the
>>> .sync_write(), also there can be no other command (or ACK_CC) submission
>>> before this one is fully processed.
>>
>> Hi Dmitry,
>>
>> It took me some time to reproduce this race I observed earlier.
>> (I observe this during DR swap.)
>>
>> Once the ->async_control(UCSI_ACK_CC_CI) call bellow gets completed, and
>> before the ACK_PENDING bit gets cleared, e.g. clear_bit(ACK_PENDING), I
>> get an asynchronous interrupt.
>>
>> Basically, Then the above complete() gets called (due to
>> UCSI_CCI_ACK_COMPLETE & ACK_PENDING).
>>
>> Subsequent UCSI_GET_CONNECTOR_STATUS command (from
>> ucsi_handle_connector_change) will be unblocked immediately due to
>> complete() call has already happen, without UCSI_CCI_COMMAND_COMPLETE
>> cci flag, hence returning -EIO.
> 
> But the ACK_CI is being sent as a response to a command. This means
> that the ppm_lock should be locked. The UCSI_GET_CONNECTOR_STATUS
> command should wait for ppm_lock to be freed and only then it can
> proceed with sending the command. Maybe I'm misunderstanding the case
> or maybe there is a loophole somewhere.

There's probably also some miss-understanding at my end, I don't get how
the ppm_lock would protext from non atomic async_control()/clear_bit().

Let me share some trace here, I hope it can better show the difference
at my end when I get the race.
I've added some debug prints around these lines, to trace the set/clear
of the COMMAND/ACK_PENDING, main commands and cci flags.

normal case is:
---
ucsi_send_command_common: UCSI_SET_UOR
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
set:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: ucsi_connector_change
ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE
set:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING

racy case:
---
ucsi_send_command_common: UCSI_SET_UOR
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
set:ACK_PENDING
ucsi_stm32g0_irq_handler              <-- up to here nothing supicious
ucsi_notify_common: ucsi_connector_change
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
ucsi_stm32g0_irq_handler              <-- 2nd IRQ before clr:ACK_PENDING
ucsi_notify_common: ucsi_connector_change
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING
ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
set:COMMAND_PENDING                   <-- completion already done :-(
clr:COMMAND_PENDING
ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-5)

I notice the two conditions are met above: both ucsi_connector_change()
and cci ack completed.

This happens before clear_bit(ACK_PENDING), which lead to anticipate
completion of the subsequent UCSI_GET_CONNECTOR_STATUS command.

So the test_and_clear_bit() would address a robustness case?

Please advise,
Best Regards,
Fabrice

> 
>> This is where the test_and_clear_bit() atomic operation helps, to avoid
>> non atomic operation:
>>
>> -> async_control(UCSI_ACK_CC_CI)
>> new interrupt may occur here
>> -> clear_bit(ACK_PENDING)
>>
>>>
>>>>
>>>> https://lore.kernel.org/linux-usb/20240612124656.2305603-1-fabrice.gasnier@foss.st.com/
>>>>
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(ucsi_notify_common);
>>>>> +
>>>>> +int ucsi_sync_control_common(struct ucsi *ucsi, u64 command)
>>>>> +{
>>>>> +   bool ack = UCSI_COMMAND(command) == UCSI_ACK_CC_CI;
>>>>> +   int ret;
>>>>> +
>>>>> +   if (ack)
>>>>> +           set_bit(ACK_PENDING, &ucsi->flags);
>>>>> +   else
>>>>> +           set_bit(COMMAND_PENDING, &ucsi->flags);
>>>>> +
>>>>> +   ret = ucsi->ops->async_control(ucsi, command);
>>>>> +   if (ret)
>>>>> +           goto out_clear_bit;
>>>>> +
>>>>> +   if (!wait_for_completion_timeout(&ucsi->complete, 5 * HZ))
>>>>> +           ret = -ETIMEDOUT;
>>>>
>>>> With test_and_clear_bit(), could return 0, in case of success here.
>>>
>>> Oh, I see. So your code returns earlier. I have a feeling that this
>>> approach is less logical and slightly harder to follow.
>>
>> By reading your proposal bellow, I'd agree with you.
>>>
>>> Maybe it's easier if it is implemented as:
>>>
>>> if (wait_for_completion_timeout(...))
>>>       return 0;
>>
>> Yes, sounds good to me.
>>
>>>
>>> if (ack)
>>>       clear_bit(ACK_PENDING)
>>> else
>>>       clear_bit(COMMAND_PENDING)
>>>
>>> return -ETIMEDOUT;
>>>
>>>
>>> OR
>>>
>>> if (!wait_for_completion_timeout(...)) {
>>>       if (ack)
>>>               clear_bit(ACK_PENDING)
>>>       else
>>>               clear_bit(COMMAND_PENDING)
>>>
>>>       return -ETIMEDOUT;
>>> }
>>
>> Both seems fine.
>>
>> Please advise,
>> BR,
>> Fabrice
>>
>>>
>>> return 0;
>>>
>>> But really, unless there is an actual issue with the current code, I'd
>>> prefer to keep it. It makes it clear that the bits are set and then are
>>> cleared properly.
>>>
>>>> I'd suggest to use similar approach here, unless you see some drawback?
>>>>
>>>> Best Regards,
>>>> Fabrice
>>>>
>>>>> +
>>>>> +out_clear_bit:
>>>>> +   if (ack)
>>>>> +           clear_bit(ACK_PENDING, &ucsi->flags);
>>>>> +   else
>>>>> +           clear_bit(COMMAND_PENDING, &ucsi->flags);
>>>>> +
>>>>> +   return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(ucsi_sync_control_common);
>>>>> +
>>>>>  static int ucsi_acknowledge(struct ucsi *ucsi, bool conn_ack)
>>>>>  {
>>>>>     u64 ctrl;
>>>>> @@ -1883,6 +1925,7 @@ struct ucsi *ucsi_create(struct device *dev, const struct ucsi_operations *ops)
>>>>>     INIT_WORK(&ucsi->resume_work, ucsi_resume_work);
>>>>>     INIT_DELAYED_WORK(&ucsi->work, ucsi_init_work);
>>>>>     mutex_init(&ucsi->ppm_lock);
>>>>> +   init_completion(&ucsi->complete);
>>>>>     ucsi->dev = dev;
>>>>>     ucsi->ops = ops;
>>>>
>>>> [snip]
>>>>
>>>>>     ucsi->ucsi = ucsi_create(dev, &pmic_glink_ucsi_ops);
>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi_stm32g0.c b/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> index 14737ca3724c..d948c3f579e1 100644
>>>>> --- a/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> +++ b/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> @@ -61,11 +61,7 @@ struct ucsi_stm32g0 {
>>>>
>>>> [snip]
>>>>
>>>>> -
>>>>> -   ret = ucsi_stm32g0_async_control(ucsi, command);
>>>>> -   if (ret)
>>>>> -           goto out_clear_bit;
>>>>> -
>>>>> -   if (!wait_for_completion_timeout(&g0->complete, msecs_to_jiffies(5000)))
>>>>> -           ret = -ETIMEDOUT;
>>>>> -   else
>>>>> -           return 0;
>>>>> -
>>>>> -out_clear_bit:
>>>>> -   if (ack)
>>>>> -           clear_bit(ACK_PENDING, &g0->flags);
>>>>> -   else
>>>>> -           clear_bit(COMMAND_PENDING, &g0->flags);
>>>>> -
>>>>> -   return ret;
>>>>> -}
>>>>> -
>>>>>  static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data)
>>>>>  {
>>>>>     struct ucsi_stm32g0 *g0 = data;
>>>>> @@ -449,13 +416,7 @@ static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data)
>>>>>     if (ret)
>>>>>             return IRQ_NONE;
>>>>>
>>>>> -   if (UCSI_CCI_CONNECTOR(cci))
>>>>> -           ucsi_connector_change(g0->ucsi, UCSI_CCI_CONNECTOR(cci));
>>>>> -
>>>>> -   if (cci & UCSI_CCI_ACK_COMPLETE && test_and_clear_bit(ACK_PENDING, &g0->flags))
>>>>> -           complete(&g0->complete);
>>>>> -   if (cci & UCSI_CCI_COMMAND_COMPLETE && test_and_clear_bit(COMMAND_PENDING, &g0->flags))
>>>>> -           complete(&g0->complete);
>>>>> +   ucsi_notify_common(g0->ucsi, cci);
>>>>
>>>> I can see the fix "test_and_clear_bit()" sent earlier is removed from here.
>>>>
>>>> I'd suggest to use similar approach as here, unless you see some drawback?
>>>>
>>>> Please advise,
>>>> Best Regards,
>>>> Fabrice
>>>
> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ