[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1ed93216-9445-4b31-85cf-5b7f35ee91c2@foss.st.com>
Date: Thu, 27 Jun 2024 18:49:02 +0200
From: Fabrice Gasnier <fabrice.gasnier@...s.st.com>
To: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
CC: Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
Greg Kroah-Hartman
<gregkh@...uxfoundation.org>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
Alexandre Torgue <alexandre.torgue@...s.st.com>,
Neil Armstrong
<neil.armstrong@...aro.org>,
<linux-usb@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
Nikita Travkin <nikita@...n.ru>,
<linux-stm32@...md-mailman.stormreply.com>,
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [Linux-stm32] [PATCH v2 6/7] usb: typec: ucsi: extract common
code for command handling
On 6/27/24 17:58, Dmitry Baryshkov wrote:
> On Thu, 27 Jun 2024 at 18:51, Fabrice Gasnier
> <fabrice.gasnier@...s.st.com> wrote:
>>
>> On 6/25/24 18:49, Dmitry Baryshkov wrote:
>>> On Tue, Jun 25, 2024 at 05:24:54PM GMT, Fabrice Gasnier wrote:
>>>> On 6/21/24 00:55, Dmitry Baryshkov wrote:
>>>>> Extract common functions to handle command sending and to handle events
>>>>> from UCSI. This ensures that all UCSI glue drivers handle the ACKs in
>>>>> the same way.
>>>>>
>>>>> The CCG driver used DEV_CMD_PENDING both for internal
>>>>> firmware-related commands and for UCSI control handling. Leave the
>>>>> former use case intact.
>>>>>
>>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
>>>>> ---
>>>>> drivers/usb/typec/ucsi/ucsi.c | 43 +++++++++++++++++++++++++++
>>>>> drivers/usb/typec/ucsi/ucsi.h | 7 +++++
>>>>> drivers/usb/typec/ucsi/ucsi_acpi.c | 46 ++---------------------------
>>>>> drivers/usb/typec/ucsi/ucsi_ccg.c | 21 ++-----------
>>>>> drivers/usb/typec/ucsi/ucsi_glink.c | 47 ++---------------------------
>>>>> drivers/usb/typec/ucsi/ucsi_stm32g0.c | 44 ++--------------------------
>>>>> drivers/usb/typec/ucsi/ucsi_yoga_c630.c | 52 ++-------------------------------
>>>>> 7 files changed, 62 insertions(+), 198 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
>>>>> index 4ba22323dbf9..691ee0c4ef87 100644
>>>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>>>> @@ -36,6 +36,48 @@
>>>>> */
>>>>> #define UCSI_SWAP_TIMEOUT_MS 5000
>>>>>
>>>>> +void ucsi_notify_common(struct ucsi *ucsi, u32 cci)
>>>>> +{
>>>>> + if (UCSI_CCI_CONNECTOR(cci))
>>>>> + ucsi_connector_change(ucsi, UCSI_CCI_CONNECTOR(cci));
>>>>> +
>>>>> + if (cci & UCSI_CCI_ACK_COMPLETE &&
>>>>> + test_bit(ACK_PENDING, &ucsi->flags))
>>>>> + complete(&ucsi->complete);
>>>>> +
>>>>> + if (cci & UCSI_CCI_COMMAND_COMPLETE &&
>>>>> + test_bit(COMMAND_PENDING, &ucsi->flags))
>>>>> + complete(&ucsi->complete);
>>>>
>>>> Hi Dmitry,
>>>>
>>>> I've recently faced some race with ucsi_stm32g0 driver, and have sent a
>>>> fix for it [1], as you've noticed in the cover letter.
>>>>
>>>> To fix that, I've used test_and_clear_bit() in above two cases, instead
>>>> of test_bit().
>>>
>>> Could you possible describe, why do you need test_and_clear_bit()
>>> instead of just test_bit()? The bits are cleared at the end of the
>>> .sync_write(), also there can be no other command (or ACK_CC) submission
>>> before this one is fully processed.
>>
>> Hi Dmitry,
>>
>> It took me some time to reproduce this race I observed earlier.
>> (I observe this during DR swap.)
>>
>> Once the ->async_control(UCSI_ACK_CC_CI) call bellow gets completed, and
>> before the ACK_PENDING bit gets cleared, e.g. clear_bit(ACK_PENDING), I
>> get an asynchronous interrupt.
>>
>> Basically, Then the above complete() gets called (due to
>> UCSI_CCI_ACK_COMPLETE & ACK_PENDING).
>>
>> Subsequent UCSI_GET_CONNECTOR_STATUS command (from
>> ucsi_handle_connector_change) will be unblocked immediately due to
>> complete() call has already happen, without UCSI_CCI_COMMAND_COMPLETE
>> cci flag, hence returning -EIO.
>
> But the ACK_CI is being sent as a response to a command. This means
> that the ppm_lock should be locked. The UCSI_GET_CONNECTOR_STATUS
> command should wait for ppm_lock to be freed and only then it can
> proceed with sending the command. Maybe I'm misunderstanding the case
> or maybe there is a loophole somewhere.
There's probably also some miss-understanding at my end, I don't get how
the ppm_lock would protext from non atomic async_control()/clear_bit().
Let me share some trace here, I hope it can better show the difference
at my end when I get the race.
I've added some debug prints around these lines, to trace the set/clear
of the COMMAND/ACK_PENDING, main commands and cci flags.
normal case is:
---
ucsi_send_command_common: UCSI_SET_UOR
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
set:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: ucsi_connector_change
ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE
set:ACK_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING
racy case:
---
ucsi_send_command_common: UCSI_SET_UOR
set:COMMAND_PENDING
ucsi_stm32g0_irq_handler
ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
clr:COMMAND_PENDING
ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
set:ACK_PENDING
ucsi_stm32g0_irq_handler <-- up to here nothing supicious
ucsi_notify_common: ucsi_connector_change
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
ucsi_stm32g0_irq_handler <-- 2nd IRQ before clr:ACK_PENDING
ucsi_notify_common: ucsi_connector_change
ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
clr:ACK_PENDING
ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
set:COMMAND_PENDING <-- completion already done :-(
clr:COMMAND_PENDING
ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-5)
I notice the two conditions are met above: both ucsi_connector_change()
and cci ack completed.
This happens before clear_bit(ACK_PENDING), which lead to anticipate
completion of the subsequent UCSI_GET_CONNECTOR_STATUS command.
So the test_and_clear_bit() would address a robustness case?
Please advise,
Best Regards,
Fabrice
>
>> This is where the test_and_clear_bit() atomic operation helps, to avoid
>> non atomic operation:
>>
>> -> async_control(UCSI_ACK_CC_CI)
>> new interrupt may occur here
>> -> clear_bit(ACK_PENDING)
>>
>>>
>>>>
>>>> https://lore.kernel.org/linux-usb/20240612124656.2305603-1-fabrice.gasnier@foss.st.com/
>>>>
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(ucsi_notify_common);
>>>>> +
>>>>> +int ucsi_sync_control_common(struct ucsi *ucsi, u64 command)
>>>>> +{
>>>>> + bool ack = UCSI_COMMAND(command) == UCSI_ACK_CC_CI;
>>>>> + int ret;
>>>>> +
>>>>> + if (ack)
>>>>> + set_bit(ACK_PENDING, &ucsi->flags);
>>>>> + else
>>>>> + set_bit(COMMAND_PENDING, &ucsi->flags);
>>>>> +
>>>>> + ret = ucsi->ops->async_control(ucsi, command);
>>>>> + if (ret)
>>>>> + goto out_clear_bit;
>>>>> +
>>>>> + if (!wait_for_completion_timeout(&ucsi->complete, 5 * HZ))
>>>>> + ret = -ETIMEDOUT;
>>>>
>>>> With test_and_clear_bit(), could return 0, in case of success here.
>>>
>>> Oh, I see. So your code returns earlier. I have a feeling that this
>>> approach is less logical and slightly harder to follow.
>>
>> By reading your proposal bellow, I'd agree with you.
>>>
>>> Maybe it's easier if it is implemented as:
>>>
>>> if (wait_for_completion_timeout(...))
>>> return 0;
>>
>> Yes, sounds good to me.
>>
>>>
>>> if (ack)
>>> clear_bit(ACK_PENDING)
>>> else
>>> clear_bit(COMMAND_PENDING)
>>>
>>> return -ETIMEDOUT;
>>>
>>>
>>> OR
>>>
>>> if (!wait_for_completion_timeout(...)) {
>>> if (ack)
>>> clear_bit(ACK_PENDING)
>>> else
>>> clear_bit(COMMAND_PENDING)
>>>
>>> return -ETIMEDOUT;
>>> }
>>
>> Both seems fine.
>>
>> Please advise,
>> BR,
>> Fabrice
>>
>>>
>>> return 0;
>>>
>>> But really, unless there is an actual issue with the current code, I'd
>>> prefer to keep it. It makes it clear that the bits are set and then are
>>> cleared properly.
>>>
>>>> I'd suggest to use similar approach here, unless you see some drawback?
>>>>
>>>> Best Regards,
>>>> Fabrice
>>>>
>>>>> +
>>>>> +out_clear_bit:
>>>>> + if (ack)
>>>>> + clear_bit(ACK_PENDING, &ucsi->flags);
>>>>> + else
>>>>> + clear_bit(COMMAND_PENDING, &ucsi->flags);
>>>>> +
>>>>> + return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(ucsi_sync_control_common);
>>>>> +
>>>>> static int ucsi_acknowledge(struct ucsi *ucsi, bool conn_ack)
>>>>> {
>>>>> u64 ctrl;
>>>>> @@ -1883,6 +1925,7 @@ struct ucsi *ucsi_create(struct device *dev, const struct ucsi_operations *ops)
>>>>> INIT_WORK(&ucsi->resume_work, ucsi_resume_work);
>>>>> INIT_DELAYED_WORK(&ucsi->work, ucsi_init_work);
>>>>> mutex_init(&ucsi->ppm_lock);
>>>>> + init_completion(&ucsi->complete);
>>>>> ucsi->dev = dev;
>>>>> ucsi->ops = ops;
>>>>
>>>> [snip]
>>>>
>>>>> ucsi->ucsi = ucsi_create(dev, &pmic_glink_ucsi_ops);
>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi_stm32g0.c b/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> index 14737ca3724c..d948c3f579e1 100644
>>>>> --- a/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> +++ b/drivers/usb/typec/ucsi/ucsi_stm32g0.c
>>>>> @@ -61,11 +61,7 @@ struct ucsi_stm32g0 {
>>>>
>>>> [snip]
>>>>
>>>>> -
>>>>> - ret = ucsi_stm32g0_async_control(ucsi, command);
>>>>> - if (ret)
>>>>> - goto out_clear_bit;
>>>>> -
>>>>> - if (!wait_for_completion_timeout(&g0->complete, msecs_to_jiffies(5000)))
>>>>> - ret = -ETIMEDOUT;
>>>>> - else
>>>>> - return 0;
>>>>> -
>>>>> -out_clear_bit:
>>>>> - if (ack)
>>>>> - clear_bit(ACK_PENDING, &g0->flags);
>>>>> - else
>>>>> - clear_bit(COMMAND_PENDING, &g0->flags);
>>>>> -
>>>>> - return ret;
>>>>> -}
>>>>> -
>>>>> static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data)
>>>>> {
>>>>> struct ucsi_stm32g0 *g0 = data;
>>>>> @@ -449,13 +416,7 @@ static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data)
>>>>> if (ret)
>>>>> return IRQ_NONE;
>>>>>
>>>>> - if (UCSI_CCI_CONNECTOR(cci))
>>>>> - ucsi_connector_change(g0->ucsi, UCSI_CCI_CONNECTOR(cci));
>>>>> -
>>>>> - if (cci & UCSI_CCI_ACK_COMPLETE && test_and_clear_bit(ACK_PENDING, &g0->flags))
>>>>> - complete(&g0->complete);
>>>>> - if (cci & UCSI_CCI_COMMAND_COMPLETE && test_and_clear_bit(COMMAND_PENDING, &g0->flags))
>>>>> - complete(&g0->complete);
>>>>> + ucsi_notify_common(g0->ucsi, cci);
>>>>
>>>> I can see the fix "test_and_clear_bit()" sent earlier is removed from here.
>>>>
>>>> I'd suggest to use similar approach as here, unless you see some drawback?
>>>>
>>>> Please advise,
>>>> Best Regards,
>>>> Fabrice
>>>
>
>
>
Powered by blists - more mailing lists