lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Fri, 28 Jun 2024 11:27:42 +0200
From: Fabrice Gasnier <fabrice.gasnier@...s.st.com>
To: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
CC: Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
        Greg Kroah-Hartman
	<gregkh@...uxfoundation.org>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>,
        Alexandre Torgue <alexandre.torgue@...s.st.com>,
        Neil Armstrong
	<neil.armstrong@...aro.org>,
        <linux-usb@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        Nikita Travkin <nikita@...n.ru>,
        <linux-stm32@...md-mailman.stormreply.com>,
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [Linux-stm32] [PATCH v2 6/7] usb: typec: ucsi: extract common
 code for command handling

On 6/27/24 22:14, Dmitry Baryshkov wrote:
> On Thu, Jun 27, 2024 at 06:49:02PM GMT, Fabrice Gasnier wrote:
>> On 6/27/24 17:58, Dmitry Baryshkov wrote:
>>> On Thu, 27 Jun 2024 at 18:51, Fabrice Gasnier
>>> <fabrice.gasnier@...s.st.com> wrote:
>>>>
>>>> On 6/25/24 18:49, Dmitry Baryshkov wrote:
>>>>> On Tue, Jun 25, 2024 at 05:24:54PM GMT, Fabrice Gasnier wrote:
>>>>>> On 6/21/24 00:55, Dmitry Baryshkov wrote:
>>>>>>> Extract common functions to handle command sending and to handle events
>>>>>>> from UCSI. This ensures that all UCSI glue drivers handle the ACKs in
>>>>>>> the same way.
>>>>>>>
>>>>>>> The CCG driver used DEV_CMD_PENDING both for internal
>>>>>>> firmware-related commands and for UCSI control handling. Leave the
>>>>>>> former use case intact.
>>>>>>>
>>>>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
>>>>>>> ---
>>>>>>>  drivers/usb/typec/ucsi/ucsi.c           | 43 +++++++++++++++++++++++++++
>>>>>>>  drivers/usb/typec/ucsi/ucsi.h           |  7 +++++
>>>>>>>  drivers/usb/typec/ucsi/ucsi_acpi.c      | 46 ++---------------------------
>>>>>>>  drivers/usb/typec/ucsi/ucsi_ccg.c       | 21 ++-----------
>>>>>>>  drivers/usb/typec/ucsi/ucsi_glink.c     | 47 ++---------------------------
>>>>>>>  drivers/usb/typec/ucsi/ucsi_stm32g0.c   | 44 ++--------------------------
>>>>>>>  drivers/usb/typec/ucsi/ucsi_yoga_c630.c | 52 ++-------------------------------
>>>>>>>  7 files changed, 62 insertions(+), 198 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
>>>>>>> index 4ba22323dbf9..691ee0c4ef87 100644
>>>>>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>>>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>>>>>> @@ -36,6 +36,48 @@
>>>>>>>   */
>>>>>>>  #define UCSI_SWAP_TIMEOUT_MS       5000
>>>>>>>
>>>>>>> +void ucsi_notify_common(struct ucsi *ucsi, u32 cci)
>>>>>>> +{
>>>>>>> +   if (UCSI_CCI_CONNECTOR(cci))
>>>>>>> +           ucsi_connector_change(ucsi, UCSI_CCI_CONNECTOR(cci));
>>>>>>> +
>>>>>>> +   if (cci & UCSI_CCI_ACK_COMPLETE &&
>>>>>>> +       test_bit(ACK_PENDING, &ucsi->flags))
>>>>>>> +           complete(&ucsi->complete);
>>>>>>> +
>>>>>>> +   if (cci & UCSI_CCI_COMMAND_COMPLETE &&
>>>>>>> +       test_bit(COMMAND_PENDING, &ucsi->flags))
>>>>>>> +           complete(&ucsi->complete);
>>>>>>
>>>>>> Hi Dmitry,
>>>>>>
>>>>>> I've recently faced some race with ucsi_stm32g0 driver, and have sent a
>>>>>> fix for it [1], as you've noticed in the cover letter.
>>>>>>
>>>>>> To fix that, I've used test_and_clear_bit() in above two cases, instead
>>>>>> of test_bit().
>>>>>
>>>>> Could you possible describe, why do you need test_and_clear_bit()
>>>>> instead of just test_bit()? The bits are cleared at the end of the
>>>>> .sync_write(), also there can be no other command (or ACK_CC) submission
>>>>> before this one is fully processed.
>>>>
>>>> Hi Dmitry,
>>>>
>>>> It took me some time to reproduce this race I observed earlier.
>>>> (I observe this during DR swap.)
>>>>
>>>> Once the ->async_control(UCSI_ACK_CC_CI) call bellow gets completed, and
>>>> before the ACK_PENDING bit gets cleared, e.g. clear_bit(ACK_PENDING), I
>>>> get an asynchronous interrupt.
>>>>
>>>> Basically, Then the above complete() gets called (due to
>>>> UCSI_CCI_ACK_COMPLETE & ACK_PENDING).
>>>>
>>>> Subsequent UCSI_GET_CONNECTOR_STATUS command (from
>>>> ucsi_handle_connector_change) will be unblocked immediately due to
>>>> complete() call has already happen, without UCSI_CCI_COMMAND_COMPLETE
>>>> cci flag, hence returning -EIO.
>>>
>>> But the ACK_CI is being sent as a response to a command. This means
>>> that the ppm_lock should be locked. The UCSI_GET_CONNECTOR_STATUS
>>> command should wait for ppm_lock to be freed and only then it can
>>> proceed with sending the command. Maybe I'm misunderstanding the case
>>> or maybe there is a loophole somewhere.
>>
>> There's probably also some miss-understanding at my end, I don't get how
>> the ppm_lock would protext from non atomic async_control()/clear_bit().
>>
>> Let me share some trace here, I hope it can better show the difference
>> at my end when I get the race.
>> I've added some debug prints around these lines, to trace the set/clear
>> of the COMMAND/ACK_PENDING, main commands and cci flags.
>>
>> normal case is:
>> ---
>> ucsi_send_command_common: UCSI_SET_UOR
>> set:COMMAND_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
>> clr:COMMAND_PENDING
>> ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
>> set:ACK_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
>> clr:ACK_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: ucsi_connector_change
>> ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
>> set:COMMAND_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
>> clr:COMMAND_PENDING
>> ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE
>> set:ACK_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
>> clr:ACK_PENDING
>>
>> racy case:
>> ---
>> ucsi_send_command_common: UCSI_SET_UOR
>> set:COMMAND_PENDING
>> ucsi_stm32g0_irq_handler
>> ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE
>> clr:COMMAND_PENDING
>> ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE
>> set:ACK_PENDING
>> ucsi_stm32g0_irq_handler              <-- up to here nothing supicious
>> ucsi_notify_common: ucsi_connector_change
>> ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
>> ucsi_stm32g0_irq_handler              <-- 2nd IRQ before clr:ACK_PENDING
>> ucsi_notify_common: ucsi_connector_change
>> ucsi_notify_common: UCSI_CCI_ACK_COMPLETE
>> clr:ACK_PENDING
>> ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS
>> set:COMMAND_PENDING                   <-- completion already done :-(
>> clr:COMMAND_PENDING
>> ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-5)
>>
>> I notice the two conditions are met above: both ucsi_connector_change()
>> and cci ack completed.
>>
>> This happens before clear_bit(ACK_PENDING), which lead to anticipate
>> completion of the subsequent UCSI_GET_CONNECTOR_STATUS command.
>>
>> So the test_and_clear_bit() would address a robustness case?
> 
> Ah, I see. So the race is for the completion. I fear that your solution
> also doesn't fully solve the problem. The event can arrive after setting
> the bit and before sending the command to the hardware. I thought about
> doing various tricks, like reinit_completion or something close, but the
> issue is that test_and_clear_bit() just makes the race window smaller,
> but doesn't eliminate it completely.

Hi Dmitry,

I've done some tests with reinit_completion(), when entering the
ucsi_sync_control_common() routine:

+	reinit_completion(&ucsi->complete);

	if (ack)
		set_bit(ACK_PENDING, &ucsi->flags);
	else
		set_bit(COMMAND_PENDING, &ucsi->flags);


This indeed avoids the race on the completion at my end.
With that, I longer get the error on the subsequent
UCSI_GET_CONNECTOR_STATUS command.

It looks like a better compromise to me, than test_and_clear_bit() as
you pointed out.

Best Regards,
Fabrice

> 
> It seems the only practical way to handle that is by making sure that
> ucsi_notify_common() can sleep and locks the ppm_lock while
> processing the CCI.
> 
>>
>> Please advise,
>> Best Regards,
>> Fabrice
>>
>>>
>>>> This is where the test_and_clear_bit() atomic operation helps, to avoid
>>>> non atomic operation:
>>>>
>>>> -> async_control(UCSI_ACK_CC_CI)
>>>> new interrupt may occur here
>>>> -> clear_bit(ACK_PENDING)
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ