lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e22d4f8e-0ca7-056e-e5ec-4fc97cbaf08b@quicinc.com>
Date:   Thu, 18 Aug 2022 11:17:24 -0700
From:   Elson Serrao <quic_eserrao@...cinc.com>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        "balbi@...nel.org" <balbi@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>,
        "quic_mrana@...cinc.com" <quic_mrana@...cinc.com>
Subject: Re: [PATCH 2/5] usb: gadget: Add function wakeup support



On 8/16/2022 4:51 PM, Thinh Nguyen wrote:
> On 8/16/2022, Elson Serrao wrote:
>>
>>
>> On 8/12/2022 5:46 PM, Thinh Nguyen wrote:
>>> On 8/11/2022, Thinh Nguyen wrote:
>>>> On 8/11/2022, Thinh Nguyen wrote:
>>>>> On 8/11/2022, Elson Serrao wrote:
>>>>>>
>>>>>>
>>>>>> On 8/9/2022 6:08 PM, Thinh Nguyen wrote:
>>>>>
>>>>> <snip>
>>>>>
>>>>>
>>>>>>> To summarize the points:
>>>>>>>
>>>>>>> 1) The host only arms function remote wakeup if the device is
>>>>>>> capable of
>>>>>>> remote wakeup (check USB_CONFIG_ATT_WAKEUP in bmAttributes and
>>>>>>> hardware
>>>>>>> capability)
>>>>>>>
>>>>>>> 2) If the device is in suspend, the device can do remote wakeup
>>>>>>> (through
>>>>>>> LFPS handshake) if the function is armed for remote wakeup (through
>>>>>>> SET_FEATURE(FUNC_SUSPEND)).
>>>>>>>
>>>>>>> 3) If the link transitions to U0 after the device triggering a remote
>>>>>>> wakeup, the device will also send device notification function
>>>>>>> wake for
>>>>>>> all the interfaces armed with remote wakeup.
>>>>>>>
>>>>>>> 4) If the device is not in suspend, the device can send device
>>>>>>> notification function wake if it's in U0.
>>>>>>>
>>>>>>>
>>>>>>> Now, remote wakeup and function wake device notification are 2
>>>>>>> separate
>>>>>>> operations. We have the usb_gadget_ops->wakeup() for remote wakeup. I
>>>>>>> suggested to maybe add
>>>>>>> usb_gadget_ops->send_wakeup_notification(gadget,
>>>>>>> intf_id) for the device notification. What you did was combining both
>>>>>>> operations in usb_gadget_ops->func_wakeup(). That may only work for
>>>>>>> point 4) (assuming you fix the U0 check), but not point 3).
>>>>>>
>>>>>> Thank you for your feedback and summary. I will rename func_wakeup to
>>>>>> send_wakeup_notification to better align with the approach. The
>>>>>> reason I
>>>>>> have combined remote_wakeup and function wake notification in
>>>>>> usb_gadget_ops->func_wakeup() is because since the implementation
>>>>>> is at
>>>>>> function/composite level it has no knowledge on the link state. So I
>>>>>> have delegated that task to controller driver to handle the
>>>>>> notification
>>>>>> accordingly. That is do a LFPS handshake first if the device is
>>>>>> suspended and then send notification (explained below). But we can
>>>>>> definitely separate this by adding an additional flag in the composite
>>>>>> layer to set the link state based on the gadget suspend callback
>>>>>> called
>>>>>> when U3 SUSPEND interrupt is received. Let me know if you feel
>>>>>> separating the two is a better approach.
>>>>>>
>>>>>
>>>>> The reason I think we need to separate it is because of point 3. As I
>>>>> note earlier, the spec states that "If remote wake event occurs in
>>>>> multiple functions, each function shall send a Function Wake
>>>>> Notification."
>>>>>
>>>>> But if there's no remote wake event, and the host brought the device up
>>>>> instead, then the function suspend state is retained.
>>>>>
>>>>> If we separate these 2 operations, the caller can check whether the
>>>>> operation went through properly. For example, if the wakeup() is
>>>>> initiated properly, but the function wake device notification didn't go
>>>>> through. We would only need to resend the device notification rather
>>>>> than initiate remote wakeup again.
>>>>
>>>> If we don't have to send device notification for other interfaces, we
>>>> can combine the operations here as you did.
>>>>
>>>
>>> I still think it's better to split up the operations. The way you're
>>> handling it now is not clear.
>>>
>>> If the func_awake() returns -EAGAIN, I'd expect that the remote wake did
>>> not go through and expect user to retry again. But here it does initiate
>>> remote wake, but it just does not send device notification yet. This is
>>> confusing.
>>>
>>> Also, instead of all the function wake handling coming from the function
>>> driver, now we depend on the controller driver to call function resume()
>>> on state change to U0, which will trigger device notification. What
>>> happen if it doesn't call resume(). There's too many dependencies and it
>>> seems fragile.
>>>
>>> I think all this can be handled in the function driver. You can fix the
>>> dwc3 wakeup() and poll for U0/ON state rather than RECOVERY state, which
>>> is what it's supposed to poll.
>>
>> For transitioning from U3 to U0, the current upstream implementation is
>> to poll for U0 state when dwc3_gadget_wakeup() is called and it is a
>> blocking call. (this is a common API for both HS and SS)
>>
>>      /* poll until Link State changes to ON */
>>      retries = 20000;
>>
>>      while (retries--) {
>>          reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>
>>          /* in HS, means ON */
>>          if (DWC3_DSTS_USBLNKST(reg) == DWC3_LINK_STATE_U0)
>>              break;
>>      }
>>
>> In my experiments I found that certain hosts take longer time to drive
>> HS resume signalling between the remote wakeup and the resume K and this
>> time varies across hosts. Hence the above polling timer is not generic
>> across hosts. So how do we converge on a polling timer value to work
>> across HS/SS and without blocking the system for a long time?
> 
> Can't we take the upper limit of both base on experiment? And it
> shouldn't be blocking the whole system.

On the host I was experimenting with, the time it took was around 110ms 
in HS case. That would translate to a retry count of about ~181,000 in 
the above polling loop. Wouldn't that be a very large value for polling?
And not sure if there are other hosts that take even longer time
> 
>>
>> Some data layers like TCP/IP hold a TX lock while sending data (that
>> causes a remote wakeup event) and hence this wakeup() may run in atomic
>> context.
> 
> Why hold the lock while waiting for the host to wakeup? The host is
> still inactive. Also, the usb_gadget_wakeup() API doesn't specify that
> it may run in atomic context.
> 

The lock might be held by upper layers who are unaware/independent of 
underlying transport medium. The above TX lock I was referring to was
that held by Linux networking stack. It just pushes out data the same 
way it would when USB is active. It is the function/composite layer 
being aware of the function suspend would now sense this as a remote 
wake event and perform this additional step of bringing out the link 
from u3 and then sending device wakeup notification.

In our current upstream implementation of dwc3_gadget_wakeup() API we 
hold a spinlock as well. But yeah that can be rectified

static int dwc3_gadget_wakeup(struct usb_gadget *g)
{
	struct dwc3		*dwc = gadget_to_dwc(g);
	unsigned long		flags;
	int			ret;

	spin_lock_irqsave(&dwc->lock, flags);
	ret = __dwc3_gadget_wakeup(dwc);
	spin_unlock_irqrestore(&dwc->lock, flags);

	return ret;
}


>>
>> To make this generic across hosts, I had switched to interrupt based
>> approach, enabling link state events and return error value immediately
>> from the dwc3_gadget_wakeup() API after doing a LFPS handshake. But
>> yeah, then we have to rely on the resume callback as an indication that
>> link is transitioned to ON state.
>>
> 
> BR,
> Thinh
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ