lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220915020602.tsdrw6cbmpqsypuf@synopsys.com>
Date:   Thu, 15 Sep 2022 02:06:32 +0000
From:   Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To:     Elson Serrao <quic_eserrao@...cinc.com>
CC:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        "balbi@...nel.org" <balbi@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>,
        "quic_mrana@...cinc.com" <quic_mrana@...cinc.com>
Subject: Re: [PATCH 2/5] usb: gadget: Add function wakeup support

On Tue, Sep 13, 2022, Elson Serrao wrote:
> 
> 
> On 8/25/2022 6:30 PM, Thinh Nguyen wrote:
> > On Tue, Aug 23, 2022, Elson Serrao wrote:
> > > On 8/22/2022 6:01 PM, Thinh Nguyen wrote:
> > > > On Thu, Aug 18, 2022 at 11:17:24AM -0700, Elson Serrao wrote:
> > > > > 
> > > > > 
> > > > > On 8/16/2022 4:51 PM, Thinh Nguyen wrote:
> > > > > > On 8/16/2022, Elson Serrao wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 8/12/2022 5:46 PM, Thinh Nguyen wrote:
> > > > > > > > On 8/11/2022, Thinh Nguyen wrote:
> > > > > > > > > On 8/11/2022, Thinh Nguyen wrote:
> > > > > > > > > > On 8/11/2022, Elson Serrao wrote:
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > On 8/9/2022 6:08 PM, Thinh Nguyen wrote:
> > > > > > > > > > 
> > > > > > > > > > <snip>
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > > To summarize the points:
> > > > > > > > > > > > 
> > > > > > > > > > > > 1) The host only arms function remote wakeup if the device is
> > > > > > > > > > > > capable of
> > > > > > > > > > > > remote wakeup (check USB_CONFIG_ATT_WAKEUP in bmAttributes and
> > > > > > > > > > > > hardware
> > > > > > > > > > > > capability)
> > > > > > > > > > > > 
> > > > > > > > > > > > 2) If the device is in suspend, the device can do remote wakeup
> > > > > > > > > > > > (through
> > > > > > > > > > > > LFPS handshake) if the function is armed for remote wakeup (through
> > > > > > > > > > > > SET_FEATURE(FUNC_SUSPEND)).
> > > > > > > > > > > > 
> > > > > > > > > > > > 3) If the link transitions to U0 after the device triggering a remote
> > > > > > > > > > > > wakeup, the device will also send device notification function
> > > > > > > > > > > > wake for
> > > > > > > > > > > > all the interfaces armed with remote wakeup.
> > > > > > > > > > > > 
> > > > > > > > > > > > 4) If the device is not in suspend, the device can send device
> > > > > > > > > > > > notification function wake if it's in U0.
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Now, remote wakeup and function wake device notification are 2
> > > > > > > > > > > > separate
> > > > > > > > > > > > operations. We have the usb_gadget_ops->wakeup() for remote wakeup. I
> > > > > > > > > > > > suggested to maybe add
> > > > > > > > > > > > usb_gadget_ops->send_wakeup_notification(gadget,
> > > > > > > > > > > > intf_id) for the device notification. What you did was combining both
> > > > > > > > > > > > operations in usb_gadget_ops->func_wakeup(). That may only work for
> > > > > > > > > > > > point 4) (assuming you fix the U0 check), but not point 3).
> > > > > > > > > > > 
> > > > > > > > > > > Thank you for your feedback and summary. I will rename func_wakeup to
> > > > > > > > > > > send_wakeup_notification to better align with the approach. The
> > > > > > > > > > > reason I
> > > > > > > > > > > have combined remote_wakeup and function wake notification in
> > > > > > > > > > > usb_gadget_ops->func_wakeup() is because since the implementation
> > > > > > > > > > > is at
> > > > > > > > > > > function/composite level it has no knowledge on the link state. So I
> > > > > > > > > > > have delegated that task to controller driver to handle the
> > > > > > > > > > > notification
> > > > > > > > > > > accordingly. That is do a LFPS handshake first if the device is
> > > > > > > > > > > suspended and then send notification (explained below). But we can
> > > > > > > > > > > definitely separate this by adding an additional flag in the composite
> > > > > > > > > > > layer to set the link state based on the gadget suspend callback
> > > > > > > > > > > called
> > > > > > > > > > > when U3 SUSPEND interrupt is received. Let me know if you feel
> > > > > > > > > > > separating the two is a better approach.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > The reason I think we need to separate it is because of point 3. As I
> > > > > > > > > > note earlier, the spec states that "If remote wake event occurs in
> > > > > > > > > > multiple functions, each function shall send a Function Wake
> > > > > > > > > > Notification."
> > > > > > > > > > 
> > > > > > > > > > But if there's no remote wake event, and the host brought the device up
> > > > > > > > > > instead, then the function suspend state is retained.
> > > > > > > > > > 
> > > > > > > > > > If we separate these 2 operations, the caller can check whether the
> > > > > > > > > > operation went through properly. For example, if the wakeup() is
> > > > > > > > > > initiated properly, but the function wake device notification didn't go
> > > > > > > > > > through. We would only need to resend the device notification rather
> > > > > > > > > > than initiate remote wakeup again.
> > > > > > > > > 
> > > > > > > > > If we don't have to send device notification for other interfaces, we
> > > > > > > > > can combine the operations here as you did.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > I still think it's better to split up the operations. The way you're
> > > > > > > > handling it now is not clear.
> > > > > > > > 
> > > > > > > > If the func_awake() returns -EAGAIN, I'd expect that the remote wake did
> > > > > > > > not go through and expect user to retry again. But here it does initiate
> > > > > > > > remote wake, but it just does not send device notification yet. This is
> > > > > > > > confusing.
> > > > > > > > 
> > > > > > > > Also, instead of all the function wake handling coming from the function
> > > > > > > > driver, now we depend on the controller driver to call function resume()
> > > > > > > > on state change to U0, which will trigger device notification. What
> > > > > > > > happen if it doesn't call resume(). There's too many dependencies and it
> > > > > > > > seems fragile.
> > > > > > > > 
> > > > > > > > I think all this can be handled in the function driver. You can fix the
> > > > > > > > dwc3 wakeup() and poll for U0/ON state rather than RECOVERY state, which
> > > > > > > > is what it's supposed to poll.
> > > > > > > 
> > > > > > > For transitioning from U3 to U0, the current upstream implementation is
> > > > > > > to poll for U0 state when dwc3_gadget_wakeup() is called and it is a
> > > > > > > blocking call. (this is a common API for both HS and SS)
> > > > > > > 
> > > > > > >        /* poll until Link State changes to ON */
> > > > > > >        retries = 20000;
> > > > > > > 
> > > > > > >        while (retries--) {
> > > > > > >            reg = dwc3_readl(dwc->regs, DWC3_DSTS);
> > > > > > > 
> > > > > > >            /* in HS, means ON */
> > > > > > >            if (DWC3_DSTS_USBLNKST(reg) == DWC3_LINK_STATE_U0)
> > > > > > >                break;
> > > > > > >        }
> > > > > > > 
> > > > > > > In my experiments I found that certain hosts take longer time to drive
> > > > > > > HS resume signalling between the remote wakeup and the resume K and this
> > > > > > > time varies across hosts. Hence the above polling timer is not generic
> > > > > > > across hosts. So how do we converge on a polling timer value to work
> > > > > > > across HS/SS and without blocking the system for a long time?
> > > > > > 
> > > > > > Can't we take the upper limit of both base on experiment? And it
> > > > > > shouldn't be blocking the whole system.
> > > > > 
> > > > > On the host I was experimenting with, the time it took was around 110ms in
> > > > > HS case. That would translate to a retry count of about ~181,000 in the
> > > > > above polling loop. Wouldn't that be a very large value for polling?
> > > > > And not sure if there are other hosts that take even longer time
> > > > 
> > > > We don't want to poll that many times. We shouldn't depend on the delay
> > > > of a register read for polling interval. Can't we just sleep in between
> > > > interval at a reasonable interval.
> > > > 
> > > 
> > > Sleeping is not an option as the upper layers (those beyond
> > > function/composite layer) may transmit data with a lock held.
> > > 
> > 
> > You can use mdelay() if it can't sleep. But if the wait is long, it
> > should be allowed to sleep.
> > 
> > > I ran into below BUG when remote wakeup was triggered via a ping() command
> > > and attempted sleep while polling
> > > 
> > > [   88.676789][  T392] BUG: scheduling while atomic
> > > [   88.900112][  T392] Call trace:
> > > <snip>
> > > [   88.912760][  T392]  __schedule_bug+0x90/0x188
> > > [   88.917335][  T392]  __schedule+0x714/0xb4c
> > > [   88.930568][  T392]  schedule+0x110/0x204
> > > [   88.943620][  T392]  schedule_timeout+0x94/0x134
> > > [   88.948371][  T392]  __dwc3_gadget_wakeup+0x1ac/0x558
> > > [   88.953558][  T392]  dwc3_gadget_wakeup+0x3c/0x8c
> > > [   88.958388][  T392]  usb_gadget_wakeup+0x54/0x1a8
> > > [   88.963222][  T392]  eth_start_xmit+0x130/0x830
> > > [   88.967876][  T392]  xmit_one+0xf0/0x364
> > > [   88.971913][  T392]  sch_direct_xmit+0x188/0x3e4
> > > [   88.976663][  T392]  __dev_xmit_skb+0x480/0x984
> > > [   88.981319][  T392]  __dev_queue_xmit+0x2d4/0x7d8
> > > [   88.986151][  T392]  neigh_resolve_output+0x208/0x2f0
> > > <snip>
> > > 
> > > The above experiment was done by removing spin_locks if any in the wakeup()
> > > path of  function/composite/controller drivers. It is running in atomic
> > > context due to the lock held by linux networking stack/generic packet
> > > scheduler.
> > > 
> > > So below are the only two approaches I can think of for dwc3_gadget_wakeup()
> > > API
> > > 
> > > 1.)Polling based approach: We poll until the link comes up. But cannot sleep
> > > while polling due to above reasons.
> > > 
> > > 2.)Interrupt based approach (the patches being discussed currently): When a
> > > remote wakeup is triggered enable link state interrupts and return right
> > > away. The function/composite drivers are later notified about the ON event
> > > via resume callback (in a similar way how we notify U3 to composite layer
> > > via suspend callback).
> > > 
> > > Please let me know if there is any alternate way that you can think of here.
> > > 
> > 
> > The main issue we're trying to solve is knowing if the host had woken up
> > and the device notification is sent so that the function driver can
> > resume().
> > 
> > If we can say that upon usb_gadget_wakeup() successful completion, the
> > link is in U0/ON, then the function driver can send the wake
> > notification after and resume(). That is, we're trying to make
> > usb_gadget_wakeup() synchronous. Whether it's polling or interrupt
> > based, it's implementation detail.
> > 
> > Unfortunately, the API isn't very clear whether usb_gadget_wakeup() may
> > sleep or synchronous.
> > 
> > Here are 3 approaches that we can have:
> > 
> > 1) Clarify that usb_gadget_wakeup() is synchronous and the link will be
> > in U0/ON upon successful completion, and clarify whether it can sleep.
> > Introduce a separate API usb_gadget_send_wake_notification() to send
> > wake notification separately.
> > 
> > 2) Create a new API usb_gadget_function_wakeup() that will combine both
> > device wakeup and wake notification. The function can sleep,
> > synchronous, and both wakeup and wake notification are done after the
> > function completes.
> > 
> > 3) Create a new API usb_gadget_function_wakeup() that will combine both
> > device wakeup and wake notification. The function is asynchronous. We
> > won't know if the wakeup is successfully sent, but we don't care and
> > proceed with the function proceed with resume() anyway.
> > 
> > BR,
> > Thinh
> 
> Thank you for your suggestions.
> For handling function wakeup (applicable to enhanced super-speed) will
> implement a separate API usb_gadget_function_wakeup() that combines
> device-wakeup and wake-notification like below in dwc3 driver and operates
> synchronously.
> 
> dwc3_gadget_func_wakeup()
> {
> 	if (link in U3)
> 		Call dwc3_gadget_wakeup()
> 		Poll for U0
> 		
> 	
> 	If U0 successful, send wake_notification
> 
> }
> 
> Once the function completes both device wake and func wakeup notification
> are done.
> 
> For high speed use-cases when usb_gadget_wakeup() is called from function
> drivers, considering the possibility of significant delay associated with
> remote wakeup, dwc3_gadget_wakeup() should operate asynchronously.
> i.e rely on link status change events rather than sleeping/polling.
> 
> Please let me know if there are any concerns. If not will upload new patch
> sets with this change and other changes suggested.
> 

That sounds good to me.

Thanks,
Thinh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ