lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9a542032-2d2c-3241-6414-f92d5596c222@quicinc.com>
Date:   Wed, 5 Apr 2023 17:10:34 -0700
From:   Wesley Cheng <quic_wcheng@...cinc.com>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>
CC:     "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>,
        "quic_ugoswami@...cinc.com" <quic_ugoswami@...cinc.com>
Subject: Re: [PATCH] usb: dwc3: gadget: Stall and restart EP0 if host is
 unresponsive

Hi Thinh,

On 4/5/2023 11:49 AM, Thinh Nguyen wrote:
> On Tue, Apr 04, 2023, Wesley Cheng wrote:
>> Hi Thinh,
>>
>> On 4/4/2023 3:16 PM, Thinh Nguyen wrote:
>>> On Tue, Apr 04, 2023, Wesley Cheng wrote:
>>>> Hi Thinh,
>>>>
>>>> On 4/3/2023 6:11 PM, Thinh Nguyen wrote:
>>>>> On Fri, Mar 31, 2023, Wesley Cheng wrote:
>>>>>> It was observed that there are hosts that may complete pending SETUP
>>>>>> transactions before the stop active transfers and controller halt occurs,
>>>>>> leading to lingering endxfer commands on DEPs on subsequent pullup/gadget
>>>>>> start iterations.
>>>>>
>>>>> Can you clarify this a bit further? Even though the controller is
>>>>> halted, you still observed activity?
>>>>>
>>>>
>>>> Yes...I didn't understand how that was possible either, but traces clearly
>>>> showed that the controller halt was successful even though there were no
>>>> endxfers issued on some EPs.  Although, I can't say for certain if those EPs
>>>> were actively being used at that time.
>>>>
>>>
>>> The controller should only be halted after the (non-ep0) endpoints are
>>> disabled.
>>>
>>> "even though there were no endxferx issued on some EPs", which EPs are
>>> you referring to? If there's no End Transfer issued while the endpoints
>>> are active and started during disconnect, we need to fix that in the
>>> driver.
>>>
>>
>> Sorry let me clarify.  When I said there were no endxfers issued, I meant
>> that they were pending (DWC3_EP_DELAY_STOP is set for those EPs).  However,
>> since the host wasn't moving the EP0 state forward, we never moved back to
>> the SETUP phase, which is where we flush any pending end transfers.
>>
>> void dwc3_ep0_out_start(struct dwc3 *dwc)
>> {
>> ...
>> 	for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
>> 		struct dwc3_ep *dwc3_ep;
>>
>> 		dwc3_ep = dwc->eps[i];
>> 		if (!dwc3_ep)
>> 			continue;
>>
>> 		if (!(dwc3_ep->flags & DWC3_EP_DELAY_STOP))
>> 			continue;
>>
>> 		dwc3_ep->flags &= ~DWC3_EP_DELAY_STOP;
>> 		if (dwc->connected)
>> 			dwc3_stop_active_transfer(dwc3_ep, true, true);
>> 		else
>> 			dwc3_remove_requests(dwc, dwc3_ep, -ESHUTDOWN);
>> 	}
>> }
>>
>> This is part of the reason for moving the wait_for_completion() call until
>> AFTER the stop active transfers, since that is the point at which we could
>> potentially set the DWC3_EP_DELAY_STOP.  If there is a host not moving the
>> EP0 state, then we can at least utilize the timeout path to force EP0 back
>> to the setup phase.
>>
> 
> Thanks for the clarification. If I understand correctly, the issue here
> is the missing handling of the timeout of the wait for control transfer
> completion. This can happen if we enter flow control and/or if the host
> delays polling for data. If it times out, then we'll have problems as
> you mention above.
> 
> In that case, we can halt the control endpoint causing it to respond
> with a STALL to the next host polling of data, cancelling the control
> transfer. This is what you did with dwc3_ep0_reset_state() on timeout
> right?
> 

Yep, that is correct.  In addition, I also made sure to call gadget stop 
at the end of the soft disconnect API, since we wanted to still service 
EP0 events while polling for the controller halt, right?  If we call 
gadget stop, then we clear the event enable register, so no events are 
going to occur:

static void __dwc3_gadget_stop(struct dwc3 *dwc)
{
	dwc3_gadget_disable_irq(dwc);
	__dwc3_gadget_ep_disable(dwc->eps[0]);
	__dwc3_gadget_ep_disable(dwc->eps[1]);
}

static void dwc3_gadget_disable_irq(struct dwc3 *dwc)
{
	/* mask all interrupts */
	dwc3_writel(dwc->regs, DWC3_DEVTEN, 0x00);
}

I think I'll split this change up as well from the timeout related one.

> I think this should work as the End Transfer commands should complete
> before the controller is halted.
> 

Yes, anyway we poll for the halted status, and in the changes for the 
endxfer timeouts we increased the poll time to allow for them to complete.

Thanks
Wesley Cheng

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ