lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3e6867cc-489a-b626-ff9c-79615613b2dd@quicinc.com>
Date:   Wed, 20 Jul 2022 11:50:58 -0700
From:   Wesley Cheng <quic_wcheng@...cinc.com>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        "balbi@...nel.org" <balbi@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>
Subject: Re: [PATCH v2 3/5] usb: dwc3: gadget: Adjust IRQ management during
 soft disconnect/connect

Hi Thinh,

On 7/14/2022 10:38 AM, Thinh Nguyen wrote:
> On 7/12/2022, Wesley Cheng wrote:
>> Local interrupts are currently being disabled as part of aquiring the
>> spin lock before issuing the endxfer command.  Leave interrupts enabled, so
>> that EP0 events can continue to be processed.  Also, ensure that there are
>> no pending interrupts before attempting to handle any soft
>> connect/disconnect.
>>
>> Fixes: 861c010a2ee1 ("usb: dwc3: gadget: Refactor pullup()")
>> Signed-off-by: Wesley Cheng <quic_wcheng@...cinc.com>
>> ---
>>    drivers/usb/dwc3/gadget.c | 21 ++++++++++++---------
>>    1 file changed, 12 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index a455f8d4631d..ee85b773e3fe 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -1674,6 +1674,7 @@ static int __dwc3_gadget_get_frame(struct dwc3 *dwc)
>>    static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt)
>>    {
>>    	struct dwc3_gadget_ep_cmd_params params;
>> +	struct dwc3 *dwc = dep->dwc;
>>    	u32 cmd;
>>    	int ret;
>>    
>> @@ -1682,7 +1683,9 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
>>    	cmd |= interrupt ? DWC3_DEPCMD_CMDIOC : 0;
>>    	cmd |= DWC3_DEPCMD_PARAM(dep->resource_index);
>>    	memset(&params, 0, sizeof(params));
>> +	spin_unlock(&dwc->lock);
>>    	ret = dwc3_send_gadget_ep_cmd(dep, cmd, &params);
>> +	spin_lock(&dwc->lock);
>>    	WARN_ON_ONCE(ret);
>>    	dep->resource_index = 0;
>>    
>> @@ -2029,12 +2032,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>>    	struct dwc3_ep			*dep = to_dwc3_ep(ep);
>>    	struct dwc3			*dwc = dep->dwc;
>>    
>> -	unsigned long			flags;
>>    	int				ret = 0;
>>    
>>    	trace_dwc3_ep_dequeue(req);
>>    
>> -	spin_lock_irqsave(&dwc->lock, flags);
>> +	spin_lock(&dwc->lock);
>>    
>>    	list_for_each_entry(r, &dep->cancelled_list, list) {
>>    		if (r == req)
>> @@ -2073,7 +2075,7 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>>    		request, ep->name);
>>    	ret = -EINVAL;
>>    out:
>> -	spin_unlock_irqrestore(&dwc->lock, flags);
>> +	spin_unlock(&dwc->lock);
>>    
>>    	return ret;
>>    }
>> @@ -2489,9 +2491,7 @@ static int __dwc3_gadget_start(struct dwc3 *dwc);
>>    
>>    static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>    {
>> -	unsigned long flags;
>> -
>> -	spin_lock_irqsave(&dwc->lock, flags);
>> +	spin_lock(&dwc->lock);
>>    	dwc->connected = false;
>>    
>>    	/*
>> @@ -2506,10 +2506,10 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>    
>>    		reinit_completion(&dwc->ep0_in_setup);
>>    
>> -		spin_unlock_irqrestore(&dwc->lock, flags);
>> +		spin_unlock(&dwc->lock);
>>    		ret = wait_for_completion_timeout(&dwc->ep0_in_setup,
>>    				msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT));
>> -		spin_lock_irqsave(&dwc->lock, flags);
>> +		spin_lock(&dwc->lock);
>>    		if (ret == 0)
>>    			dev_warn(dwc->dev, "timed out waiting for SETUP phase\n");
>>    	}
>> @@ -2523,7 +2523,7 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>>    	 */
>>    	dwc3_stop_active_transfers(dwc);
>>    	__dwc3_gadget_stop(dwc);
>> -	spin_unlock_irqrestore(&dwc->lock, flags);
>> +	spin_unlock(&dwc->lock);
>>    
>>    	/*
>>    	 * Note: if the GEVNTCOUNT indicates events in the event buffer, the
>> @@ -2569,6 +2569,8 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>    		return 0;
>>    	}
>>    
>> +	synchronize_irq(dwc->irq_gadget);
>> +
>>    	if (!is_on) {
>>    		ret = dwc3_gadget_soft_disconnect(dwc);
>>    	} else {
>> @@ -3729,6 +3731,7 @@ void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>>    	 */
>>    
>>    	__dwc3_stop_active_transfer(dep, force, interrupt);
>> +
>>    }
>>    
>>    static void dwc3_clear_stall_all_ep(struct dwc3 *dwc)
> 
> Hi Greg,
> 
> Please don't pick up this patch yet. We're still in discussion with
> this. I have some concern with unlocking/locking when sending End
> Transfer command. For example, this patch may cause issues with
> DWC3_EP_END_TRANSFER_PENDING checks.
> 
> Hi Wesley,
> 
> Did you try out my suggestion yet?
> 

Just providing a quick update.

So with your suggestion, I was able to consistently reproduce the 
controller halt issue after a day or so of testing.  However, when I 
took a further look, I believe the problem is due to the DWC3 event handler:

static void dwc3_endpoint_interrupt(struct dwc3 *dwc,
		const struct dwc3_event_depevt *event)
{
...
	if (!(dep->flags & DWC3_EP_ENABLED)) {
		if (!(dep->flags & DWC3_EP_TRANSFER_STARTED))
			return;

		/* Handle only EPCMDCMPLT when EP disabled */
		if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT)
			return;
	}

The soft disconnect routine reached to the run/stop polling point, and I 
could see that DWC3_EP_DELAYED_STOP was set, and we got a xfercomplete 
event for the STATUS phase.  However, since we exit early in the event 
handler (due to __dwc3_gadget_stop() being called and disabling EP0), 
the STATUS complete is never handled, and we do not issue the endxfer 
command.

I don't think I saw this issue with my change, as we allowed the STATUS 
phase handling to happen BEFORE gadget stop was called (since I released 
the lock in the stop active transfers API).

However, I think even with my approach, we'd eventually run into a 
possibility of this issue, as we aren't truly handling EP0 events while 
polling for the halted status due to the above.  It was just reducing 
the chances.  The scenario of this issue is coming because the host took 
a long time to complete the STATUS phase, so we ran into a "timed out 
waiting for SETUP phase," which allowed us to call the run/stop routine 
while we were not yet in the SETUP phase.

I'm currently running a change to add a EP num check to this IF condition:

	if ((epnum > 1) && !(dep->flags & DWC3_EP_ENABLED)) {
		if (!(dep->flags & DWC3_EP_TRANSFER_STARTED))
			return;

		/* Handle only EPCMDCMPLT when EP disabled */
		if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT)
			return;
	}

Thanks
Wesley Cheng

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ