lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877dsjei8j.fsf@kernel.org>
Date:   Thu, 24 Sep 2020 10:39:24 +0300
From:   Felipe Balbi <balbi@...nel.org>
To:     Wesley Cheng <wcheng@...eaurora.org>, gregkh@...uxfoundation.org,
        Thinh.Nguyen@...opsys.com, Alan Stern <stern@...land.harvard.edu>
Cc:     linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
        jackp@...eaurora.org
Subject: Re: [PATCH v3] usb: dwc3: Stop active transfers before halting the
 controller


Hi,

Wesley Cheng <wcheng@...eaurora.org> writes:
> On 9/6/2020 11:20 PM, Felipe Balbi wrote:
>> 
>> Hi,
>> 
>> Wesley Cheng <wcheng@...eaurora.org> writes:
>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
>>> index 59f2e8c31bd1..456aa87e8778 100644
>>> --- a/drivers/usb/dwc3/ep0.c
>>> +++ b/drivers/usb/dwc3/ep0.c
>>> @@ -197,7 +197,7 @@ int dwc3_gadget_ep0_queue(struct usb_ep *ep, struct usb_request *request,
>>>  	int				ret;
>>>  
>>>  	spin_lock_irqsave(&dwc->lock, flags);
>>> -	if (!dep->endpoint.desc) {
>>> +	if (!dep->endpoint.desc || !dwc->pullups_connected) {
>> 
>> this looks odd. If we don't have pullups connected, we shouldn't have a
>> descriptor, likewise if we don't have a a description, we haven't been
>> enumerated, therefore we shouldn't have pullups connected.
>> 
>> What am I missing here?
>> 
>
> Hi Felipe,
>
> When we
> echo "" > /sys/kernel/config/usb_gadget/g1/UDC
>
> This triggers the usb_gadget_disconnect() routine to execute.
>
> int usb_gadget_disconnect(struct usb_gadget *gadget)
> {
> ...
> 	ret = gadget->ops->pullup(gadget, 0);
> 	if (!ret) {
> 		gadget->connected = 0;
> 		gadget->udc->driver->disconnect(gadget);
> 	}
>
> So it is possible that we've already disabled the pullup before running
> the disable() callbacks in the function drivers.  The disable()

we used to have usage counts for those, are they gone? I think they're
still there.

> callbacks usually are the ones responsible for calling usb_ep_disable(),
> where we clear the desc field.  This means there is a brief period where
> the pullups_connected = 0, but we still have valid ep desc, as it has
> not been disabled yet.

this is a valid point, though

> Also, for function drivers like mass storage, the fsg_disable() routine
> defers the actual usb_ep_disable() call to the fsg_thread, so its not
> always ensured that the disconnect() execution would result in the
> usb_ep_disable() to occur synchronously.

also a good point.

>>> @@ -1926,6 +1926,21 @@ static int dwc3_gadget_set_selfpowered(struct usb_gadget *g,
>>>  	return 0;
>>>  }
>>>  
>>> +static void dwc3_stop_active_transfers(struct dwc3 *dwc)
>>> +{
>>> +	u32 epnum;
>>> +
>>> +	for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>> 
>> dwc3 knows the number of endpoints available in the HW. Use dwc->num_eps
>> instead.
>> 
>
> Sure, will do.
>
>>> @@ -1971,6 +1986,8 @@ static int dwc3_gadget_run_stop(struct dwc3 *dwc, int is_on, int suspend)
>>>  	return 0;
>>>  }
>>>  
>>> +static void __dwc3_gadget_stop(struct dwc3 *dwc);
>>> +
>>>  static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>>  {
>>>  	struct dwc3		*dwc = gadget_to_dwc(g);
>>> @@ -1994,9 +2011,37 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>>  		}
>>>  	}
>>>  
>>> +	/*
>>> +	 * Synchronize and disable any further event handling while controller
>>> +	 * is being enabled/disabled.
>>> +	 */
>>> +	disable_irq(dwc->irq_gadget);
>> 
>> why isn't dwc3_gadget_disable_irq() enough?
>> 
>>>  	spin_lock_irqsave(&dwc->lock, flags);
>> 
>> spin_lock_irqsave() will disable interrupts, why disable_irq() above?
>> 
>
> In the discussion I had with Thinh, the concern was that with the newly
> added code to override the lpos here, if the interrupt routine
> (dwc3_check_event_buf()) runs, then it will reference the lpos for

that's running in hardirq context. All interrupts are disabled while
that runs, there's no risk of race, right?

> copying the event buffer contents to the event cache, and potentially
> process events.  There is no locking in place, so it could be possible
> to have both run in parallel.

Is this academic or have you actually found a situation where this
could, indeed, happen? The spin_lock_irqsave() should be enough to
synchronize dwc3_gadget_pullup() and the interrupt handler.

> Hence, the reason if there was already a pending IRQ triggered, the
> dwc3_gadget_disable_irq() won't ensure the IRQ is handled.  We can do
> something like:
> if (!is_on)
> 	dwc3_gadget_disable_irq()
> synchronize_irq()
> spin_lock_irqsave()
> if(!is_on) {
> ...
>
> But the logic to only apply this on the pullup removal case is a little
> messy.  Also, from my understanding, the spin_lock_irqsave() will only
> disable the local CPU IRQs, but not the interrupt line on the GIC, which
> means other CPUs can handle it, unless we explicitly set the IRQ
> affinity to CPUX.

Yeah, the way I understand this can't really happen. But I'm open to
being educated. Maybe Alan can explain if this is really possibility?

>>> +		dwc3_stop_active_transfers(dwc);
>>> +		__dwc3_gadget_stop(dwc);
>>> +
>>> +		count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>>> +		count &= DWC3_GEVNTCOUNT_MASK;
>>> +		if (count > 0) {
>>> +			dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
>>> +			dwc->ev_buf->lpos = (dwc->ev_buf->lpos + count) %
>>> +						dwc->ev_buf->length;
>>> +		}
>> 
>> don't duplicate code. Add a patch before this extracting this into
>> helper and use it for both irq handler and gadget pullup.
>> 
>
> We actually removed this call in the IRQ handler, as if we ensure that
> the IRQ routine has fully complete and won't trigger anymore, then this
> sequence will handle clearing of the event count.

oh, makes sense :-)

-- 
balbi

Download attachment "signature.asc" of type "application/pgp-signature" (858 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ