lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9d309a6f-39b2-43da-96a6-b7c59b98431e@samsung.com>
Date: Thu, 4 Dec 2025 18:45:49 +0530
From: Selvarasu Ganesan <selvarasu.g@...sung.com>
To: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
Cc: Alan Stern <stern@...land.harvard.edu>, "gregkh@...uxfoundation.org"
	<gregkh@...uxfoundation.org>, "linux-usb@...r.kernel.org"
	<linux-usb@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "jh0801.jung@...sung.com"
	<jh0801.jung@...sung.com>, "dh10.jung@...sung.com" <dh10.jung@...sung.com>,
	"naushad@...sung.com" <naushad@...sung.com>, "akash.m5@...sung.com"
	<akash.m5@...sung.com>, "h10.kim@...sung.com" <h10.kim@...sung.com>,
	"eomji.oh@...sung.com" <eomji.oh@...sung.com>, "alim.akhtar@...sung.com"
	<alim.akhtar@...sung.com>, "thiagu.r@...sung.com" <thiagu.r@...sung.com>,
	"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v2] usb: dwc3: gadget: Prevent EPs resource conflict
 during StartTransfer


On 12/4/2025 7:21 AM, Thinh Nguyen wrote:
> On Wed, Dec 03, 2025, Selvarasu Ganesan wrote:
>> On 11/21/2025 8:38 AM, Alan Stern wrote:
>>> On Fri, Nov 21, 2025 at 02:22:02AM +0000, Thinh Nguyen wrote:
>>>> On Wed, Nov 19, 2025, Alan Stern wrote:
>>>>> ->set_alt() is called by the composite core when a Set-Interface or
>>>>> Set-Config control request arrives from the host.  It happens within the
>>>>> composite_setup() handler, which is called by the UDC driver when a
>>>>> control request arrives, which means it happens in the context of the
>>>>> UDC driver's interrupt handler.  Therefore ->set_alt() callbacks must
>>>>> not sleep.
>>>> This should be changed. I don't think we can expect set_alt() to
>>>> be in interrupt context only.
>>> Agreed.
>>>
>>>>> To do this right, I can't think of any approach other than to make the
>>>>> composite core use a work queue or other kernel thread for handling
>>>>> Set-Interface and Set-Config calls.
>>>> Sounds like it should've been like this initially.
>>> I guess the nobody thought through the issues very carefully at the time
>>> the composite framework was designed.  Maybe the UDCs that existed back
>>> did not require a lot of time to flush endpoints; I can't remember.
>>>
>>>>> Without that ability, we will have to audit every function driver to
>>>>> make sure the ->set_alt() callbacks do ensure that endpoints are flushed
>>>>> before they are re-enabled.
>>>>>
>>>>> There does not seem to be any way to fix the problem just by changing
>>>>> the gadget core.
>>>>>
>>>> We can have a workaround in dwc3 that can temporarily "work" with what
>>>> we have. However, eventually, we will need to properly rework this and
>>>> audit the gadget drivers.
>>> Clearly, the first step is to change the composite core.  That can be
>>> done without messing up anything else.  But yes, eventually the gadget
>>> drivers will have to be audited.
>>>
>>> Alan Stern
>>
>> Hi Thinh,
>>
>> Do you have any suggestions that might be helpful for us to try on our side?
>> This EP resource‑conflict problem becomes easily observable when the
>> RNDIS network test executing ifconfig rndis0 down/up is run repeatedly
>> on the device side.
>>
>> Thanks,
>> Selva
> At the moment, I can't think of a way to workaround for all cases. Let's
> just leave bulk streams alone for now. Until we have proper fixes to the
> gadget framework, let's just try the below.
>
> Thanks,
> Thinh
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 3830aa2c10a9..974573304441 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -960,11 +960,18 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
>   	}
>   
>   	/*
> -	 * Issue StartTransfer here with no-op TRB so we can always rely on No
> -	 * Response Update Transfer command.
> +	 * For streams, at start, there maybe a race where the
> +	 * host primes the endpoint before the function driver
> +	 * queues a request to initiate a stream. In that case,
> +	 * the controller will not see the prime to generate the
> +	 * ERDY and start stream. To workaround this, issue a
> +	 * no-op TRB as normal, but end it immediately. As a
> +	 * result, when the function driver queues the request,
> +	 * the next START_TRANSFER command will cause the
> +	 * controller to generate an ERDY to initiate the
> +	 * stream.
>   	 */
> -	if (usb_endpoint_xfer_bulk(desc) ||
> -			usb_endpoint_xfer_int(desc)) {
> +	if (dep->stream_capable) {
>   		struct dwc3_gadget_ep_cmd_params params;
>   		struct dwc3_trb	*trb;
>   		dma_addr_t trb_dma;
> @@ -983,35 +990,21 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
>   		if (ret < 0)
>   			return ret;
>   
> -		if (dep->stream_capable) {
> -			/*
> -			 * For streams, at start, there maybe a race where the
> -			 * host primes the endpoint before the function driver
> -			 * queues a request to initiate a stream. In that case,
> -			 * the controller will not see the prime to generate the
> -			 * ERDY and start stream. To workaround this, issue a
> -			 * no-op TRB as normal, but end it immediately. As a
> -			 * result, when the function driver queues the request,
> -			 * the next START_TRANSFER command will cause the
> -			 * controller to generate an ERDY to initiate the
> -			 * stream.
> -			 */
> -			dwc3_stop_active_transfer(dep, true, true);
> +		dwc3_stop_active_transfer(dep, true, true);
>   
> -			/*
> -			 * All stream eps will reinitiate stream on NoStream
> -			 * rejection.
> -			 *
> -			 * However, if the controller is capable of
> -			 * TXF_FLUSH_BYPASS, then IN direction endpoints will
> -			 * automatically restart the stream without the driver
> -			 * initiation.
> -			 */
> -			if (!dep->direction ||
> -			    !(dwc->hwparams.hwparams9 &
> -			      DWC3_GHWPARAMS9_DEV_TXF_FLUSH_BYPASS))
> -				dep->flags |= DWC3_EP_FORCE_RESTART_STREAM;
> -		}
> +		/*
> +		 * All stream eps will reinitiate stream on NoStream
> +		 * rejection.
> +		 *
> +		 * However, if the controller is capable of
> +		 * TXF_FLUSH_BYPASS, then IN direction endpoints will
> +		 * automatically restart the stream without the driver
> +		 * initiation.
> +		 */
> +		if (!dep->direction ||
> +		    !(dwc->hwparams.hwparams9 &
> +		      DWC3_GHWPARAMS9_DEV_TXF_FLUSH_BYPASS))
> +			dep->flags |= DWC3_EP_FORCE_RESTART_STREAM;
>   	}
>   
>   out:


Hi Thinh,

Thanks for the changes. We understand the given fix and have verified 
that the original issue is resolved, but a similar below warning appears 
again in `dwc3_gadget_ep_queue` when we run a long duration our test. 
And we confirmed this is not due to this new given changes.

This warning is caused by a race between `dwc3_gadget_ep_disable` and 
`dwc3_gadget_ep_queue` that manipulates `dep->flags`.

Please refer the below sequence for the reference.

The warning originates from a race condition between 
dwc3_gadget_ep_disable and dwc3_send_gadget_ep_cmd from 
dwc3_gadget_ep_queue that both manipulate dep->flags. Proper 
synchronization or a check is needed when masking (dep->flags &= mask) 
inside dwc3_gadget_ep_disable.


Thread#1:
usb_ep_queue
  ->dwc3_gadget_ep_queue
   ->__dwc3_gadget_kick_transfer
    -> starting = !(dep->flags & DWC3_EP_TRANSFER_STARTED);
     ->if(starting)
        ->dwc3_send_gadget_ep_cmd (cmd = DWC3_DEPCMD_STARTTRANSFER)
          ->dep->flags |= DWC3_EP_TRANSFER_STARTED;


Thread#2:
dwc3_gadget_ep_disable
  ->__dwc3_gadget_ep_disable
   ->dwc3_remove_requests
    ->dwc3_stop_active_transfer
     ->__dwc3_stop_active_transfer
      -> dwc3_send_gadget_ep_cmd (cmd =DWC3_DEPCMD_ENDTRANSFER)
       ->if(!interrupt)dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
           ...
           While Thread#2 is still running, Thread #1 starts again:

Thread#1:
usb_ep_queue
  ->dwc3_gadget_ep_queue
   ->__dwc3_gadget_kick_transfer
    -> starting = !(dep->flags & DWC3_EP_TRANSFER_STARTED);
     ->if(starting)
       ->dwc3_send_gadget_ep_cmd (cmd = DWC3_DEPCMD_STARTTRANSFER)
        ->dep->flags |= DWC3_EP_TRANSFER_STARTED;
          ...
###### Thread#2: Continuation (race)
           ->__dwc3_gadget_ep_disable
            ->mask = DWC3_EP_TXFIFO_RESIZED | DWC3_EP_RESOURCE_ALLOCATED;
             ->dep->flags &= mask; --> // Possible of clears
                 DWC3_EP_TRANSFER_STARTED flag as well without
                 sending DWC3_DEPCMD_ENDTRANSFER

Thread#1:(Failure due to race)
usb_ep_queue
  ->dwc3_gadget_ep_queue
   ->__dwc3_gadget_kick_transfer
    -> starting = !(dep->flags & DWC3_EP_TRANSFER_STARTED);
      ->if(starting)
        ->dwc3_send_gadget_ep_cmd (cmd = DWC3_DEPCMD_STARTTRANSFER) ->
                 Results in “No resource” allocation error because the
                 previous transfer was never end with ENDTRANSFER.


     ------------[ cut here ]------------
  dwc3 13200000.dwc3: No resource for ep1in
  WARNING: CPU: 7 PID: 1748 at drivers/usb/dwc3/gadget.c:398 
dwc3_send_gadget_ep_cmd+0x2f8/0x76c
  pc : dwc3_send_gadget_ep_cmd+0x2f8/0x76c
  lr : dwc3_send_gadget_ep_cmd+0x2f8/0x76c
  Call trace:
  dwc3_send_gadget_ep_cmd+0x2f8/0x76c
  __dwc3_gadget_kick_transfer+0x2ec/0x3f4
  dwc3_gadget_ep_queue+0x140/0x1f0
  usb_ep_queue+0x60/0xec
  mp_tx_task+0x100/0x134
  mp_tx_timeout+0xd0/0x1e0
  __hrtimer_run_queues+0x130/0x318
  hrtimer_interrupt+0xe8/0x340
  exynos_mct_comp_isr+0x58/0x80
  __handle_irq_event_percpu+0xcc/0x25c
  handle_irq_event+0x40/0x9c
  handle_fasteoi_irq+0x154/0x2c8
  generic_handle_domain_irq+0x58/0x80
  gic_handle_irq+0x48/0x104
  call_on_irq_stack+0x3c/0x50
  do_interrupt_handler+0x4c/0x84
  el1_interrupt+0x34/0x58
  el1h_64_irq_handler+0x18/0x24
  el1h_64_irq+0x68/0x6c
  _raw_spin_unlock_irqrestore+0xc/0x4c
  binder_wakeup_thread_ilocked+0x50/0xb4
  binder_proc_transaction+0x308/0x6ec
  binder_transaction+0x1aec/0x23b0
  binder_ioctl_write_read+0xa28/0x2534
  binder_ioctl+0x1fc/0xb3c
  __arm64_sys_ioctl+0xa8/0xe4
  invoke_syscall+0x58/0x10c
  el0_svc_common+0x80/0xdc
  do_el0_svc+0x1c/0x28
  el0_svc+0x38/0x88
  el0t_64_sync_handler+0x70/0xbc
  el0t_64_sync+0x1a8/0x1ac
  ---[ end trace 0000000000000000 ]---

Thanks,
Selva

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ