lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 24 Mar 2021 11:32:54 +0100
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Wesley Cheng <wcheng@...eaurora.org>,
        Andy Shevchenko <andy.shevchenko@...il.com>
Cc:     Felipe Balbi <balbi@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        USB <linux-usb@...r.kernel.org>
Subject: Re: [PATCH v3] usb: dwc3: gadget: Prevent EP queuing while stopping
 transfers

Hi Wesley,

On 23.03.2021 22:53, Wesley Cheng wrote:
> On 3/23/2021 10:27 AM, Andy Shevchenko wrote:
>> On Tue, Mar 23, 2021 at 1:19 AM Wesley Cheng <wcheng@...eaurora.org> wrote:
>>> On 3/22/2021 2:14 PM, Andy Shevchenko wrote:
>>>> On Mon, Mar 22, 2021 at 10:06 PM Wesley Cheng <wcheng@...eaurora.org> wrote:
>>>>> On 3/22/2021 12:34 PM, Andy Shevchenko wrote:
>>>>>> On Mon, Mar 22, 2021 at 8:49 PM Wesley Cheng <wcheng@...eaurora.org> wrote:
>>>>>>> On 3/22/2021 5:48 AM, Andy Shevchenko wrote:
>>>>>>>> On Fri, Mar 12, 2021 at 2:01 AM Wesley Cheng <wcheng@...eaurora.org> wrote:
>>>>>>>>> In the situations where the DWC3 gadget stops active transfers, once
>>>>>>>>> calling the dwc3_gadget_giveback(), there is a chance where a function
>>>>>>>>> driver can queue a new USB request in between the time where the dwc3
>>>>>>>>> lock has been released and re-aquired.  This occurs after we've already
>>>>>>>>> issued an ENDXFER command.  When the stop active transfers continues
>>>>>>>>> to remove USB requests from all dep lists, the newly added request will
>>>>>>>>> also be removed, while controller still has an active TRB for it.
>>>>>>>>> This can lead to the controller accessing an unmapped memory address.
>>>>>>>>>
>>>>>>>>> Fix this by ensuring parameters to prevent EP queuing are set before
>>>>>>>>> calling the stop active transfers API.
>>>>>>>>
>>>>>>>> commit f09ddcfcb8c569675066337adac2ac205113471f
>>>>>>>> Author: Wesley Cheng <wcheng@...eaurora.org>
>>>>>>>> Date:   Thu Mar 11 15:59:02 2021 -0800
>>>>>>>>
>>>>>>>>     usb: dwc3: gadget: Prevent EP queuing while stopping transfers
>>>>>>>>
>>>>>>>> effectively broke my gadget setup.
>>>>>>>>
>>>>>>>> The output of the kernel (followed by non responsive state of USB controller):
>>>>>>>>
>>>>>>>> [  195.228586] using random self ethernet address
>>>>>>>> [  195.233104] using random host ethernet address
>>>>>>>> [  195.245306] usb0: HOST MAC aa:bb:cc:dd:ee:f2
>>>>>>>> [  195.249732] usb0: MAC aa:bb:cc:dd:ee:f1
>>>>>>>> # [  195.773594] IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
>>>>>>>> [  195.780585] ------------[ cut here ]------------
>>>>>>>> [  195.785217] dwc3 dwc3.0.auto: No resource for ep2in
>>>>>>>> [  195.790162] WARNING: CPU: 0 PID: 217 at
>>>>>>>> drivers/usb/dwc3/gadget.c:360 dwc3_send_gadget_ep_cmd+0x4b9/0x670
>>>>>>>> [  195.799760] Modules linked in: usb_f_eem u_ether libcomposite
>>>>>>>> brcmfmac brcmutil mmc_block pwm_lpss_pci pwm_lps
>>>>>>>> s snd_sof_pci_intel_tng snd_sof_pci snd_sof_acpi_intel_byt
>>>>>>>> snd_sof_intel_ipc snd_sof_acpi snd_sof snd_sof_nocodec
>>>>>>>> spi_pxa2xx_platform snd_sof_xtensa_dsp spi_pxa2xx_pci
>>>>>>>> extcon_intel_mrfld intel_mrfld_adc sdhci_pci cqhci sdhci m
>>>>>>>> mc_core intel_mrfld_pwrbtn intel_soc_pmic_mrfld hci_uart btbcm btintel
>>>>>>>> [  195.835604] CPU: 0 PID: 217 Comm: irq/16-dwc3 Not tainted 5.12.0-rc4+ #60
>>>>>>>> [  195.842403] Hardware name: Intel Corporation Merrifield/BODEGA BAY,
>>>>>>>> BIOS 542 2015.01.21:18.19.48
>>>>>>>> [  195.851191] RIP: 0010:dwc3_send_gadget_ep_cmd+0x4b9/0x670
>>>>>>>> [  195.856608] Code: cd 00 00 00 44 89 44 24 20 48 89 4c 24 18 e8 ee
>>>>>>>> f7 e4 ff 48 8b 4c 24 18 4c 89 f2 48 c7 c7 b9
>>>>>>>> ed 4f a0 48 89 c6 e8 ef 24 43 00 <0f> 0b 41 be ea ff ff ff 44 8b 44 24
>>>>>>>> 20 e9 80 fc ff ff 41 83 fe 92
>>>>>>>> [  195.875381] RSP: 0000:ffffa53c00373ba8 EFLAGS: 00010086
>>>>>>>> [  195.880617] RAX: 0000000000000000 RBX: 0000000000001387 RCX: 00000000ffffdfff
>>>>>>>> [  195.887755] RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000000
>>>>>>>> [  195.894893] RBP: ffff9ce8c8f2b028 R08: ffffffffa0732288 R09: 0000000000009ffb
>>>>>>>> [  195.902034] R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000041006
>>>>>>>> [  195.909170] R13: ffffa53c00373c24 R14: ffff9ce8c11dadb0 R15: ffff9ce8c2861700
>>>>>>>> [  195.916310] FS:  0000000000000000(0000) GS:ffff9ce8fe200000(0000)
>>>>>>>> knlGS:0000000000000000
>>>>>>>> [  195.924409] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>> [  195.930161] CR2: 00000000f7f694a0 CR3: 0000000038e0c000 CR4: 00000000001006f0
>>>>>>>> [  195.937300] Call Trace:
>>>>>>>> [  195.939755]  __dwc3_gadget_ep_enable+0x2d4/0x4e0
>>>>>>>> [  195.944393]  ? dwc3_remove_requests.constprop.0+0x86/0x170
>>>>>>> Odd that this change would affect the USB enablment path, as they were
>>>>>>> focused on the pullup disable path.  Would you happen to have any
>>>>>>> downstream changes on top of v5.12-rc4 we could review to see if they
>>>>>>> are still required? (ie where is the dwc3_remove_requests() coming from
>>>>>>> during ep enable)
>>>>>> You may check my branch [1] on GH. Basically you may be interested in
>>>>>> the commit:
>>>>>> 0f86df1294ee7523060cc16eafaf4898c693eab0 REVERTME: usb: dwc3: gadget:
>>>>>> skip endpoints ep[18]{in,out}
>>>>>> Otherwise it's a clean v5.12-rc4 with a revert and another USB PHY
>>>>>> suspend fix (which also shouldn't affect this).
>>>>> Can you link your GH reference?
>>>> Oops, sorry.
>>>> Here we are:
>>>>
>>>> [1]: https://github.com/andy-shev/linux/tree/eds-acpi
>>>>
>>> Thanks, I took a look and even tried it on my device running 5.12-rc4,
>>> but wasn't able to see the same problem.  Could you help collect the
>>> ftrace after enabling the tracing KCONFIG and running the below sequence?
>>>
>>> 1.  Mount debugfs
>>> 2.  Set up tracing instance
>>>
>>> mkdir /sys/kernel/debug/tracing/instances/usb
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_complete_trb/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_ctrl_req/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_ep_dequeue/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_ep_queue/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_gadget_ep_cmd/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_gadget_ep_disable/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_gadget_ep_enable/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_gadget_giveback/enable
>>> echo 1 >
>>> /sys/kernel/debug/tracing/instances/usb/events/dwc3/dwc3_prepare_trb/enable
>>> echo 1 > /sys/kernel/debug/tracing/instances/usb/tracing_on
>>>
>>> 3.  Run your test, which should include:
>>>          - echo "" > /sys/kernel/config/usb_gadget/g1/UDC
>>>          - echo <UDC name> > /sys/kernel/config/usb_gadget/g1/UDC
>>>
>>> 4.  Collect the trace output:
>>> cat /sys/kernel/debug/tracing/instances/usb/trace
>> Here we are (I cherry-picked again reverted patch, other stays the same) [2].
>> On top I put a warning, so you may see timestamps.
>>
>> Dunno how long it will stay there, please confirm that you got it.
>>
>> [2]: https://paste.ubuntu.com/p/jNF565ypPp/
>>
> Hi Andy,
>
> Would you be able to give the below change a try?
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 80606b8..cd58bd5 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -791,10 +791,6 @@ static int __dwc3_gadget_ep_disable(struct dwc3_ep
> *dep)
>       reg &= ~DWC3_DALEPENA_EP(dep->number);
>       dwc3_writel(dwc->regs, DWC3_DALEPENA, reg);
>
> -    dep->stream_capable = false;
> -    dep->type = 0;
> -    dep->flags = 0;
> -
>       /* Clear out the ep descriptors for non-ep0 */
>       if (dep->number > 1) {
>           dep->endpoint.comp_desc = NULL;
> @@ -803,6 +799,10 @@ static int __dwc3_gadget_ep_disable(struct dwc3_ep
> *dep)
>
>       dwc3_remove_requests(dwc, dep);
>
> +    dep->stream_capable = false;
> +    dep->type = 0;
> +    dep->flags = 0;
> +
>       return 0;
>   }

Feel free to add my:

Tested-by: Marek Szyprowski <m.szyprowski@...sung.com>

The above change fixed the issue I was about to complain, introduced (or 
revealed?) by the $subject commit.

Before applying the above fix I've observed following warning during 
system power off:

------------[ cut here ]------------
dwc3 15400000.usb: No resource for ep2in
WARNING: CPU: 0 PID: 162 at drivers/usb/dwc3/gadget.c:361 
dwc3_send_gadget_ep_cmd+0x95c/0xae8
Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac 
brcmutil cfg80211 crct10dif_ce s3fwrn5_i2c s3fwrn5 nci nfc s5p_mfc 
hci_uart s5p_jpeg btqca btbcm exynos_gsc v4l2_mem2mem 
videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common 
bluetooth videodev mc panfrost gpu_sched ecdh_generic ecc rfkill 
ip_tables x_tables ipv6
CPU: 0 PID: 162 Comm: irq/151-dwc3 Not tainted 5.12.0-rc3+ #2713
Hardware name: Samsung TM2E board (DT)
pstate: 40000085 (nZcv daIf -PAN -UAO -TCO BTYPE=--)
pc : dwc3_send_gadget_ep_cmd+0x95c/0xae8
...
Call trace:
  dwc3_send_gadget_ep_cmd+0x95c/0xae8
  __dwc3_gadget_ep_enable+0x40c/0x578
  dwc3_gadget_ep_enable+0x5c/0xf0
  usb_ep_enable+0x40/0x1d0
  ecm_set_alt+0xa8/0x1b8
  composite_setup+0x8b8/0x1840
  dwc3_ep0_delegate_req+0x38/0x58
  dwc3_ep0_interrupt+0xd2c/0x1048
  dwc3_thread_interrupt+0xe7c/0x10f8
  irq_thread_fn+0x28/0x98
  irq_thread+0x17c/0x298
  kthread+0x134/0x160
  ret_from_fork+0x10/0x18
irq event stamp: 1220
hardirqs last  enabled at (1219): [<ffff800010f96eac>] 
_raw_spin_unlock_irq+0x3c/0x80
hardirqs last disabled at (1220): [<ffff800010f97694>] 
_raw_spin_lock_irqsave+0xb4/0x148
softirqs last  enabled at (924): [<ffff8000100104e8>] _stext+0x4e8/0x614
softirqs last disabled at (901): [<ffff800010095304>] irq_exit+0x19c/0x1a8
---[ end trace 84804a1a38a2c9f7 ]---

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ