lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220218044444.GA48532@ubuntu>
Date:   Fri, 18 Feb 2022 13:44:44 +0900
From:   Jung Daehwan <dh10.jung@...sung.com>
To:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>
Cc:     Felipe Balbi <balbi@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        "quic_wcheng@...cinc.com" <quic_wcheng@...cinc.com>,
        "quic_jackp@...cinc.com" <quic_jackp@...cinc.com>
Subject: Re: [PATCH v2 1/2] usb: dwc3: Not set DWC3_EP_END_TRANSFER_PENDING
 in ep cmd fails

On Tue, Feb 15, 2022 at 05:14:50PM +0000, Thinh Nguyen wrote:
> Jung Daehwan wrote:
> > Hi Thinh,
> > 
> > On Mon, Feb 14, 2022 at 06:44:33PM +0000, Thinh Nguyen wrote:
> >> Hi,
> >>
> >> Daehwan Jung wrote:
> >>> It always sets DWC3_EP_END_TRANSFER_PENDING in dwc3_stop_active_transfer
> >>> even if dwc3_send_gadget_ep_cmd fails. It can cause some problems like
> >>
> >> How does it fail? Timed out?
> > 
> > Yes, timed out.
> >>
> >>> skipping clear stall commmand or giveback from dequeue. We fix to set it
> >>> only when ep cmd success. Additionally, We clear DWC3_EP_TRANSFER_STARTED
> >>> for next trb to start transfer not update transfer.
> >>
> >> We shouldn't do this. Things will be out of sync. It may work for 1
> >> scenario, but it won't work for others.
> >>
> >> Please help me understand a few things:
> >>
> >> 1) What is the scenario that triggers this? Is it random?
> >>
> > ep cmd timeout occurs on dequeue request from user side. End Transfer command
> > would be sent in dwc3_stop_active transfer.
> 
> At the high level, what's triggering the request dequeue? Is it from a
> disconnect, change of interface, or simply function driver protocol that
> changes it.
> 
> What application was used to trigger this?
> 
Removing function driver requests dequeue.
> > 
> >> 2) Are there other traffics pending while issuing the End Transfer
> >> command? If so, what transfer type(s)?
> >>
> > I haven't checked it yet. 
> 
> Can you check?
> 
It seems there's no pending traffic.
> > 
> >> 3) Have you tried increasing the timeout?
> >>
> > No, I haven't.
> 
> Can you try up to 10 seconds (just for experiment)
> 
Yes, it's reproduced on 10 secs.
> >> BR,
> >> Thinh
> >>
> > 
> > This issue occurs very rarely on customer. I only have restricted
> > information. That's why I've been trying to reproduce it.
> 
> How did you test your fix if you can't reproduce it?
> 

I sent patches to my client and he said this issue would not occur after
applying them. In fact, They are not totally equal with those we are
discussing. I've refactor-ed and pushed them here.

> > 
> > Wesley may have run into same issue and you can see this issue in detail.
> > https://protect2.fireeye.com/v1/url?k=39bd3e50-66260725-39bcb51f-0cc47a31ce4e-bf000d5a697ead66&q=1&e=b04500ad-43d0-4ee7-b5f0-a7283d8fc0da&u=https%3A%2F%2Furldefense.com%2Fv3%2F__https%3A%2F%2Fprotect2.fireeye.com%2Fv1%2Furl%3Fk%3D9d423b69-fc3fd32e-9d43b026-74fe485fff30-77a099b52659410d%26q%3D1%26e%3D20b4d9f5-2599-4f57-8b6a-7c4ec167d228%26u%3Dhttps%2A3A%2A2F%2A2Flore.kernel.org%2A2Flinux-usb%2A2F20220203080017.27339-1-quic_wcheng%2A40quicinc.com%2A2F__%3BJSUlJSUlJQ%21%21A4F2R9G_pg%21JWPzNLoO3BFX_IZCVzmHPtxq6frr_VFbSNNaxSQylunt1Y4EauTOefth2LCIcVEuTx8E%24 
> > 
> 
> I can take a look, but please provide the tracepoints of the failure if
> you can reproduce it.
> 

Thanks to client's help, I could reproduce it in my envrionment.

1. Trace

MtpServer-2484    [003] d..2   114.376829: dwc3_ep_queue: ep1out: req 0000000055561867 length 0/512 zsI ==> -115
MtpServer-2484    [003] d..2   114.376838: dwc3_prepare_trb: ep1out: trb 0000000074078a95 (E23:D22) buf 0000000954845600 size 512 ctrl 0000001d (HlCS:sc:normal)
MtpServer-2484    [003] d..2   114.376844: dwc3_prepare_trb: ep1out: trb 0000000095f503e7 (E24:D22) buf 00000000e1303000 size 512 ctrl 00000819 (HlcS:sC:normal)
MtpServer-2484    [003] d..2   114.376867: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful
ksoftirqd/2-23      [002] d.s2   190.428752: dwc3_ep_dequeue: ep2out: req 00000000732c0f69 length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns3   190.447625: dwc3_gadget_ep_cmd: ep2out: cmd 'End Transfer' [40d08] params 00000000 00000000 00000000 --> status: Timed Out
ksoftirqd/2-23      [002] dns2   190.450596: dwc3_ep_dequeue: ep2out: req 00000000be3434d4 length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450618: dwc3_ep_dequeue: ep2out: req 0000000033655f2e length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450625: dwc3_ep_dequeue: ep2out: req 00000000dc0ff635 length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450633: dwc3_ep_dequeue: ep2out: req 0000000023ffb8e8 length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450641: dwc3_ep_dequeue: ep2out: req 000000005fab61aa length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450648: dwc3_ep_dequeue: ep2out: req 000000001f618cd8 length 0/16384 zsI ==> -115
ksoftirqd/2-23      [002] dns2   190.450654: dwc3_ep_dequeue: ep2out: req 00000000da0ea1b1 length 0/16384 zsI ==> -115
##### CPU 6 buffer started ####
HwBinder:636_1-658     [006] d..1   190.735324: dwc3_gadget_ep_disable: ep0out: mps 512/512 streams 0 burst 1 ring 0/0 flags e:swbp:>
HwBinder:636_1-658     [006] d..1   190.735327: dwc3_gadget_ep_disable: ep0in: mps 512/512 streams 0 burst 1 ring 0/0 flags e:swbp:<
HwBinder:636_1-658     [003] dn.1   214.206652: dwc3_gadget_ep_disable: ep0out: mps 512/512 streams 0 burst 1 ring 0/0 flags e:swbp:>
HwBinder:636_1-658     [003] dn.1   214.206677: dwc3_gadget_ep_disable: ep0in: mps 512/512 streams 0 burst 1 ring 0/0 flags e:swbp:<

2. Kernel log

[  190.447711]I[2:    ksoftirqd/2:   23] ------------[ cut here ]------------
[  190.447777]I[2:    ksoftirqd/2:   23] WARNING: CPU: 2 PID: 23 at drivers/usb/dwc3/gadget.c:3632 dwc3_gadget_ep_dequeue+0x228/0x234
[  190.449587]I[2:    ksoftirqd/2:   23] pc : dwc3_gadget_ep_dequeue+0x228/0x234
[  190.449610]I[2:    ksoftirqd/2:   23] lr : dwc3_gadget_ep_dequeue+0x124/0x234
[  190.449629]I[2:    ksoftirqd/2:   23] sp : ffffffc01294bad0
[  190.449647]I[2:    ksoftirqd/2:   23] x29: ffffffc01294baf0 x28: 0000000000000000 
[  190.449679]I[2:    ksoftirqd/2:   23] x27: ffffffc0121e4000 x26: ffffff88208c9dc0 
[  190.449711]I[2:    ksoftirqd/2:   23] x25: ffffff893dd5c570 x24: ffffff893dd22080 
[  190.449742]I[2:    ksoftirqd/2:   23] x23: ffffff88a4f8a168 x22: ffffff88a4f8a100 
[  190.449773]I[2:    ksoftirqd/2:   23] x21: ffffff893dd5c500 x20: 0000000000000080 
[  190.449803]I[2:    ksoftirqd/2:   23] x19: ffffff893dd22238 x18: ffffffc01291d068 
[  190.449834]I[2:    ksoftirqd/2:   23] x17: 0000000000004000 x16: ffffff88bb0cc000 
[  190.449864]I[2:    ksoftirqd/2:   23] x15: 0000000000000200 x14: ffffff88208d4d00 
[  190.449895]I[2:    ksoftirqd/2:   23] x13: ffffff88208d4d00 x12: 00000000000001b5 
[  190.449926]I[2:    ksoftirqd/2:   23] x11: ffffff88208c9dc0 x10: 0000000100000103 
[  190.449958]I[2:    ksoftirqd/2:   23] x9 : ffffff88208c9dc0 x8 : 0000000000000008 
[  190.449989]I[2:    ksoftirqd/2:   23] x7 : fefefefefefefefe x6 : 0000000000008080 
[  190.450020]I[2:    ksoftirqd/2:   23] x5 : 0000000000000000 x4 : 0000000000000103 
[  190.450050]I[2:    ksoftirqd/2:   23] x3 : 0000000000000080 x2 : ffffff88208ac900 
[  190.450081]I[2:    ksoftirqd/2:   23] x1 : ffffff88208ac900 x0 : 00000000ffffff92 
[  190.450120]I[2:    ksoftirqd/2:   23] Call trace:
[  190.450153]I[2:    ksoftirqd/2:   23]  dwc3_gadget_ep_dequeue+0x228/0x234
[  190.450187]I[2:    ksoftirqd/2:   23]  usb_ep_dequeue+0x4c/0xa0
[  190.450219]I[2:    ksoftirqd/2:   23]  ffs_aio_cancel+0x4c/0x84
[  190.450247]I[2:    ksoftirqd/2:   23]  free_ioctx_users+0x80/0x110
[  190.450281]I[2:    ksoftirqd/2:   23]  percpu_ref_put_many+0xf0/0x164
[  190.450310]I[2:    ksoftirqd/2:   23]  percpu_ref_switch_to_atomic_rcu+0x13c/0x1dc
[  190.450339]I[2:    ksoftirqd/2:   23]  rcu_do_batch+0x208/0x5b4
[  190.450360]I[2:    ksoftirqd/2:   23]  rcu_core+0x22c/0x43c
[  190.450384]I[2:    ksoftirqd/2:   23]  rcu_core_si+0x14/0x24
[  190.450409]I[2:    ksoftirqd/2:   23]  efi_header_end+0x134/0x350
[  190.450437]I[2:    ksoftirqd/2:   23]  run_ksoftirqd+0x34/0x74

The point is there would be some requests pending without giveback-ed.
They're from removing function driver(ffs_aio_cancel). It could cause
the function driver would wait for dequeue requests completed.

Best Regards,
Jung Daehwan

> Thanks,
> Thinh


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ