Message-ID: <02E7334B1630744CBDC55DA8586225837F885FFD@ORSMSX103.amr.corp.intel.com>
Date:   Mon, 20 May 2019 21:52:02 +0000
From:   "Yang, Fei" <fei.yang@...el.com>
To:     John Stultz <john.stultz@...aro.org>
CC:     Andrzej Pietrasiewicz <andrzej.p@...labora.com>,
        Felipe Balbi <balbi@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Chen Yu <chenyu56@...wei.com>,
        lkml <linux-kernel@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        Amit Pundir <amit.pundir@...aro.org>,
        "Marek Szyprowski" <m.szyprowski@...sung.com>,
        "kernel@...labora.com" <kernel@...labora.com>
Subject: RE: [REGRESSION] usb: gadget: f_fs: Allow scatter-gather buffers

>>>> One question that comes to my mind is this: Does the USB 
>>>> transmission stall (e.g. endpoint stall) or not? In other words, is 
>>>> adb connection broken because USB stops transmitting anything, or 
>>>> because the data is transmitted but its integrity is broken during 
>>>> transmission and that causes adb/adbd confusion which results in stopping their operation?
>>>> Does anything keep happening on FunctionFS when adb connection is 
>>>> broken?
>>>
>>>Any discoveries about the problem?
>>
>> In my debugging, I'm seeing a lot of requests queued up through 
>> ffs_epfile_io (returning -EIOCBQUEUED), but only a few of them came back through ffs_epfile_async_io_complete -> ffs_user_copy_worker.
>> I don’t think there is a USB transmission stall though, because if I 
>> manually disable io_data->use_sg, everything goes back to normal. So it looks more likely to be a buffer handling problem in the DWC3 driver.
>
> Yea, I also did reconfirm that reverting 772a7a724f6, or setting
> gadget->sg_supported to false makes the issue go away.
>
> I spent a bunch of time last week trying to trace through the code, in particular the sg_supported checks, but I'm not seeing anything that stands out in the f_fs logic.
>
> I'd start to agree it might be a buffer handling problem in dwc3, but it feels odd that I'm also seeing this w/ dwc2 hardware as well. Maybe the same bug was copied into both drivers?
>
> I'll try to dig a little on that theory today.

One of the problems appears to be that req->num_mapped_sgs was left uninitialized. I made the following change and got a lot more requests completed.
However, this change is not sufficient to solve the adb issue: the USB requests still eventually get stuck without a matching ffs_epfile_async_io_complete.

@@ -1067,6 +1067,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
                        req->buf = NULL;
                        req->sg = io_data->sgt.sgl;
                        req->num_sgs = io_data->sgt.nents;
+                       req->num_mapped_sgs = req->num_sgs;
                } else {
                        req->buf = data;
                }
@@ -1110,6 +1111,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
                        req->buf = NULL;
                        req->sg = io_data->sgt.sgl;
                        req->num_sgs = io_data->sgt.nents;
+                       req->num_mapped_sgs = req->num_sgs;
                } else {
                        req->buf = data;
                }
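
As a rough illustration of why an uninitialized num_mapped_sgs could produce the symptom described above, here is a minimal user-space sketch. Everything in it is hypothetical: struct mock_request, mock_prepare_sg() and mock_entries_to_transfer() are stand-ins invented for this example, and the way the mock "controller" consumes the field is an assumption, not a description of the dwc3 or dwc2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the request fields touched by the diff above. */
struct mock_request {
    void *buf;
    int num_sgs;        /* entries supplied by the function driver */
    int num_mapped_sgs; /* entries the mock controller will walk */
};

/*
 * Hypothetical controller-side decision (an assumption for illustration):
 * walk num_mapped_sgs entries if it is positive, otherwise treat the
 * request as a single linear buffer.
 */
static int mock_entries_to_transfer(const struct mock_request *req)
{
    assert(req != NULL);
    if (req->num_mapped_sgs > 0)
        return req->num_mapped_sgs;
    return 1;
}

/*
 * Fill a request for a scatter-gather transfer. When init_mapped is
 * non-zero, num_mapped_sgs is set from num_sgs as in the diff above;
 * otherwise whatever value was already in the field is left in place.
 */
static void mock_prepare_sg(struct mock_request *req, int nents, int init_mapped)
{
    req->buf = NULL;
    req->num_sgs = nents;
    if (init_mapped)
        req->num_mapped_sgs = req->num_sgs;
}
```

With a recycled request that still carries a stale num_mapped_sgs of 7, preparing a 3-entry transfer without the initialization makes the mock controller walk 7 entries; with the initialization it walks 3. Whether the real drivers behave this way is exactly what remains to be confirmed.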
