Message-ID: <20220926210329.GI20022@pengutronix.de>
Date:   Mon, 26 Sep 2022 23:03:29 +0200
From:   Michael Grzeschik <mgr@...gutronix.de>
To:     Dan Vacura <w36195@...orola.com>
Cc:     linux-usb@...r.kernel.org,
        Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        Laurent Pinchart <laurent.pinchart@...asonboard.com>,
        Felipe Balbi <balbi@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/1] uvc gadget sg performance issues

On Mon, Sep 26, 2022 at 03:51:31PM -0500, Dan Vacura wrote:
>Hi Michael, thanks for the prompt reply!
>
>On Mon, Sep 26, 2022 at 10:15:41PM +0200, Michael Grzeschik wrote:
>> Hi Dan!
>>
>> On Mon, Sep 26, 2022 at 02:53:06PM -0500, Dan Vacura wrote:
>> >
>> > Hello uvc gadget developers,
>> >
>> > I'm working on a 5.15.41 based kernel on a qcom chipset with the dwc3
>> > controller and I'm encountering two problems related to the recent performance
>> > improvement changes:
>>
>> What about that odd kernel version? UVC is under heavy development; if
>> you plan to work with this code, you should probably test top of tree.
>
>Yes, it's a bit behind and it looks like some of the initial work you
>did for scatter/gather got pulled into the 5.15 tree, but subsequent
>changes didn't. I don't have much control over the kernel versioning as
>we're part of the GKI Android initiative:
>https://source.android.com/docs/core/architecture/kernel/generic-kernel-image
>and we can only work off of what is provided, like this release line:
>https://android.googlesource.com/kernel/common/+log/refs/heads/android13-5.15
>
>Perhaps we can revert these changes for the 5.15 kernel (and other
>versions) they were not intended for?


This, or we can find out which other patches are intended to be pulled
into stable, so that we can improve things overall.

Anyway, if you can narrow down which patches are hurting, feel free
to suggest those for revert. I am not very interested in v5.15.

>> > https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-5-m.grzeschik@pengutronix.de/  and
>> > https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-6-m.grzeschik@pengutronix.de/
>> >
>> > If I revert these two changes, then I have much improved stability and a
>> > transmission problem I'm seeing is gone. Has there been any success from
>> > others on 5.15 with this uvc improvement and any recommendations for my
>> > current problems?  Those being:
>> >
>> > 1) a smmu panic, snippet here: 
>> >
>> >    <3>[  718.314900][  T803] arm-smmu 15000000.apps-smmu: Unhandled arm-smmu context fault from a600000.dwc3!
>> >    <3>[  718.314994][  T803] arm-smmu 15000000.apps-smmu: FAR    = 0x00000000efe60800
>> >    <3>[  718.315023][  T803] arm-smmu 15000000.apps-smmu: PAR    = 0x0000000000000000
>> >    <3>[  718.315048][  T803] arm-smmu 15000000.apps-smmu: FSR    = 0x40000402 [TF R SS ]
>> >    <3>[  718.315074][  T803] arm-smmu 15000000.apps-smmu: FSYNR0    = 0x5f0003
>> >    <3>[  718.315096][  T803] arm-smmu 15000000.apps-smmu: FSYNR1    = 0xaa02
>> >    <3>[  718.315117][  T803] arm-smmu 15000000.apps-smmu: context bank#    = 0x1b
>> >    <3>[  718.315141][  T803] arm-smmu 15000000.apps-smmu: TTBR0  = 0x001b0000c2a92000
>> >    <3>[  718.315165][  T803] arm-smmu 15000000.apps-smmu: TTBR1  = 0x001b000000000000
>> >    <3>[  718.315192][  T803] arm-smmu 15000000.apps-smmu: SCTLR  = 0x0a5f00e7 ACTLR  = 0x00000003
>> >    <3>[  718.315245][  T803] arm-smmu 15000000.apps-smmu: CBAR  = 0x0001f300
>> >    <3>[  718.315274][  T803] arm-smmu 15000000.apps-smmu: MAIR0   = 0xf404ff44 MAIR1   = 0x0000efe4
>> >    <3>[  718.315297][  T803] arm-smmu 15000000.apps-smmu: SID = 0x40
>> >    <3>[  718.315318][  T803] arm-smmu 15000000.apps-smmu: Client info: BID=0x5, PID=0xa, MID=0x2
>> >    <3>[  718.315377][  T803] arm-smmu 15000000.apps-smmu: soft iova-to-phys=0x0000000000000000
>> >
>> >    I can reduce this panic with the proposed patch, but it still happens until I
>> >    disable the "req->no_interrupt = 1" logic.
>> >
>> > 2) The frame is not fully transmitted in dwc3 with sg support enabled.
>> >
>> >    There seems to be a mapping limit: only roughly the first 70% of the
>> >    total frame is sent. Interestingly, if I allocate a larger buffer
>> >    upfront in uvc_queue_setup(), e.g. sizes[0] = video->imagesize * 3,
>> >    then the issue rarely happens. For example, when I do YUYV I see
>> >    green, uninitialized data at the bottom part of the frame. If I do
>> >    MJPG with smaller filled sizes, the transmission is fine.
>> >
>> >    +-------------------------+
>> >    |                         |
>> >    |                         |
>> >    |                         |
>> >    |      Good data          |
>> >    |                         |
>> >    |                         |
>> >    |                         |
>> >    +-------------------------+
>> >    |xxxxxxxxxxxxxxxxxxxxxxxxx|
>> >    |xxxx  Bad data  xxxxxxxxx|
>> >    |xxxxxxxxxxxxxxxxxxxxxxxxx|
>> >    +-------------------------+
>> >
>> >
>> > Appreciate any thoughts or feedback related to these issues.
>>
>> Anyway, this is probably due to the frames being given back too early to
>> the frame producer. We have the following patches in mainline now to fix this issue:
>>
>> aef11279888c00e1841a3533a35d279285af3a51 usb: gadget: uvc: improve sg exit condition
>> 9b969f93bcef9b3d9e92f1810e22bbd6c344a0e5 usb: gadget: uvc: giveback vb2 buffer on req complete
>
>Yes, I did grab those in addition to some other necessary changes, noted
>in my patch here:
>https://patchwork.kernel.org/project/linux-usb/patch/20220926195307.110121-2-w36195@motorola.com/
>I also pulled a lot of patches from the dwc3 to be almost at parity with
>top of tree for the core.c/h, ep0.c/h, and gadget.c/h files, but these
>issues persisted.
>
>Out of curiosity, have you tested these changes with dwc3 and if so,
>have you tried "usb: gadget: uvc: decrease the interrupt load to a
>quarter" with scatter/gather disabled? For me the crash occurs more
>often.

I remember seeing some issues when I switched from sg to memcopy
recently. This happened somewhere in between working on those sg fixes.
It was unexpected, and there was no obvious code change that could have
broken things, since that already worked when I implemented my first sg
series. But that memcopy issue somehow fell off my table. I will look
into it again for sure. I bet the dwc3 gadget driver got some late
patches that could have broken things here, so it is probably best to
look into those dwc3 patches and bisect them.

Regards,
Michael

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
