[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <2e01c435-9ecc-4e3b-f55c-612a86667020@codeaurora.org>
Date: Thu, 17 Jun 2021 15:25:23 -0700
From: Wesley Cheng <wcheng@...eaurora.org>
To: Ferry Toth <fntoth@...il.com>, balbi@...nel.org,
gregkh@...uxfoundation.org, robh+dt@...nel.org, agross@...nel.org,
bjorn.andersson@...aro.org, frowand.list@...il.com
Cc: linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
devicetree@...r.kernel.org, linux-arm-msm@...r.kernel.org,
jackp@...eaurora.org, heikki.krogerus@...ux.intel.com,
andy.shevchenko@...il.com
Subject: Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP
bursting
Hi,
On 6/17/2021 2:55 PM, Ferry Toth wrote:
> Hi
>
> Op 17-06-2021 om 23:48 schreef Wesley Cheng:
>> Hi,
>>
>> On 6/17/2021 2:01 PM, Ferry Toth wrote:
>>> Hi
>>>
>>> Op 17-06-2021 om 11:58 schreef Wesley Cheng:
>>>> Changes in V10:
>>>> - Fixed compilation errors in config where OF is not used (error due to
>>>> unknown symbol for of_add_property()). Add of_add_property() stub.
>>>> - Fixed compilation warning for incorrect argument being passed to
>>>> dwc3_mdwidth
>>> This fixes the OOPS I had in V9. I do not see any change in performance
>>> on Merrifield though.
>> I see...thanks Ferry! With your testing, are you writing to the device's
>> internal storage (ie UFS, eMMC, etc...), or did you use a ramdisk as well?
>
> In this case I just tested the EEM path using iperf3.
>
Got it. I don't believe f_eem will use a high enough (if at all)
bMaxBurst value to change the TXFIFO size.
>> If not with a ramdisk, we might want to give that a try to avoid the
>> storage path being the bottleneck. You can use "dd" to create an empty
>> file, and then just use that as the LUN's backing file.
>>
>> echo ramdisk.img >
>> /sys/kernel/config/usb_gadget/g1/functions/mass_storage.0/lun.0/file
>
> Ah, why didn't I think of that. I have currently mass storage setup with
> eMMC but it seems that is indeed the bottleneck.
>
:), not a problem...I've been working on getting the ideal set up for
the performance profiling for awhile, so anything I can do to make sure
we get some good results.
> I'll try with a ramdisk and let you know.
>
Thanks again for the testing, Ferry.
Thanks
Wesley Cheng
>> Thanks
>> Wesley Cheng
>>
>>>> Changes in V9:
>>>> - Fixed incorrect patch in series. Removed changes in DTSI, as
>>>> dwc3-qcom will
>>>> add the property by default from the kernel.
>>>>
>>>> Changes in V8:
>>>> - Rebased to usb-testing
>>>> - Using devm_kzalloc for adding txfifo property in dwc3-qcom
>>>> - Removed DWC3 QCOM ACPI property for enabling the txfifo resize
>>>>
>>>> Changes in V7:
>>>> - Added a new property tx-fifo-max-num for limiting how much fifo
>>>> space the
>>>> resizing logic can allocate for endpoints with large burst
>>>> values. This
>>>> can differ across platforms, and tie in closely with overall
>>>> system latency.
>>>> - Added recommended checks for DWC32.
>>>> - Added changes to set the tx-fifo-resize property from dwc3-qcom by
>>>> default
>>>> instead of modifying the current DTSI files.
>>>> - Added comments on all APIs/variables introduced.
>>>> - Updated the DWC3 YAML to include a better description of the
>>>> tx-fifo-resize
>>>> property and added an entry for tx-fifo-max-num.
>>>>
>>>> Changes in V6:
>>>> - Rebased patches to usb-testing.
>>>> - Renamed to PATCH series instead of RFC.
>>>> - Checking for fs_descriptors instead of ss_descriptors for
>>>> determining the
>>>> endpoint count for a particular configuration.
>>>> - Re-ordered patch series to fix patch dependencies.
>>>>
>>>> Changes in V5:
>>>> - Added check_config() logic, which is used to communicate the
>>>> number of EPs
>>>> used in a particular configuration. Based on this, the DWC3
>>>> gadget driver
>>>> has the ability to know the maximum number of eps utilized in all
>>>> configs.
>>>> This helps reduce unnecessary allocation to unused eps, and will
>>>> catch fifo
>>>> allocation issues at bind() time.
>>>> - Fixed variable declaration to single line per variable, and
>>>> reverse xmas.
>>>> - Created a helper for fifo clearing, which is used by ep0.c
>>>>
>>>> Changes in V4:
>>>> - Removed struct dwc3* as an argument for dwc3_gadget_resize_tx_fifos()
>>>> - Removed WARN_ON(1) in case we run out of fifo space
>>>> Changes in V3:
>>>> - Removed "Reviewed-by" tags
>>>> - Renamed series back to RFC
>>>> - Modified logic to ensure that fifo_size is reset if we pass the
>>>> minimum
>>>> threshold. Tested with binding multiple FDs requesting 6 FIFOs.
>>>>
>>>> Changes in V2:
>>>> - Modified TXFIFO resizing logic to ensure that each EP is reserved a
>>>> FIFO.
>>>> - Removed dev_dbg() prints and fixed typos from patches
>>>> - Added some more description on the dt-bindings commit message
>>>>
>>>> Currently, there is no functionality to allow for resizing the
>>>> TXFIFOs, and
>>>> relying on the HW default setting for the TXFIFO depth. In most
>>>> cases, the
>>>> HW default is probably sufficient, but for USB compositions that contain
>>>> multiple functions that require EP bursting, the default settings
>>>> might not be enough. Also to note, the current SW will assign an EP to a
>>>> function driver w/o checking to see if the TXFIFO size for that
>>>> particular
>>>> EP is large enough. (this is a problem if there are multiple HW defined
>>>> values for the TXFIFO size)
>>>>
>>>> It is mentioned in the SNPS databook that a minimum of TX FIFO depth = 3
>>>> is required for an EP that supports bursting. Otherwise, there may be
>>>> frequent occurences of bursts ending. For high bandwidth functions,
>>>> such as data tethering (protocols that support data aggregation), mass
>>>> storage, and media transfer protocol (over FFS), the bMaxBurst value
>>>> can be
>>>> large, and a bigger TXFIFO depth may prove to be beneficial in terms
>>>> of USB
>>>> throughput. (which can be associated to system access latency,
>>>> etc...) It
>>>> allows for a more consistent burst of traffic, w/o any interruptions, as
>>>> data is readily available in the FIFO.
>>>>
>>>> With testing done using the mass storage function driver, the results
>>>> show
>>>> that with a larger TXFIFO depth, the bandwidth increased significantly.
>>>>
>>>> Test Parameters:
>>>> - Platform: Qualcomm SM8150
>>>> - bMaxBurst = 6
>>>> - USB req size = 256kB
>>>> - Num of USB reqs = 16
>>>> - USB Speed = Super-Speed
>>>> - Function Driver: Mass Storage (w/ ramdisk)
>>>> - Test Application: CrystalDiskMark
>>>>
>>>> Results:
>>>>
>>>> TXFIFO Depth = 3 max packets
>>>>
>>>> Test Case | Data Size | AVG tput (in MB/s)
>>>> -------------------------------------------
>>>> Sequential|1 GB x |
>>>> Read |9 loops | 193.60
>>>> | | 195.86
>>>> | | 184.77
>>>> | | 193.60
>>>> -------------------------------------------
>>>>
>>>> TXFIFO Depth = 6 max packets
>>>>
>>>> Test Case | Data Size | AVG tput (in MB/s)
>>>> -------------------------------------------
>>>> Sequential|1 GB x |
>>>> Read |9 loops | 287.35
>>>> | | 304.94
>>>> | | 289.64
>>>> | | 293.61
>>>> -------------------------------------------
>>>>
>>>> Wesley Cheng (6):
>>>> usb: gadget: udc: core: Introduce check_config to verify USB
>>>> configuration
>>>> usb: gadget: configfs: Check USB configuration before adding
>>>> usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
>>>> of: Add stub for of_add_property()
>>>> usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default
>>>> dt-bindings: usb: dwc3: Update dwc3 TX fifo properties
>>>>
>>>> .../devicetree/bindings/usb/snps,dwc3.yaml | 15 +-
>>>> drivers/usb/dwc3/core.c | 9 +
>>>> drivers/usb/dwc3/core.h | 15 ++
>>>> drivers/usb/dwc3/dwc3-qcom.c | 9 +
>>>> drivers/usb/dwc3/ep0.c | 2 +
>>>> drivers/usb/dwc3/gadget.c | 212
>>>> +++++++++++++++++++++
>>>> drivers/usb/gadget/configfs.c | 22 +++
>>>> drivers/usb/gadget/udc/core.c | 25 +++
>>>> include/linux/of.h | 5 +
>>>> include/linux/usb/gadget.h | 5 +
>>>> 10 files changed, 317 insertions(+), 2 deletions(-)
>>>>
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Powered by blists - more mailing lists