Message-ID: <413dcf9f-25a5-61f5-159f-a75e7b1f1522@redhat.com>
Date: Mon, 13 May 2019 09:44:46 +0200
From: Hans de Goede <hdegoede@...hat.com>
To: Uenal Mutlu <um@...luit.com>, Jens Axboe <axboe@...nel.dk>,
Maxime Ripard <maxime.ripard@...tlin.com>,
Chen-Yu Tsai <wens@...e.org>, linux-ide@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc: linux-sunxi@...glegroups.com, linux-amarula@...rulasolutions.com,
Jagan Teki <jagan@...rulasolutions.com>,
Pablo Greco <pgreco@...tosproject.org>,
Mark Rutland <mark.rutland@....com>,
Oliver Schinagl <oliver@...inagl.nl>,
Linus Walleij <linus.walleij@...aro.org>,
FUKAUMI Naoki <naobsd@...il.com>,
Andre Przywara <andre.przywara@....com>,
Stefan Monnier <monnier@....umontreal.ca>
Subject: Re: [RFC PATCH v2 RESEND] drivers: ata: ahci_sunxi: Increased
SATA/AHCI DMA TX/RX FIFOs
Hi,
On 12-05-19 22:59, Uenal Mutlu wrote:
> Increasing the SATA/AHCI DMA TX/RX FIFOs (P0DMACR.TXTS and .RXTS, i.e.
> TX_TRANSACTION_SIZE and RX_TRANSACTION_SIZE) from the default 0x0 each
> to 0x3 each raises write performance to 120 MiB/s to 132 MiB/s,
> up from a lame 36 MiB/s to 45 MiB/s previously.
> Read performance is about 200 MiB/s.
> [tested on SSD using dd bs=2K/4K/8K/12K/16K/24K/32K: peak-perf at 12K].
>
> Tested on the Banana Pi R1 (aka Lamobo R1) and Banana Pi M1 SBCs
> with Allwinner A20 32bit-SoCs (ARMv7-a / arm-linux-gnueabihf).
> These devices are RaspberryPi-like small devices.
>
> This problem of slow SATA write speed on these small devices has persisted
> for more than 5 years. Many commentators throughout the years wrongly
> assumed the slow write speed was a hardware limitation. This patch finally
> solves the problem, which in fact was just a hard-to-fix software problem
> (b/c of lack of documentation by the SoC-maker Allwinner Technology).
>
> RFC: Since more than about 25 similar SBC/SoC models do use the
> ahci_sunxi driver, users are encouraged to test it on all the
> affected boards and give feedback
The SATA controller on these boards is inside the A10/A20 SoC, and the
A10 and A20 use the same controller, so it is the same on all the boards.
IOW, I don't see the patch having been tested on only 1 board as a reason
for it to be RFC.
> Lists of the affected sunxi and other boards and SoCs with SATA using
> the ahci_sunxi driver:
> $ grep -i -e "^&ahci" arch/arm/boot/dts/sun*dts
> and http://linux-sunxi.org/SATA#Devices_with_SATA_ports
> See also http://linux-sunxi.org/Category:Devices_with_SATA_port
>
> Patch v2:
> - Commented the patch in-place in ahci_sunxi.c
> - With bs=12K and no conv=... passed to dd, the write performance
> rises further to 132 MiB/s
> - Changed MB/s to MiB/s
> - Posted the story behind the patch:
> http://lkml.iu.edu/hypermail/linux/kernel/1905.1/03506.html
> - Posted a dd test script to find optimal bs, and some results:
> https://bit.ly/2YoOzEM
>
> Patch v1:
> - States bs=4K for dd and a write performance of 120 MiB/s
>
> Signed-off-by: Uenal Mutlu <um@...luit.com>
> ---
> drivers/ata/ahci_sunxi.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ata/ahci_sunxi.c b/drivers/ata/ahci_sunxi.c
> index 911710643305..ed19f19808c5 100644
> --- a/drivers/ata/ahci_sunxi.c
> +++ b/drivers/ata/ahci_sunxi.c
> @@ -157,8 +157,51 @@ static void ahci_sunxi_start_engine(struct ata_port *ap)
> void __iomem *port_mmio = ahci_port_base(ap);
> struct ahci_host_priv *hpriv = ap->host->private_data;
>
> - /* Setup DMA before DMA start */
> - sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ff00, 0x00004400);
> + /* Setup DMA before DMA start
> + *
> + * NOTE: A similar SoC with SATA/AHCI by Texas Instruments documents
> + * this Vendor Specific Port (P0DMACR, aka PxDMACR) in its
> + * User's Guide document (TMS320C674x/OMAP-L1x Processor
> + * Serial ATA (SATA) Controller, Literature Number: SPRUGJ8C,
> + * March 2011, Chapter 4.33 Port DMA Control Register (P0DMACR),
> + * p.68, https://www.ti.com/lit/ug/sprugj8c/sprugj8c.pdf)
> + * as equivalent to the following struct:
> + *
> + * struct AHCI_P0DMACR_t
> + * {
> + * unsigned TXTS : 4,
> + * RXTS : 4,
> + * TXABL : 4,
> + * RXABL : 4,
> + * Reserved : 16;
> + * };
> + *
> + * TXTS: Transmit Transaction Size (TX_TRANSACTION_SIZE).
> + * This field defines the DMA transaction size in DWORDs for
> + * transmit (system bus read, device write) operation. [...]
> + *
> + * RXTS: Receive Transaction Size (RX_TRANSACTION_SIZE).
> + * This field defines the Port DMA transaction size in DWORDs
> + * for receive (system bus write, device read) operation. [...]
> + *
> + * TXABL: Transmit Burst Limit.
> + * This field allows software to limit the VBUSP master read
> + * burst size. [...]
> + *
> + * RXABL: Receive Burst Limit.
> + * Allows software to limit the VBUSP master write burst
> + * size. [...]
> + *
> + * Reserved: Reserved.
> + *
> + *
> + * NOTE: According to the above document, the following alternative
> + * to the code below could perhaps be a better option
> + * (or preparation) for possible further improvements later:
> + * sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ffff,
> + * 0x00000033);
> + */
> + sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ffff, 0x00004433);
Have you tried / benchmarked the 0x00000033 option?
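To make the comparison concrete, here is a small user-space sketch (not driver code) of what the three mask/value pairs do to the register. It assumes the field layout from the struct quoted in the comment above (TXTS in bits 3:0, RXTS in 7:4, TXABL in 11:8, RXABL in 15:12) and that the helper does a plain clear-then-set. The names clrsetbits_value, p0dmacr_field and p0dmacr_dump are made up for illustration, and the starting value 0 passed in the usage note is only an assumption, not a documented reset value.

```c
#include <stdint.h>
#include <stdio.h>

/* Field positions, taken from the struct quoted in the patch comment. */
#define P0DMACR_TXTS_SHIFT   0
#define P0DMACR_RXTS_SHIFT   4
#define P0DMACR_TXABL_SHIFT  8
#define P0DMACR_RXABL_SHIFT 12

/* Pure-value model of a clear-then-set update; no MMIO is touched. */
uint32_t clrsetbits_value(uint32_t reg, uint32_t clr, uint32_t set)
{
	return (reg & ~clr) | set;
}

/* Extract one 4-bit field from a register value. */
unsigned int p0dmacr_field(uint32_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xfu;
}

/* Print all four fields of a register value on one line. */
void p0dmacr_dump(const char *label, uint32_t reg)
{
	printf("%s: TXTS=%u RXTS=%u TXABL=%u RXABL=%u\n", label,
	       p0dmacr_field(reg, P0DMACR_TXTS_SHIFT),
	       p0dmacr_field(reg, P0DMACR_RXTS_SHIFT),
	       p0dmacr_field(reg, P0DMACR_TXABL_SHIFT),
	       p0dmacr_field(reg, P0DMACR_RXABL_SHIFT));
}
```

For example, p0dmacr_dump("old", clrsetbits_value(0, 0x0000ff00, 0x00004400)) shows only the two burst limits set to 4, while the 0x0000ffff / 0x00004433 pair from the patch additionally sets TXTS and RXTS to 3. The 0x0000ffff / 0x00000033 alternative sets TXTS and RXTS to 3 but clears both burst-limit fields, which is why a separate benchmark of it would be interesting.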
Regards,
Hans