Date:   Wed, 13 Jan 2021 16:13:06 +0530
From:   Vinod Koul <vkoul@...nel.org>
To:     Péter Ujfalusi <peter.ujfalusi@...il.com>
Cc:     dan.j.williams@...el.com, linux-kernel@...r.kernel.org,
        dmaengine@...r.kernel.org, vigneshr@...com,
        grygorii.strashko@...com, kishon@...com
Subject: Re: [PATCH 2/2] dmaengine: ti: k3-udma: Add support for burst_size
 configuration for mem2mem

On 13-01-21, 09:39, Péter Ujfalusi wrote:
> Hi Vinod,
> 
> On 1/12/21 12:16 PM, Vinod Koul wrote:
> > On 14-12-20, 10:13, Peter Ujfalusi wrote:
> >> The UDMA and BCDMA can provide higher throughput if the burst_size of the
> >> channel is changed from its default (which is 64 bytes) for Ultra-high
> >> and high capacity channels.
> >>
> >> This performance benefit is even more visible when the buffers are aligned
> >> with the burst_size configuration.
> >>
> >> The am654 does not have a way to change the burst size, but it is using
> >> 64 bytes burst, so increasing the copy_align from 8 bytes to 64 (and
> >> clients taking that into account) can increase the throughput as well.
> >>
> >> Numbers gathered on j721e:
> >> echo 8000000 > /sys/module/dmatest/parameters/test_buf_size
> >> echo 2000 > /sys/module/dmatest/parameters/timeout
> >> echo 50 > /sys/module/dmatest/parameters/iterations
> >> echo 1 > /sys/module/dmatest/parameters/max_channels
> >>
> >> Prior to this patch:    ~1.3 GB/s
> >> After this patch:       ~1.8 GB/s
> >>  with 1 byte alignment: ~1.7 GB/s
> >>
> >> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@...com>
> >> ---
> >>  drivers/dma/ti/k3-udma.c | 115 +++++++++++++++++++++++++++++++++++++--
> >>  1 file changed, 110 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
> >> index 87157cbae1b8..54e4ccb1b37e 100644
> >> --- a/drivers/dma/ti/k3-udma.c
> >> +++ b/drivers/dma/ti/k3-udma.c
> >> @@ -121,6 +121,11 @@ struct udma_oes_offsets {
> >>  #define UDMA_FLAG_PDMA_ACC32		BIT(0)
> >>  #define UDMA_FLAG_PDMA_BURST		BIT(1)
> >>  #define UDMA_FLAG_TDTYPE		BIT(2)
> >> +#define UDMA_FLAG_BURST_SIZE		BIT(3)
> >> +#define UDMA_FLAGS_J7_CLASS		(UDMA_FLAG_PDMA_ACC32 | \
> >> +					 UDMA_FLAG_PDMA_BURST | \
> >> +					 UDMA_FLAG_TDTYPE | \
> >> +					 UDMA_FLAG_BURST_SIZE)
> >>  
> >>  struct udma_match_data {
> >>  	enum k3_dma_type type;
> >> @@ -128,6 +133,7 @@ struct udma_match_data {
> >>  	bool enable_memcpy_support;
> >>  	u32 flags;
> >>  	u32 statictr_z_mask;
> >> +	u8 burst_size[3];
> >>  };
> >>  
> >>  struct udma_soc_data {
> >> @@ -436,6 +442,18 @@ static void k3_configure_chan_coherency(struct dma_chan *chan, u32 asel)
> >>  	}
> >>  }
> >>  
> >> +static u8 udma_get_chan_tpl_index(struct udma_tpl *tpl_map, int chan_id)
> >> +{
> >> +	int i;
> >> +
> >> +	for (i = 0; i < tpl_map->levels; i++) {
> >> +		if (chan_id >= tpl_map->start_idx[i])
> >> +			return i;
> >> +	}
> > 
> > Braces seem not required
> 
> True, they are not strictly needed but I prefer to have them when I have
> any condition in the loop.

ok
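
For reference, the lookup in the hunk above can be sketched outside the
driver roughly as below. This is only an illustration: struct tpl_map is a
hypothetical stand-in for struct udma_tpl, the descending order of
start_idx[] and the final "return 0" fallback are assumptions (the quoted
hunk is trimmed before the end of the function), and the braces around the
single-statement loop body are kept as in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's struct udma_tpl. */
struct tpl_map {
	int levels;       /* number of throughput levels in use */
	int start_idx[3]; /* assumed sorted in descending order */
};

/* Return the index of the first level whose start index is <= chan_id. */
static int get_chan_tpl_index(const struct tpl_map *map, int chan_id)
{
	int i;

	assert(map->levels >= 0 && map->levels <= 3);

	for (i = 0; i < map->levels; i++) {
		if (chan_id >= map->start_idx[i])
			return i;
	}

	/* Assumed fallback when no level matches. */
	return 0;
}
```

With start_idx = { 8, 4, 0 } and levels = 3 (made-up numbers), channel 10
maps to index 0, channel 5 to index 1 and channel 2 to index 2.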

> >>  static void udma_reset_uchan(struct udma_chan *uc)
> >>  {
> >>  	memset(&uc->config, 0, sizeof(uc->config));
> >> @@ -1811,6 +1829,7 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
> >>  	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
> >>  	struct udma_tchan *tchan = uc->tchan;
> >>  	struct udma_rchan *rchan = uc->rchan;
> >> +	u8 burst_size = 0;
> >>  	int ret = 0;
> >>  
> >>  	/* Non synchronized - mem to mem type of transfer */
> >> @@ -1818,6 +1837,12 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
> >>  	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
> >>  	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
> >>  
> >> +	if (ud->match_data->flags & UDMA_FLAG_BURST_SIZE) {
> >> +		u8 tpl = udma_get_chan_tpl_index(&ud->tchan_tpl, tchan->id);
> > 
> > > Can we define the variable at the function start, please?
> 
> The 'tpl' is only used within this if branch, it looks a bit cleaner
> imho, but if you insist, I can move the definition.

Yeah, let's be consistent and keep them at the start of the function,
please.
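
Something along these lines is what I mean, as a rough standalone sketch
rather than the actual driver code. The struct, the FLAG_BURST_SIZE name,
the lookup_tpl() helper and its threshold are all hypothetical, and treating
a burst_size of 0 as "leave the default alone" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_BURST_SIZE (1u << 3) /* hypothetical, mirrors BIT(3) in the patch */

/* Hypothetical cut-down version of the match data. */
struct match_data {
	uint32_t flags;
	uint8_t burst_size[3];
};

/* Hypothetical lookup: channels 4 and up are level 0, the rest level 1. */
static uint8_t lookup_tpl(int chan_id)
{
	return chan_id >= 4 ? 0 : 1;
}

static uint8_t pick_burst_size(const struct match_data *md, int chan_id)
{
	/* All locals are declared here, not inside the if branch. */
	uint8_t burst_size = 0; /* assumed to mean "keep the default" */
	uint8_t tpl;

	assert(md);

	if (md->flags & FLAG_BURST_SIZE) {
		tpl = lookup_tpl(chan_id);
		burst_size = md->burst_size[tpl];
	}

	return burst_size;
}
```

The behaviour is the same as with the declaration inside the branch, only
the placement of 'tpl' changes.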

> >> +	switch (match_data->burst_size[tpl]) {
> >> +		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_256_BYTES:
> >> +			return DMAENGINE_ALIGN_256_BYTES;
> >> +		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_128_BYTES:
> >> +			return DMAENGINE_ALIGN_128_BYTES;
> >> +		case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES:
> >> +		fallthrough;
> >> +		default:
> >> +			return DMAENGINE_ALIGN_64_BYTES;
> > 
> > > ah, we are supposed to have case at the same indent as switch, please
> > > run checkpatch to have these flagged
> 
> Yes, they should be.
> 
> The other me did a sloppy job for sure, this should have been screaming
> even without checkpatch...
> This has been done in a rush during the last days to close on the
> backlog item which got the most votes.

no worries, that is where reviews help :)
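
For the record, the layout checkpatch expects puts the case labels at the
same indent as the switch. A standalone sketch of the mapping is below. The
enum names and the burst-size values are placeholders made up for this
example, not the TI SCI or dmaengine definitions. The alignment values are
assumed to be log2 of the byte count, which is how I read the dmaengine
alignment enum.

```c
#include <assert.h>

/* Placeholder burst-size codes, not the real TI SCI values. */
enum burst_size {
	BURST_64_BYTES = 1,
	BURST_128_BYTES,
	BURST_256_BYTES,
};

/* Placeholder alignment codes, assumed to be log2 of the byte count. */
enum copy_align {
	ALIGN_64_BYTES = 6,
	ALIGN_128_BYTES = 7,
	ALIGN_256_BYTES = 8,
};

static enum copy_align burst_to_align(enum burst_size burst)
{
	switch (burst) {
	case BURST_256_BYTES:
		return ALIGN_256_BYTES;
	case BURST_128_BYTES:
		return ALIGN_128_BYTES;
	case BURST_64_BYTES:
	default:
		return ALIGN_64_BYTES;
	}
}

/* Convert a log2-style alignment code to a byte count. */
static unsigned int align_to_bytes(enum copy_align align)
{
	assert((unsigned int)align < 32);
	return 1u << align;
}
```

Adjacent case labels with nothing between them, as with the 64-byte case
and default here, do not need a fallthrough annotation.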

-- 
~Vinod
