Message-ID: <679d6810-9e76-425c-9d4e-d4b372928cc3@linux.dev>
Date: Tue, 27 May 2025 12:16:33 -0400
From: Sean Anderson <sean.anderson@...ux.dev>
To: Suraj Gupta <suraj.gupta2@....com>, andrew+netdev@...n.ch,
 davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, vkoul@...nel.org, michal.simek@....com,
 radhey.shyam.pandey@....com, horms@...nel.org
Cc: netdev@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 linux-kernel@...r.kernel.org, git@....com, harini.katakam@....com
Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report
 coalesce parameters in DMAengine flow

On 5/25/25 06:22, Suraj Gupta wrote:
> Add support to configure and report the interrupt coalesce count and delay
> via ethtool in the DMAengine flow.
> Netperf numbers are poor with the non-DMAengine default values, so tune
> the coalesce count and delay and define separate default values for the
> DMAengine flow.
> 
> Netperf numbers and CPU utilisation in the DMAengine flow before and
> after introducing coalescing with the default coalesce parameters:
>
> Transfer type, CPU utilisation%   Before (w/o coalescing)   After (with coalescing)
> TCP Tx, CPU utilisation%          925, 27                   941, 22
> TCP Rx, CPU utilisation%          607, 32                   741, 36
> UDP Tx, CPU utilisation%          857, 31                   960, 28
> UDP Rx, CPU utilisation%          762, 26                   783, 18
> 
> Above numbers are observed with 4x Cortex-a53.
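
To make the quoted table easier to compare, here is a minimal sketch that computes the relative change of each column. The struct, the array, and the function names are hypothetical helpers for illustration only and are not part of the patch; the values are copied from the table, and the units of the netperf figure are not stated in the message.

```c
#include <stdio.h>

/* One row of the quoted table: netperf figure and CPU utilisation (%). */
struct netperf_row {
	const char *name;
	double before, before_cpu;
	double after, after_cpu;
};

/* Values copied from the quoted table; netperf units are not stated there. */
static const struct netperf_row rows[] = {
	{ "TCP Tx", 925, 27, 941, 22 },
	{ "TCP Rx", 607, 32, 741, 36 },
	{ "UDP Tx", 857, 31, 960, 28 },
	{ "UDP Rx", 762, 26, 783, 18 },
};

/* Relative change from before to after, in percent. */
double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print the relative change of both columns for every row. */
void print_changes(void)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%s: netperf %+.1f%%, CPU %+.1f%%\n", rows[i].name,
		       pct_change(rows[i].before, rows[i].after),
		       pct_change(rows[i].before_cpu, rows[i].after_cpu));
}
```

Calling print_changes() from a main() shows, for example, that the TCP Rx netperf figure rises by roughly 22% while its CPU utilisation goes from 32% to 36%.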

How does this affect latency? I would expect these RX settings to
increase latency by around 5-10x. I only use settings like these with DIM,
since it will disable coalescing during periods of light load for better
latency.

(Of course, the way to fix this in general is RSS or some other method
involving multiple queues.)

> Signed-off-by: Suraj Gupta <suraj.gupta2@....com>
> ---
> This patch depends on the following AXI DMA dmaengine driver changes, sent
> to the dmaengine mailing list as a prerequisite series:
> https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.com/ 
> ---
>  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
>  .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> index 5ff742103beb..cdf6cbb6f2fd 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> @@ -126,6 +126,12 @@
>  #define XAXIDMA_DFT_TX_USEC		50
>  #define XAXIDMA_DFT_RX_USEC		16
>  
> +/* Default TX/RX Threshold and delay timer values for SGDMA mode with DMAEngine */
> +#define XAXIDMAENGINE_DFT_TX_THRESHOLD	16
> +#define XAXIDMAENGINE_DFT_TX_USEC	5
> +#define XAXIDMAENGINE_DFT_RX_THRESHOLD	24
> +#define XAXIDMAENGINE_DFT_RX_USEC	16
> +
>  #define XAXIDMA_BD_CTRL_TXSOF_MASK	0x08000000 /* First tx packet */
>  #define XAXIDMA_BD_CTRL_TXEOF_MASK	0x04000000 /* Last tx packet */
>  #define XAXIDMA_BD_CTRL_ALL_MASK	0x0C000000 /* All control bits */
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> index 1b7a653c1f4e..f9c7d90d4ecb 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  {
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	struct skbuf_dma_descriptor *skbuf_dma;
> +	struct dma_slave_config tx_config, rx_config;
>  	int i, ret;
>  
>  	lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0");
> @@ -1520,6 +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  		goto err_dma_release_tx;
>  	}
>  
> +	tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> +	tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> +	rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> +	rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;

I think it would be clearer to just do something like

	struct dma_slave_config tx_config = {
		.coalesce_cnt = 16,
		.coalesce_usecs = 5,
	};

since these are only used once. And this ensures that you initialize the
whole struct.
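
As a standalone illustration of the "whole struct" point: in C, members that a designated initializer does not name are zero-initialized, whereas a plain stack variable with only some members assigned leaves the rest indeterminate. The sketch below uses a hypothetical stand-in struct (demo_config), not the real struct dma_slave_config, and the extra direction member is an assumption made only to show an unnamed field.

```c
#include <stdint.h>

/* Hypothetical stand-in; NOT the kernel's struct dma_slave_config. */
struct demo_config {
	uint32_t direction;      /* a member the caller never names */
	uint32_t coalesce_cnt;
	uint32_t coalesce_usecs;
};

/*
 * Designated initializer: coalesce_cnt and coalesce_usecs get the given
 * values, and every member that is not named (here, direction) is
 * zero-initialized by the language.
 */
struct demo_config demo_tx_defaults(void)
{
	struct demo_config cfg = {
		.coalesce_cnt = 16,
		.coalesce_usecs = 5,
	};

	return cfg;
}
```

With this form, a callee that inspects other members sees zeros instead of whatever happened to be on the stack.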

But what tree are you using? I don't see these members on net-next or
dmaengine.

> +	ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +	ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +
>  	lp->tx_ring_tail = 0;
>  	lp->tx_ring_head = 0;
>  	lp->rx_ring_tail = 0;
> @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device *ndev,
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	u32 cr;
>  
> +	if (lp->use_dmaengine) {
> +		struct dma_slave_caps tx_caps, rx_caps;
> +
> +		dma_get_slave_caps(lp->tx_chan, &tx_caps);
> +		dma_get_slave_caps(lp->rx_chan, &rx_caps);
> +
> +		ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> +		ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> +		ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> +		ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> +		return 0;
> +	}
> +
>  	ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
>  
>  	spin_lock_irq(&lp->rx_cr_lock);
> @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
>  		return -EINVAL;
>  	}
>  
> +	if (lp->use_dmaengine)	{
> +		struct dma_slave_config tx_cfg, rx_cfg;
> +		int ret;
> +
> +		tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> +		tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> +		rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> +		rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> +
> +		ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
> +			return ret;
> +		}
> +
> +		ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set rx coalesce parameters");
> +			return ret;
> +		}
> +		return 0;
> +	}
> +
>  	if (new_dim && !old_dim) {
>  		cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
>  				     ecoalesce->rx_coalesce_usecs);
