Message-ID: <20240429183034.GG516117@kernel.org>
Date: Mon, 29 Apr 2024 19:30:34 +0100
From: Simon Horman <horms@...nel.org>
To: MD Danish Anwar <danishanwar@...com>
Cc: Dan Carpenter <dan.carpenter@...aro.org>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Andrew Lunn <andrew@...n.ch>, Jan Kiszka <jan.kiszka@...mens.com>,
	Diogo Ivo <diogo.ivo@...mens.com>, Paolo Abeni <pabeni@...hat.com>,
	Jakub Kicinski <kuba@...nel.org>,
	Eric Dumazet <edumazet@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, srk@...com,
	Vignesh Raghavendra <vigneshr@...com>, r-gunasekaran@...com,
	Roger Quadros <rogerq@...nel.org>
Subject: Re: [PATCH net-next v2] net: ti: icssg_prueth: Add SW TX / RX
 Coalescing based on hrtimers

On Mon, Apr 29, 2024 at 12:45:01PM +0530, MD Danish Anwar wrote:
> Add SW IRQ coalescing based on hrtimers for RX and TX data path for ICSSG
> driver, which can be enabled by ethtool commands:
> 
> - RX coalescing
>   ethtool -C eth1 rx-usecs 50
> 
> - TX coalescing can be enabled per TX queue
> 
>   - by default enables coalesing for TX0

nit: coalescing

Please consider running patches through ./scripts/checkpatch.pl --codespell

>   ethtool -C eth1 tx-usecs 50
>   - configure TX0
>   ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100
>   - configure TX1
>   ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100
>   - configure TX0 and TX1
>   ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce
> tx-usecs 100
> 
> Minimum value for both rx-usecs and tx-usecs is 20us.
> 
> Compared to gro_flush_timeout and napi_defer_hard_irqs this patch allows
> to enable IRQ coalescing for RX path separately.
> 
> Benchmarking numbers:
>  ===============================================================
> | Method                  | Tput_TX | CPU_TX | Tput_RX | CPU_RX |
> | ==============================================================
> | Default Driver           943 Mbps    31%      517 Mbps  38%   |
> | IRQ Coalescing (Patch)   943 Mbps    28%      518 Mbps  25%   |
>  ===============================================================
> 
> Signed-off-by: MD Danish Anwar <danishanwar@...com>
> ---
> Changes from v1 [1] to v2:
> *) Added Benchmarking numbers in the commit message as suggested by
>    Andrew Lunn <andrew@...n.ch>. Full logs [2]
> *) Addressed comments given by Simon Horman <horms@...nel.org> in v1.

Sorry to be bothersome, but the W=1 problem isn't entirely fixed.

> 
> [1] https://lore.kernel.org/all/20240424091823.1814136-1-danishanwar@ti.com/
> 
> [2] https://gist.githubusercontent.com/danish-ti/47855631be9f3635cee994693662a988/raw/94b4eb86b42fe243ab03186a88a314e0cb272fd0/gistfile1.txt

...

> diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c

...

> @@ -190,19 +191,37 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
>  	return num_tx;
>  }
>  
> +static enum hrtimer_restart emac_tx_timer_callback(struct hrtimer *timer)
> +{
> +	struct prueth_tx_chn *tx_chns =
> +			container_of(timer, struct prueth_tx_chn, tx_hrtimer);
> +
> +	enable_irq(tx_chns->irq);
> +	return HRTIMER_NORESTART;
> +}
> +
>  static int emac_napi_tx_poll(struct napi_struct *napi_tx, int budget)
>  {
>  	struct prueth_tx_chn *tx_chn = prueth_napi_to_tx_chn(napi_tx);
>  	struct prueth_emac *emac = tx_chn->emac;
> +	bool tdown = false;
>  	int num_tx_packets;
>  
> -	num_tx_packets = emac_tx_complete_packets(emac, tx_chn->id, budget);
> +	num_tx_packets = emac_tx_complete_packets(emac, tx_chn->id, budget,
> +						  &tdown);
>  
>  	if (num_tx_packets >= budget)
>  		return budget;
>  
> -	if (napi_complete_done(napi_tx, num_tx_packets))
> -		enable_irq(tx_chn->irq);
> +	if (napi_complete_done(napi_tx, num_tx_packets)) {
> +		if (unlikely(tx_chn->tx_pace_timeout_ns && !tdown)) {
> +			hrtimer_start(&tx_chn->tx_hrtimer,
> +				      ns_to_ktime(tx_chn->tx_pace_timeout_ns),
> +				      HRTIMER_MODE_REL_PINNED);
> +		} else {
> +			enable_irq(tx_chn->irq);
> +		}

This hunk compiles cleanly with W=1 using gcc-13 and clang-18
(although the inner {} are unnecessary).

> +	}
>  
>  	return num_tx_packets;
>  }

...

> @@ -872,7 +894,13 @@ int emac_napi_rx_poll(struct napi_struct *napi_rx, int budget)
>  	}
>  
>  	if (num_rx < budget && napi_complete_done(napi_rx, num_rx))
> -		enable_irq(emac->rx_chns.irq[rx_flow]);
> +		if (unlikely(emac->rx_pace_timeout_ns)) {
> +			hrtimer_start(&emac->rx_hrtimer,
> +				      ns_to_ktime(emac->rx_pace_timeout_ns),
> +				      HRTIMER_MODE_REL_PINNED);
> +		} else {
> +			enable_irq(emac->rx_chns.irq[rx_flow]);
> +		}

But this hunk does not; I think the outer if needs {} (the inner {} are
not needed).

FWIW, I believe this doesn't show up in the netdev automated testing
because this driver isn't built for x86 allmodconfig.
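
To illustrate the brace placement suggested for the RX hunk, here is a
minimal user-space sketch, not the driver code. struct rx_state,
start_timer(), reenable_irq() and rx_poll_tail() are hypothetical
stand-ins for the emac state, hrtimer_start(), enable_irq() and the tail
of emac_napi_rx_poll(); the 'completed' argument stands in for the
return value of napi_complete_done().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state touched at the end of RX poll. */
struct rx_state {
	unsigned long rx_pace_timeout_ns;	/* 0: coalescing disabled */
	int timer_starts;			/* counts stand-in hrtimer_start() calls */
	int irq_enables;			/* counts stand-in enable_irq() calls */
};

/* Stand-in for hrtimer_start(): only records that it was called. */
static void start_timer(struct rx_state *s)
{
	s->timer_starts++;
}

/* Stand-in for enable_irq(): only records that it was called. */
static void reenable_irq(struct rx_state *s)
{
	s->irq_enables++;
}

/*
 * Braces on the outer if make it explicit which if the else belongs to.
 * The inner branches are single statements, so they need no braces.
 */
static int rx_poll_tail(struct rx_state *s, int num_rx, int budget,
			bool completed)
{
	if (num_rx < budget && completed) {
		if (s->rx_pace_timeout_ns)
			start_timer(s);
		else
			reenable_irq(s);
	}

	return num_rx;
}
```

With the outer braces in place the else can only bind to the inner if,
which is the nesting the original hunk intends.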

>  
>  	return num_rx;
>  }

...

-- 
pw-bot: changes-requested
