Message-ID: <20241011144204.4gpywu2i2ygyk26v@skbuf>
Date: Fri, 11 Oct 2024 17:42:04 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: Wei Fang <wei.fang@....com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, claudiu.manoil@....com, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
bpf@...r.kernel.org, stable@...r.kernel.org, imx@...ts.linux.dev,
rkannoth@...vell.com, maciej.fijalkowski@...el.com,
sbhatta@...vell.com
Subject: Re: [PATCH v4 net 4/4] net: enetc: disable NAPI after all rings are
disabled
On Thu, Oct 10, 2024 at 05:20:56PM +0800, Wei Fang wrote:
> When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
> on LS1028A, it was found that if the command was re-run multiple times,
> Rx could not receive the frames, and the result of xdp-bench showed
> that the rx rate was 0.
>
> root@...028ardb:~# ./xdp-bench tx eno0
> Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
> Summary 2046 rx/s 0 err,drop/s
> Summary 0 rx/s 0 err,drop/s
> Summary 0 rx/s 0 err,drop/s
> Summary 0 rx/s 0 err,drop/s
>
> By observing the Rx PIR and CIR registers, CIR is always 0x7FF and
> PIR is always 0x7FE, which means that the Rx ring is full and can no
> longer accommodate other Rx frames. Therefore, the problem is caused
> by the Rx BD ring not being cleaned up.
>
> Further analysis of the code revealed that the Rx BD ring will only
> be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
> Therefore, some debug logs were added to the driver and the current
> values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
> BD ring was full. The logs are as follows.
>
> [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
> [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
> [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
>
> From the results, the max value of xdp_tx_in_flight has reached 2140.
> However, the size of the Rx BD ring is only 2048. So xdp_tx_in_flight
> did not drop to 0 after enetc_stop() was called, and the driver does
> not clear it. The root cause is that NAPI is disabled too aggressively,
> without having waited for the pending XDP_TX frames to be transmitted,
> and their buffers recycled, so that xdp_tx_in_flight cannot naturally
> drop to 0. Later, enetc_free_tx_ring() does free those stale, unsent
> XDP_TX packets, but it is not coded up to also reset xdp_tx_in_flight,
> hence the manifestation of the bug.
>
> One option would be to cover this extra condition in enetc_free_tx_ring(),
> but now that ENETC_TX_DOWN exists, we have created a window at
> the beginning of enetc_stop() where NAPI can still be scheduled, but
> any concurrent enqueue will be blocked. Therefore, enetc_wait_bdrs()
> and enetc_disable_tx_bdrs() can be called with NAPI still scheduled,
> and it is guaranteed that this will not wait indefinitely, but will
> instead give us an indication that the pending TX frames have dropped
> to zero in an orderly fashion. Only then should we call napi_disable().
>
> This way, enetc_free_tx_ring() becomes entirely redundant and can be
> dropped as part of subsequent cleanup.
>
> The change also refactors enetc_start() so that it looks like the
> mirror opposite procedure of enetc_stop().
>
> Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing")
> Cc: stable@...r.kernel.org
> Signed-off-by: Wei Fang <wei.fang@....com>
> ---
> v2 changes:
> 1. Modify the title and rephrase the commit message.
> 2. Use the new solution as described in the title
> v3: no changes.
> v4 changes:
> 1. Modify the title and rephrase the commit message.
> ---
Reviewed-by: Vladimir Oltean <vladimir.oltean@....com>
Tested-by: Vladimir Oltean <vladimir.oltean@....com>