Message-ID: <adfdeee3-b200-4700-800f-c71a0b82697f@intel.com>
Date: Fri, 27 Jun 2025 15:13:37 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Joshua Hay <joshua.a.hay@...el.com>
CC: <intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>, "Madhu
 Chittim" <madhu.chittim@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH net 1/5] idpf: add support for Tx
 refillqs in flow scheduling mode

From: Joshua Hay <joshua.a.hay@...el.com>
Date: Wed, 25 Jun 2025 09:11:52 -0700

> This is the start of a 5 patch series intended to fix a stability issue
> in the flow scheduling Tx send/clean path that results in a Tx timeout.

No need to mention "series", "start", "patch" in commit messages.

> 
> In certain production environments, it is possible for completion tags
> to collide, meaning N packets with the same completion tag are in flight
> at the same time. In this environment, any given Tx queue is effectively
> used to send both slower traffic and higher throughput traffic
> simultaneously. This is the result of a customer's specific
> configuration in the device pipeline, the details of which Intel cannot
> provide. This configuration results in a small number of out-of-order
> completions, i.e., a small number of packets in flight. The existing
> guardrails in the driver only protect against a large number of packets
> in flight. The slower flow completions are delayed, which causes the
> out-of-order completions. Meanwhile, the fast flow exhausts the pool of
> unique tags and starts reusing tags. The next packet in the fast flow
> uses the same tag for a packet that is still in flight from the slower
> flow. The driver has no idea which packet it should clean when it
> processes the completion with that tag, but it will look for the packet on
> the buffer ring before the hash table.  If the slower flow packet
> completion is processed first, it will end up cleaning the fast flow
> packet on the ring prematurely. This leaves the descriptor ring in a bad
> state resulting in a Tx timeout.
> 
> This series refactors the Tx buffer management by replacing the stashing

Same.

> mechanisms and the tag generation with a large pool/array of unique
> tags. The completion tags are now simply used to index into the pool of
> Tx buffers. This implicitly prevents any tag from being reused while
> it's in flight.
> 
> First, we need a new mechanism for the send path to know what tag to use
> next. The driver will allocate and initialize a refillq for each TxQ
> with all of the possible free tag values. During send, the driver grabs
> the next free tag from the refillq at next_to_clean. While cleaning
> the packet, the clean routine posts the tag back to the refillq's
> next_to_use to indicate that it is now free to use.
> 
> This mechanism works exactly the same way as the existing Rx refill
> queues, which post the cleaned buffer IDs back to the buffer queue to be
> reposted to HW. Since we're using the refillqs for both Rx and Tx now,
> genericize some of the existing refillq support.
> 
> Note: the refillqs will not be used yet. This is only demonstrating how
> they will be used to pass free tags back to the send path.
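
Just to make sure I'm reading the intended flow correctly: the send
side would then be the mirror of the existing Rx refillq consumer,
roughly like below. Untested sketch -- the helper name and the use of
RFL_GEN_CHK on the consumer side are my assumptions based on the Rx
path, not taken from this patch:

/* send path: pop the next free tag from the refillq (sketch) */
static bool idpf_tx_get_free_buf_id(struct idpf_sw_queue *refillq,
				    u32 *buf_id)
{
	u32 ntc = refillq->next_to_clean;
	u32 refill_desc = refillq->ring[ntc];

	/* GEN mismatch means the clean path hasn't posted a free tag
	 * here yet, i.e. all the tags are currently in flight
	 */
	if (unlikely(!!(refill_desc & IDPF_RFL_BI_GEN_M) !=
		     idpf_queue_has(RFL_GEN_CHK, refillq)))
		return false;

	*buf_id = FIELD_GET(IDPF_RFL_BI_BUFID_M, refill_desc);

	if (unlikely(++ntc == refillq->desc_count)) {
		ntc = 0;
		idpf_queue_change(RFL_GEN_CHK, refillq);
	}

	refillq->next_to_clean = ntc;

	return true;
}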

[...]

> @@ -267,6 +270,31 @@ static int idpf_tx_desc_alloc(const struct idpf_vport *vport,
>  	tx_q->next_to_clean = 0;
>  	idpf_queue_set(GEN_CHK, tx_q);
>  
> +	if (idpf_queue_has(FLOW_SCH_EN, tx_q)) {

	if (!idpf_queue_has(FLOW_SCH_EN, tx_q))
		return 0;
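
i.e. flip the condition and return early, so the rest of the init
doesn't need an extra indentation level. Untested, with @refillq
declared at the top of the function:

	if (!idpf_queue_has(FLOW_SCH_EN, tx_q))
		return 0;

	refillq = tx_q->refillq;
	refillq->desc_count = tx_q->desc_count;

	refillq->ring = kcalloc(refillq->desc_count, sizeof(u32),
				GFP_KERNEL);
	if (!refillq->ring) {
		err = -ENOMEM;
		goto err_alloc;
	}

	for (u32 i = 0; i < refillq->desc_count; i++)
		refillq->ring[i] =
			FIELD_PREP(IDPF_RFL_BI_BUFID_M, i) |
			FIELD_PREP(IDPF_RFL_BI_GEN_M,
				   idpf_queue_has(GEN_CHK, refillq));

	/* the ring starts completely filled, which counts as one wrap */
	idpf_queue_change(GEN_CHK, refillq);

	return 0;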

> +		struct idpf_sw_queue *refillq = tx_q->refillq;
> +
> +		refillq->desc_count = tx_q->desc_count;
> +
> +		refillq->ring = kcalloc(refillq->desc_count, sizeof(u32),
> +					GFP_KERNEL);
> +		if (!refillq->ring) {
> +			err = -ENOMEM;
> +			goto err_alloc;
> +		}
> +
> +		for (u32 i = 0; i < refillq->desc_count; i++)
> +			refillq->ring[i] =
> +				FIELD_PREP(IDPF_RFL_BI_BUFID_M, i) |
> +				FIELD_PREP(IDPF_RFL_BI_GEN_M,
> +					   idpf_queue_has(GEN_CHK, refillq));
> +
> +		/*
> +		 * Go ahead and flip the GEN bit since this counts as filling
> +		 * up the ring, i.e. we already ring wrapped.
> +		 */
> +		idpf_queue_change(GEN_CHK, refillq);
> +	}
> +
>  	return 0;
>  
>  err_alloc:
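
The GEN flip itself looks correct to me: since the prefill already
counts as one full lap, the clean path will post freed tags with the
new GEN value, which is exactly what the send side expects to see
after it wraps. Sketch of the producer side, assuming the genericized
helper keeps the shape of the existing idpf_rx_post_buf_refill():

/* clean path: post a freed tag back to the refillq (sketch) */
static void idpf_post_buf_refill(struct idpf_sw_queue *refillq, u16 buf_id)
{
	u32 ntu = refillq->next_to_use;

	/* store the tag together with the current SW-maintained GEN bit */
	refillq->ring[ntu] =
		FIELD_PREP(IDPF_RFL_BI_BUFID_M, buf_id) |
		FIELD_PREP(IDPF_RFL_BI_GEN_M,
			   idpf_queue_has(GEN_CHK, refillq));

	if (unlikely(++ntu == refillq->desc_count)) {
		ntu = 0;
		idpf_queue_change(GEN_CHK, refillq);
	}

	refillq->next_to_use = ntu;
}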

Thanks,
Olek
