Message-ID: <SJ1PR11MB62978F861DC2E4B7F56E1E899B2BA@SJ1PR11MB6297.namprd11.prod.outlook.com>
Date: Tue, 12 Aug 2025 16:45:25 +0000
From: "Salin, Samuel" <samuel.salin@...el.com>
To: "Hay, Joshua A" <joshua.a.hay@...el.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Hay, Joshua A"
<joshua.a.hay@...el.com>, "Chittim, Madhu" <madhu.chittim@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-net v3 1/6] idpf: add support for Tx
refillqs in flow scheduling mode
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf Of
> Joshua Hay
> Sent: Friday, July 25, 2025 11:42 AM
> To: intel-wired-lan@...ts.osuosl.org
> Cc: netdev@...r.kernel.org; Hay, Joshua A <joshua.a.hay@...el.com>;
> Chittim, Madhu <madhu.chittim@...el.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v3 1/6] idpf: add support for Tx
> refillqs in flow scheduling mode
>
> In certain production environments, it is possible for completion tags to
> collide, meaning N packets with the same completion tag are in flight at the
> same time. In this environment, any given Tx queue is effectively used to send
> both slower traffic and higher throughput traffic simultaneously. This is the
> result of a customer's specific configuration in the device pipeline, the details
> of which Intel cannot provide. This configuration results in a small number of
> out-of-order completions, i.e., a small number of packets in flight. The existing
> guardrails in the driver only protect against a large number of packets in flight.
> The slower flow completions are delayed which causes the out-of-order
> completions. The fast flow will continue sending traffic and generating tags.
> Because tags are generated on the fly, the fast flow eventually uses the same
> tag for a packet that is still in flight from the slower flow. The driver has no idea
> which packet it should clean when it processes the completion with that tag,
> but it will look for the packet on the buffer ring before the hash table. If the
> slower flow packet completion is processed first, it will end up cleaning the fast
> flow packet on the ring prematurely. This leaves the descriptor ring in a bad
> state, resulting in crashes or Tx timeouts.
>
> In summary, generating a tag when a packet is sent can lead to the same tag
> being associated with multiple packets. This can lead to resource leaks,
> crashes, and/or Tx timeouts.
>
> Before we can replace the tag generation, we need a new mechanism for the
> send path to know what tag to use next. The driver will allocate and initialize a
> refillq for each TxQ with all of the possible free tag values. During send, the
> driver grabs the next free tag from the refillq at next_to_clean. While
> cleaning the packet, the clean routine posts the tag back to the refillq's
> next_to_use to indicate that it is now free to use.
>
> This mechanism works exactly the same way as the existing Rx refill queues,
> which post the cleaned buffer IDs back to the buffer queue to be reposted to
> HW. Since we're using the refillqs for both Rx and Tx now, genericize some of
> the existing refillq support.
>
> Note: the refillqs will not be used yet. This is only demonstrating how they will
> be used to pass free tags back to the send path.
>
> Signed-off-by: Joshua Hay <joshua.a.hay@...el.com>
> Reviewed-by: Madhu Chittim <madhu.chittim@...el.com>
> ---
> v2:
> - reorder refillq init logic to reduce indentation
> - don't drop skb if get free bufid fails, increment busy counter
> - add missing unlikely
> ---
> 2.39.2
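For reference, the get/put flow described in the commit message can be sketched in user-space C roughly as below. This is only an illustration of the ring semantics (send path pops a free tag at next_to_clean, clean path posts it back at next_to_use); the struct and function names are hypothetical and do not match the driver's actual identifiers.

```c
/* Minimal user-space sketch of a Tx tag refillq; names are illustrative. */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct tag_refillq {
	uint16_t *ring;		/* holds currently-free completion tags */
	uint16_t size;		/* number of possible tag values */
	uint16_t next_to_use;	/* clean path posts freed tags here */
	uint16_t next_to_clean;	/* send path pops free tags here */
	uint16_t count;		/* tags currently available */
};

/* Initialize the refillq with every possible tag value marked free. */
static bool refillq_init(struct tag_refillq *rq, uint16_t size)
{
	rq->ring = malloc(size * sizeof(*rq->ring));
	if (!rq->ring)
		return false;
	for (uint16_t i = 0; i < size; i++)
		rq->ring[i] = i;
	rq->size = size;
	rq->next_to_use = 0;
	rq->next_to_clean = 0;
	rq->count = size;
	return true;
}

/* Send path: grab the next free tag; on failure the caller would
 * mark the queue busy rather than drop the skb. */
static bool refillq_get_tag(struct tag_refillq *rq, uint16_t *tag)
{
	if (!rq->count)
		return false;
	*tag = rq->ring[rq->next_to_clean];
	rq->next_to_clean = (rq->next_to_clean + 1) % rq->size;
	rq->count--;
	return true;
}

/* Clean path: post the completed packet's tag back for reuse. */
static void refillq_put_tag(struct tag_refillq *rq, uint16_t tag)
{
	rq->ring[rq->next_to_use] = tag;
	rq->next_to_use = (rq->next_to_use + 1) % rq->size;
	rq->count++;
}
```

Because tags come only from this pool, a tag cannot be handed to a second packet until the first packet's completion has returned it, which is what removes the collision described above.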
Tested-by: Samuel Salin <Samuel.salin@...el.com>