Message-ID: <acc5fe70-ecb4-404c-9439-ff3181118983@molgen.mpg.de>
Date: Fri, 11 Jul 2025 23:14:50 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Brian Vazquez <brianvv@...gle.com>
Cc: Joshua A Hay <joshua.a.hay@...el.com>, intel-wired-lan@...ts.osuosl.org,
netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net 0/5] idpf: replace Tx flow
scheduling buffer ring with buffer pool
Dear Brian,
Thank you for your reply.
On 07.07.25 at 16:43, Brian Vazquez wrote:
> On Mon, Jun 30, 2025 at 06:22:11PM +0200, Paul Menzel wrote:
>> On 30.06.25 at 18:08, Hay, Joshua A wrote:
>>
>>>> On 25.06.25 at 18:11, Joshua Hay wrote:
>>>>> This series fixes a stability issue in the flow scheduling Tx send/clean
>>>>> path that results in a Tx timeout.
>>>>>
>>>>> The existing guardrails in the Tx path were not sufficient to prevent
>>>>> the driver from reusing completion tags that were still in flight (held
>>>>> by the HW). This collision would cause the driver to erroneously clean
>>>>> the wrong packet thus leaving the descriptor ring in a bad state.
>>>>>
>>>>> The main point of this refactor is replace the flow scheduling buffer
>>>>
>>>> … to replace …?
>>>
>>> Thanks, will fix in v2
>>>
>>>>> ring with a large pool/array of buffers. The completion tag then simply
>>>>> is the index into this array. The driver tracks the free tags and pulls
>>>>> the next free one from a refillq. The cleaning routines simply use the
>>>>> completion tag from the completion descriptor to index into the array to
>>>>> quickly find the buffers to clean.
>>>>>
>>>>> All of the code to support the refactor is added first to ensure traffic
>>>>> still passes with each patch. The final patch then removes all of the
>>>>> obsolete stashing code.
>>>>
>>>> Do you have reproducers for the issue?
>>>
>>> This issue cannot be reproduced without the customer-specific device
>>> configuration, but it can impact any traffic once in place.
>>
>> Interesting. Then it’d be great if you could describe that setup in more
>> detail.
> The hardware can process packets and return completions out of order;
> this depends on HW configuration that is difficult to replicate.
>
> To match completions with packets, each packet with pending completions
> must be associated with a unique ID. The previous code would occasionally
> reassign the same ID to multiple pending packets, resulting in
> resource leaks and, eventually, panics.
Thank you for describing the problem again. Too bad it’s not easily
reproducible.
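
Just to check my understanding of the failure mode, here is a tiny,
made-up userspace model (not idpf code, all names invented): tags derived
from a free-running ring index can be handed out again while the hardware
still holds an earlier packet with the same tag, because completions come
back out of order:

#include <stdbool.h>
#include <stdio.h>

#define NUM_TAGS 4	/* deliberately tiny to force a wrap */

static bool tag_in_flight[NUM_TAGS];

int main(void)
{
	unsigned int next = 0;

	for (int pkt = 0; pkt < 6; pkt++) {
		unsigned int tag = next++ % NUM_TAGS;

		if (tag_in_flight[tag])
			printf("packet %d reuses tag %u still held by HW\n",
			       pkt, tag);
		tag_in_flight[tag] = true;

		/* Pretend only even-numbered packets complete promptly. */
		if (!(pkt & 1))
			tag_in_flight[tag] = false;
	}

	return 0;
}

Is that roughly the collision the series is guarding against?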
> The new code uses a much simpler data structure to assign IDs that
> is immune to duplicate assignment, and also much more efficient at
> runtime.
Maybe that could be added to the commit message too. How can the
efficiency claim be verified?
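
For my own notes, this is the scheme I took away from the cover letter,
written down as a rough userspace sketch (all names are invented, not the
actual driver code): the completion tag is just an index into a buffer
array, and free tags are recycled through a refill queue, so a tag can
only be handed out again after its completion has been seen, and
allocation is constant time:

#include <stdint.h>
#include <stdlib.h>

struct tx_buf {
	void *skb;		/* stand-in for the real per-packet state */
};

struct tx_pool {
	struct tx_buf *bufs;	/* buffer pool indexed by completion tag */
	uint16_t *refillq;	/* ring of currently free tags */
	uint16_t size;
	uint16_t get, put;	/* consumer/producer index into refillq */
	uint16_t nfree;		/* number of free tags left */
};

static int pool_init(struct tx_pool *p, uint16_t size)
{
	p->bufs = calloc(size, sizeof(*p->bufs));
	p->refillq = malloc(size * sizeof(*p->refillq));
	if (!p->bufs || !p->refillq)
		return -1;
	p->size = size;
	p->get = p->put = 0;
	p->nfree = size;
	for (uint16_t i = 0; i < size; i++)	/* initially every tag is free */
		p->refillq[i] = i;
	return 0;
}

/* Send path: pull the next free tag; if none is left, Tx must stop. */
static int pool_get_tag(struct tx_pool *p, uint16_t *tag)
{
	if (!p->nfree)
		return -1;
	*tag = p->refillq[p->get];
	p->get = (p->get + 1) % p->size;
	p->nfree--;
	return 0;
}

/* Clean path: the completion tag indexes the pool directly; only after
 * the completion is seen does the tag go back on the refill queue. */
static struct tx_buf *pool_complete(struct tx_pool *p, uint16_t tag)
{
	p->refillq[p->put] = tag;
	p->put = (p->put + 1) % p->size;
	p->nfree++;
	return &p->bufs[tag];
}

If that is roughly right, the constant-time get/put compared to the old
stashing code would explain the runtime claim, but a number from a
benchmark in the commit message would still be nice.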
>>>>> Joshua Hay (5):
>>>>> idpf: add support for Tx refillqs in flow scheduling mode
>>>>> idpf: improve when to set RE bit logic
>>>>> idpf: replace flow scheduling buffer ring with buffer pool
>>>>> idpf: stop Tx if there are insufficient buffer resources
>>>>> idpf: remove obsolete stashing code
>>>>>
>>>>> .../ethernet/intel/idpf/idpf_singleq_txrx.c | 6 +-
>>>>> drivers/net/ethernet/intel/idpf/idpf_txrx.c | 626 ++++++------------
>>>>> drivers/net/ethernet/intel/idpf/idpf_txrx.h | 76 +--
>>>>> 3 files changed, 239 insertions(+), 469 deletions(-)
Kind regards,
Paul