Message-ID: <acc5fe70-ecb4-404c-9439-ff3181118983@molgen.mpg.de>
Date: Fri, 11 Jul 2025 23:14:50 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Brian Vazquez <brianvv@...gle.com>
Cc: Joshua A Hay <joshua.a.hay@...el.com>, intel-wired-lan@...ts.osuosl.org,
 netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net 0/5] idpf: replace Tx flow
 scheduling buffer ring with buffer pool

Dear Brian,


Thank you for your reply.

Am 07.07.25 um 16:43 schrieb Brian Vazquez:
> On Mon, Jun 30, 2025 at 06:22:11PM +0200, Paul Menzel wrote:

>> Am 30.06.25 um 18:08 schrieb Hay, Joshua A:
>>
>>>> Am 25.06.25 um 18:11 schrieb Joshua Hay:
>>>>> This series fixes a stability issue in the flow scheduling Tx send/clean
>>>>> path that results in a Tx timeout.
>>>>>
>>>>> The existing guardrails in the Tx path were not sufficient to prevent
>>>>> the driver from reusing completion tags that were still in flight (held
>>>>> by the HW).  This collision would cause the driver to erroneously clean
>>>>> the wrong packet thus leaving the descriptor ring in a bad state.
>>>>>
>>>>> The main point of this refactor is replace the flow scheduling buffer
>>>>
>>>> … to replace …?
>>>
>>> Thanks, will fix in v2
>>>
>>>>> ring with a large pool/array of buffers.  The completion tag then simply
>>>>> is the index into this array.  The driver tracks the free tags and pulls
>>>>> the next free one from a refillq.  The cleaning routines simply use the
>>>>> completion tag from the completion descriptor to index into the array to
>>>>> quickly find the buffers to clean.
>>>>>
>>>>> All of the code to support the refactor is added first to ensure traffic
>>>>> still passes with each patch.  The final patch then removes all of the
>>>>> obsolete stashing code.
>>>>
>>>> Do you have reproducers for the issue?
>>>
>>> This issue cannot be reproduced without the customer-specific device
>>> configuration, but it can impact any traffic once in place.
>>
>> Interesting. Then it’d be great if you could describe that setup in more
>> detail.

> The hardware can process packets and return completions out of order;
> this depends on HW configuration that is difficult to replicate.
> 
> To match completions with packets, each packet with pending completions
> must be associated with a unique ID.  The previous code would occasionally
> reassign the same ID to multiple pending packets, resulting in
> resource leaks and eventually panics.

Thank you for describing the problem again. Too bad it’s not easily 
reproducible.
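
Just to check I understand the scheme: a tag allocator along those lines 
(sketched below with made-up names, so not the actual idpf code) would 
keep every free tag in a refill queue, and a tag could only be handed 
out again after its completion has been processed:

/* Hypothetical sketch of a completion-tag pool with a refill queue of
 * free tags.  A tag leaves the queue when a packet is sent and is only
 * pushed back once its completion is seen, so the same tag can never
 * be assigned to two in-flight packets.
 */
#include <stdbool.h>
#include <stdint.h>

#define TAG_POOL_SIZE 1024			/* made-up pool size */

struct tag_refillq {
	uint16_t tags[TAG_POOL_SIZE];		/* free completion tags */
	uint16_t head;				/* next free tag to hand out */
	uint16_t tail;				/* where completed tags go back */
	uint16_t count;				/* number of free tags */
};

/* Fill the queue with every tag once at initialisation time. */
static void tag_refillq_init(struct tag_refillq *q)
{
	for (uint16_t i = 0; i < TAG_POOL_SIZE; i++)
		q->tags[i] = i;
	q->head = 0;
	q->tail = 0;
	q->count = TAG_POOL_SIZE;
}

/* Grab a free tag for a packet about to be sent; false if exhausted. */
static bool tag_get(struct tag_refillq *q, uint16_t *tag)
{
	if (!q->count)
		return false;			/* all tags in flight: stop the queue */
	*tag = q->tags[q->head];
	q->head = (q->head + 1) % TAG_POOL_SIZE;
	q->count--;
	return true;
}

/* Return a tag once its completion descriptor has been processed. */
static void tag_put(struct tag_refillq *q, uint16_t tag)
{
	q->tags[q->tail] = tag;
	q->tail = (q->tail + 1) % TAG_POOL_SIZE;
	q->count++;
}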

> The new code uses a much simpler data structure to assign IDs that
> is immune to duplicate assignment, and is also much more efficient at
> runtime.

Maybe that could be added to the commit message too. How can the 
efficiency claim be verified?
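
And if I read the cover letter correctly, the clean path then becomes a 
plain array lookup by completion tag, roughly like this (again a 
hypothetical sketch continuing the one above, not the real driver code):

/* Buffers live in one large array; the completion tag reported by the
 * hardware is simply the index into that array, so finding the buffers
 * to clean is a constant-time lookup instead of a search.
 */
struct tx_buf {
	void *skb;				/* placeholder for per-packet state */
};

static struct tx_buf buf_pool[TAG_POOL_SIZE];	/* one slot per completion tag */

static void clean_completion(struct tag_refillq *q, uint16_t completion_tag)
{
	struct tx_buf *buf = &buf_pool[completion_tag];	/* direct index */

	/* ... unmap and free buf->skb here ... */
	buf->skb = NULL;

	tag_put(q, completion_tag);		/* tag becomes reusable only now */
}

If that roughly matches the implementation, spelling out in the commit 
message that both tag assignment and cleaning are constant-time 
operations would already go a long way towards making the efficiency 
claim verifiable.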

>>>>> Joshua Hay (5):
>>>>>      idpf: add support for Tx refillqs in flow scheduling mode
>>>>>      idpf: improve when to set RE bit logic
>>>>>      idpf: replace flow scheduling buffer ring with buffer pool
>>>>>      idpf: stop Tx if there are insufficient buffer resources
>>>>>      idpf: remove obsolete stashing code
>>>>>
>>>>>     .../ethernet/intel/idpf/idpf_singleq_txrx.c   |   6 +-
>>>>>     drivers/net/ethernet/intel/idpf/idpf_txrx.c   | 626 ++++++------------
>>>>>     drivers/net/ethernet/intel/idpf/idpf_txrx.h   |  76 +--
>>>>>     3 files changed, 239 insertions(+), 469 deletions(-)

Kind regards,

Paul
