Message-ID: <b64dc5c7-600c-66db-d125-2d747a21c1d8@linutronix.de>
Date: Thu, 29 Jun 2023 09:07:40 +0200
From: Florian Kauer <florian.kauer@...utronix.de>
To: Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vedang Patel <vedang.patel@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jithu Joseph <jithu.joseph@...el.com>,
Andre Guedes <andre.guedes@...el.com>,
Simon Horman <simon.horman@...igine.com>
Cc: netdev@...r.kernel.org, kurt@...utronix.de,
intel-wired-lan@...ts.osuosl.org, linux-kernel@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue
with XDP ZEROCOPY

Hi Vinicius,

On 28.06.23 23:34, Vinicius Costa Gomes wrote:
> Florian Kauer <florian.kauer@...utronix.de> writes:
>
>> In normal operation, each populated queue item has
>> next_to_watch pointing to the last TX desc of the packet,
>> while each cleaned item has it set to 0. In particular,
>> next_to_use that points to the next (necessarily clean)
>> item to use has next_to_watch set to 0.
>>
>> When the TX queue is used both by an application using
>> AF_XDP with ZEROCOPY as well as a second non-XDP application
>> generating high traffic, the queue pointers can get in
>> an invalid state where next_to_use points to an item
>> where next_to_watch is NOT set to 0.
>>
>> However, the implementation assumes in several places
>> that this is never the case, so if it does happen,
>> bad things follow. First, within the loop inside
>> igc_clean_tx_irq(), next_to_clean can overtake next_to_use.
>> This eventually prevents any further transmission via
>> this queue, and it never gets unblocked or signaled.
>> Secondly, if the queue is in this garbled state,
>> the inner loop of igc_clean_tx_ring() will never terminate,
>> completely hogging a CPU core.
>>
>> The reason is that igc_xdp_xmit_zc() reads next_to_use
>> before acquiring the lock and writes it back
>> (potentially unmodified) later. If next_to_use was modified
>> by another sender before the lock was taken, the outdated
>> value is written back, pointing to an item that was already
>> used elsewhere (and whose next_to_watch was therefore set).
>>
>> Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy")
>> Signed-off-by: Florian Kauer <florian.kauer@...utronix.de>
>> Reviewed-by: Kurt Kanzenbach <kurt@...utronix.de>
>> Tested-by: Kurt Kanzenbach <kurt@...utronix.de>
>> ---
>
> This patch doesn't directly apply because there's a small conflict with
> commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"),
> but really easy to solve.
>
> Anyway, good catch:
>
> Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>

I am sorry, that was bad timing. I prepared the initial patch on Friday
and overlooked the merge.

Shall I send a v3 or will someone else take care of the conflict
resolution?
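
For readers of the archive, the race described in the commit message can be modelled with a small, single-threaded C sketch. This is not the igc driver code: the `toy_*` names, the ring size, and the interleaving are all hypothetical, and the "fixed" variant assumes that the remedy is simply to read next_to_use only after the lock is held. The sketch replays one interleaving in which a regular sender queues a packet between the zero-copy path's early read and its later write-back.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 8

/* Toy stand-in for a TX buffer entry; next_to_watch == 0 means "clean". */
struct toy_tx_buffer {
	int next_to_watch;
};

/* Toy stand-in for a TX ring (hypothetical, not the driver's struct). */
struct toy_tx_ring {
	struct toy_tx_buffer buf[TOY_RING_SIZE];
	unsigned int next_to_use;
};

/* Regular (non-XDP) sender: occupies one slot and advances next_to_use. */
static void toy_xmit_regular(struct toy_tx_ring *ring)
{
	unsigned int ntu = ring->next_to_use;

	assert(ntu < TOY_RING_SIZE);
	ring->buf[ntu].next_to_watch = 1;
	ring->next_to_use = (ntu + 1) % TOY_RING_SIZE;
}

/* Invariant from the commit message: the slot at next_to_use is clean. */
static bool toy_invariant_holds(const struct toy_tx_ring *ring)
{
	return ring->buf[ring->next_to_use].next_to_watch == 0;
}

/*
 * Replays one interleaving of the zero-copy path against a regular sender.
 * read_under_lock == false models the stale read before locking;
 * read_under_lock == true models the assumed fix.
 */
static bool toy_run(bool read_under_lock)
{
	struct toy_tx_ring ring = { 0 };
	unsigned int ntu = 0;

	if (!read_under_lock)
		ntu = ring.next_to_use;	/* snapshot taken before the lock */

	/* Another sender gets the lock first and queues a packet. */
	toy_xmit_regular(&ring);

	/* From here on the zero-copy path is assumed to hold the lock. */
	if (read_under_lock)
		ntu = ring.next_to_use;	/* fresh value, read under the lock */

	/* Nothing to send in this model, so ntu is written back unmodified. */
	ring.next_to_use = ntu;

	return toy_invariant_holds(&ring);
}
```

Calling `toy_run(false)` leaves next_to_use pointing at a slot whose next_to_watch is already set, which is the garbled state described above; `toy_run(true)` keeps the invariant intact in this model.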
Greetings,
Florian