Message-ID: <87edlup75k.fsf@intel.com>
Date: Thu, 29 Jun 2023 09:25:43 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Florian Kauer <florian.kauer@...utronix.de>, Jesse Brandeburg
 <jesse.brandeburg@...el.com>, Tony Nguyen <anthony.l.nguyen@...el.com>,
 "David S . Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
 <pabeni@...hat.com>, Vedang Patel <vedang.patel@...el.com>, Maciej
 Fijalkowski <maciej.fijalkowski@...el.com>, Jithu Joseph
 <jithu.joseph@...el.com>, Andre Guedes <andre.guedes@...el.com>, Simon
 Horman <simon.horman@...igine.com>
Cc: netdev@...r.kernel.org, kurt@...utronix.de,
 intel-wired-lan@...ts.osuosl.org, linux-kernel@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net v2] igc: Prevent garbled TX queue
 with XDP ZEROCOPY

Florian Kauer <florian.kauer@...utronix.de> writes:

> Hi Vinicius,
>
> On 28.06.23 23:34, Vinicius Costa Gomes wrote:
>> Florian Kauer <florian.kauer@...utronix.de> writes:
>> 
>>> In normal operation, each populated queue item has
>>> next_to_watch pointing to the last TX desc of the packet,
>>> while each cleaned item has it set to 0. In particular,
>>> the item at next_to_use, i.e. the next (necessarily clean)
>>> item to use, has next_to_watch set to 0.
>>>
>>> When the TX queue is used both by an application using
>>> AF_XDP with ZEROCOPY as well as a second non-XDP application
>>> generating high traffic, the queue pointers can get in
>>> an invalid state where next_to_use points to an item
>>> where next_to_watch is NOT set to 0.
>>>
>>> However, the implementation assumes in several places
>>> that this is never the case, so when it does happen,
>>> bad things follow. First, within the loop inside
>>> igc_clean_tx_irq(), next_to_clean can overtake next_to_use.
>>> This prevents any further transmission via
>>> this queue, and it never gets unblocked or signaled.
>>> Second, if the queue is in this garbled state,
>>> the inner loop of igc_clean_tx_ring() will never terminate,
>>> completely hogging a CPU core.
>>>
>>> The reason is that igc_xdp_xmit_zc() reads next_to_use
>>> before acquiring the lock and writes it back
>>> (potentially unmodified) later. If it was modified
>>> before the lock was acquired, the outdated next_to_use
>>> is written back, pointing to an item that was already used
>>> elsewhere (and whose next_to_watch was thus already written).
>>>
>>> Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy")
>>> Signed-off-by: Florian Kauer <florian.kauer@...utronix.de>
>>> Reviewed-by: Kurt Kanzenbach <kurt@...utronix.de>
>>> Tested-by: Kurt Kanzenbach <kurt@...utronix.de>
>>> ---
>> 
>> This patch doesn't apply directly because there's a small conflict with
>> commit 95b681485563 ("igc: Avoid transmit queue timeout for XDP"),
>> but it is really easy to resolve.
>> 
>> Anyway, good catch:
>> 
>> Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
>
> I am sorry, that was bad timing. I prepared the initial patch on Friday and overlooked the merge.
> Shall I send a v3 or will someone else take care of the conflict
> resolution?

I think it's easier if you send a v3.


Cheers,
-- 
Vinicius
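
For readers following the thread, the race described in the quoted commit
message can be illustrated with a small userspace sketch. This is not the
igc driver code: the toy_* names, the fixed ring size, and the way a
competing sender is simulated are all assumptions made for illustration.
It only models the one invariant from the commit message (the item at
next_to_use must have next_to_watch == 0), and it assumes the remedy is to
read next_to_use only after the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_RING_SIZE 8 /* hypothetical size, not the driver's */

/* Toy model: next_to_watch == 0 means "clean", nonzero means "populated". */
struct toy_ring {
	uint16_t next_to_use;
	uint16_t next_to_watch[TOY_RING_SIZE];
};

static void toy_ring_init(struct toy_ring *r)
{
	memset(r, 0, sizeof(*r));
}

/* One packet queued by some other sender while it holds the lock. */
static void toy_xmit_one(struct toy_ring *r)
{
	uint16_t i = r->next_to_use;

	r->next_to_watch[i] = i + 1; /* any nonzero marker */
	r->next_to_use = (uint16_t)((i + 1) % TOY_RING_SIZE);
}

/*
 * Racy pattern: snapshot next_to_use BEFORE the lock, write it back after.
 * `racing` packets from another sender are simulated in the window between
 * the snapshot and the (imaginary) lock acquisition.
 */
static void toy_zc_xmit_stale(struct toy_ring *r, int budget, int racing)
{
	uint16_t ntu = r->next_to_use; /* stale snapshot */

	assert(budget + racing < TOY_RING_SIZE); /* keep the toy ring from filling */
	while (racing-- > 0)
		toy_xmit_one(r);

	/* lock would be acquired here */
	while (budget-- > 0) {
		r->next_to_watch[ntu] = ntu + 1;
		ntu = (uint16_t)((ntu + 1) % TOY_RING_SIZE);
	}
	r->next_to_use = ntu; /* written back, potentially unmodified */
}

/* Assumed remedy: same work, but next_to_use is read only under the lock. */
static void toy_zc_xmit_locked(struct toy_ring *r, int budget, int racing)
{
	uint16_t ntu;

	assert(budget + racing < TOY_RING_SIZE);
	while (racing-- > 0)
		toy_xmit_one(r);

	/* lock would be acquired here */
	ntu = r->next_to_use; /* fresh value */
	while (budget-- > 0) {
		r->next_to_watch[ntu] = ntu + 1;
		ntu = (uint16_t)((ntu + 1) % TOY_RING_SIZE);
	}
	r->next_to_use = ntu;
}

/* Invariant from the commit message: the item at next_to_use is clean. */
static bool toy_ring_is_sane(const struct toy_ring *r)
{
	return r->next_to_watch[r->next_to_use] == 0;
}
```

Calling toy_zc_xmit_stale() with a budget of 0 while two packets race in
rewinds next_to_use onto an already populated item, which is the garbled
state described above; toy_zc_xmit_locked() with the same arguments leaves
the invariant intact.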
