Date:   Tue, 25 Oct 2022 16:31:03 +0000
From:   David Thompson <davthompson@...dia.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "cai.huoqing@...ux.dev" <cai.huoqing@...ux.dev>,
        "brgl@...ev.pl" <brgl@...ev.pl>, Liming Sun <limings@...dia.com>,
        Asmaa Mnebhi <asmaa@...dia.com>
Subject: RE: [PATCH net v1] mlxbf_gige: fix receive packet race condition

> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Monday, September 19, 2022 5:18 PM
> To: David Thompson <davthompson@...dia.com>
> Cc: davem@...emloft.net; edumazet@...gle.com; pabeni@...hat.com;
> netdev@...r.kernel.org; cai.huoqing@...ux.dev; brgl@...ev.pl; Liming Sun
> <limings@...dia.com>; Asmaa Mnebhi <asmaa@...dia.com>
> Subject: Re: [PATCH net v1] mlxbf_gige: fix receive packet race condition
> 
> On Thu, 8 Sep 2022 16:28:53 -0400 David Thompson wrote:
> > Under heavy traffic, the BF2 Gigabit interface can become unresponsive
> > for periods of time (several minutes) before eventually recovering.
> > This is due to a possible race condition in the mlxbf_gige_rx_packet
> > function, where the function exits with producer and consumer indices
> > equal but there are remaining packet(s) to be processed. In order to
> > prevent this situation, disable receive DMA during the processing of
> > received packets.
> 
> Pausing Rx DMA seems a little drastic, is the capacity of the NIC buffer large enough to sink the
> traffic while the stack drains the ring?
> 
> Could you provide a little more detail on what the HW issue is?
> There is no less intrusive way we can fix it?

Thank you for your insight, Jakub.  I will review this patch and see
whether the issue can be solved without pausing RX DMA.

FYI, a little background on the DMA operation in hardware:

Pausing RX DMA prevents new packets from being written to memory.
While DMA is paused, new packets are written to a 20KB buffer instead;
they are not forwarded to memory and the consumer index is not updated.
Once this buffer is full, packets are dropped.
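
To put the 20KB figure in perspective, here is a rough back-of-the-envelope
sketch of how long such a buffer could absorb traffic while RX DMA is paused.
Only the 20KB size comes from the description above. The 1 Gbit/s line rate,
the 1KB = 1024 bytes convention, the frame sizes, and the helper names
fill_time_ns() and frames_held() are assumptions made for illustration; the
estimate also ignores preamble, inter-frame gap, and any per-packet overhead
inside the hardware buffer.

```c
#include <stdint.h>

/* Only the 20KB size is taken from the message; the rest is assumed. */
#define RX_HOLD_BUF_BYTES (20u * 1024u)   /* assumes 1KB = 1024 bytes */
#define LINE_RATE_BPS     1000000000ull   /* assumes a 1 Gbit/s line rate */

/*
 * Nanoseconds needed for back-to-back traffic at line rate to fill the
 * buffer. Returns 0 if the line rate is 0.
 */
static uint64_t fill_time_ns(uint64_t buf_bytes, uint64_t line_rate_bps)
{
    if (line_rate_bps == 0)
        return 0;
    return buf_bytes * 8ull * 1000000000ull / line_rate_bps;
}

/*
 * Number of whole frames of the given size that fit in the buffer before
 * drops would begin. Returns 0 if the frame size is 0.
 */
static uint64_t frames_held(uint64_t buf_bytes, uint64_t frame_bytes)
{
    if (frame_bytes == 0)
        return 0;
    return buf_bytes / frame_bytes;
}
```

Under these assumptions, fill_time_ns(RX_HOLD_BUF_BYTES, LINE_RATE_BPS)
comes to roughly 164 microseconds, and frames_held(RX_HOLD_BUF_BYTES, 1518)
to 13 full-size frames, so the time RX DMA stays paused would need to remain
well below that window to avoid drops at line rate.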

Thanks, Dave
