Message-ID: <20240105121447.11ae80d1@meshulam.tesarici.cz>
Date: Fri, 5 Jan 2024 12:14:47 +0100
From: Petr Tesařík <petr@...arici.cz>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Alexandre Torgue <alexandre.torgue@...s.st.com>, Jose Abreu
 <joabreu@...opsys.com>, "David S. Miller" <davem@...emloft.net>, Jakub
 Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Maxime
 Coquelin <mcoquelin.stm32@...il.com>, Chen-Yu Tsai <wens@...e.org>, Jernej
 Skrabec <jernej.skrabec@...il.com>, Samuel Holland <samuel@...lland.org>,
 "open list:STMMAC ETHERNET DRIVER" <netdev@...r.kernel.org>, "moderated
 list:ARM/STM32 ARCHITECTURE" <linux-stm32@...md-mailman.stormreply.com>,
 "moderated list:ARM/STM32 ARCHITECTURE"
 <linux-arm-kernel@...ts.infradead.org>, open list
 <linux-kernel@...r.kernel.org>, "open list:ARM/Allwinner sunXi SoC support"
 <linux-sunxi@...ts.linux.dev>
Subject: Re: [PATCH] net: stmmac: protect statistics updates with a spinlock

On Fri, 5 Jan 2024 11:48:19 +0100
Eric Dumazet <edumazet@...gle.com> wrote:

> On Fri, Jan 5, 2024 at 11:34 AM Petr Tesařík <petr@...arici.cz> wrote:
> >
> > On Fri, 5 Jan 2024 10:58:42 +0100
> > Eric Dumazet <edumazet@...gle.com> wrote:
> >  
> > > On Fri, Jan 5, 2024 at 10:16 AM Petr Tesarik <petr@...arici.cz> wrote:  
> > > >
> > > > Add a spinlock to fix race conditions while updating Tx/Rx statistics.
> > > >
> > > > As explained by a comment in <linux/u64_stats_sync.h>, write side of struct
> > > > u64_stats_sync must ensure mutual exclusion, or one seqcount update could
> > > > be lost on 32-bit platforms, thus blocking readers forever.
> > > >
> > > > Such lockups have been actually observed on 32-bit Arm after stmmac_xmit()
> > > > on one core raced with stmmac_napi_poll_tx() on another core.
> > > >
> > > > Signed-off-by: Petr Tesarik <petr@...arici.cz>  
> > >
> > > This is going to add more costs to 64bit platforms ?  
> >
> > Yes, it adds a (hopefully not too contended) spinlock and in most
> > places an interrupt disable/enable pair.
> >
> > FWIW the race condition is also present on 64-bit platforms, resulting
> > in inaccurate statistics counters. I can understand if you consider it a
> > mild annoyance, not worth fixing.
> >  
> > > It seems to me that the same syncp can be used from two different
> > > threads : hard irq and napi poller...  
> >
> > Yes, that's exactly the scenario that locks up my system.
> >  
> > > At this point, I do not see why you keep linux/u64_stats_sync.h if you
> > > decide to go for a spinlock...  
> >
> > The spinlock does not have to be taken on the reader side, so the
> > seqcounter still adds some value.
> >  
> > > Alternative would use atomic64_t fields for the ones where there is no
> > > mutual exclusion.
> > >
> > > RX : napi poll is definitely safe (protected by an atomic bit)
> > > TX : each TX queue is also safe (protected by an atomic exclusion for
> > > non LLTX drivers)
> > >
> > > This leaves the fields updated from hardware interrupt context ?  
> >
> > I'm afraid I don't have enough network-stack-foo to follow here.
> >
> > My issue on 32 bit is that stmmac_xmit() may be called directly from
> > process context while another core runs the TX napi on the same channel
> > (in interrupt context). I didn't observe any race on the RX path, but I
> > believe it's possible with NAPI busy polling.
> >
> > In any case, I don't see the connection with LLTX. Maybe you want to
> > say that the TX queue is safe for stmmac (because it is a non-LLTX
> > driver), but might not be safe for LLTX drivers?  
> 
> LLTX drivers (mostly virtual drivers like tunnels...) can have multiple cpus
> running ndo_start_xmit() concurrently. So any use of a 'shared syncp'
> would be a bug.
> These drivers usually use per-cpu stats, to avoid races and false
> sharing anyway.
> 
> I think you should split the structures into two separate groups, each
> guarded with its own syncp.
> 
> No extra spinlocks, no extra costs on 64bit arches...
> 
> If TX completion can run in parallel with ndo_start_xmit(), then
> clearly we have to split stmmac_txq_stats in two halves:

Oh, now I get it. Yes, that's much better, indeed.

I mean, the counters have never been consistent (due to the race on the
writer side), and nobody is concerned. So, there is no value in taking
a consistent snapshot in stmmac_get_ethtool_stats().
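
If I understand u64_stats_sync correctly, the reader side then just
samples each half under its own seqcount and needs no lock at all;
roughly something like this (untested sketch, using the field names
from your proposal below):

	unsigned int start;
	u64 pkts, bytes;

	do {
		start = u64_stats_fetch_begin(&txq_stats->syncp_tx);
		pkts  = u64_stats_read(&txq_stats->tx_packets);
		bytes = u64_stats_read(&txq_stats->tx_bytes);
	} while (u64_stats_fetch_retry(&txq_stats->syncp_tx, start));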

I'm going to rework and retest my patch. Thank you for pointing me in
the right direction!

Petr T

> Also please note the conversion from u64 to u64_stats_t

Noted. IIUC this will in turn close the update race on 64-bit by using
an atomic type and on 32-bit by using a seqlock. Clever.
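
So on the write side the fast path would become something like this,
if I'm reading the helpers right (again an untested sketch with your
field names):

	u64_stats_update_begin(&txq_stats->syncp_tx);
	u64_stats_inc(&txq_stats->tx_packets);
	u64_stats_add(&txq_stats->tx_bytes, skb->len);
	u64_stats_update_end(&txq_stats->syncp_tx);

with u64_stats_update_begin()/end() compiling to no-ops on 64-bit, so
no extra cost there.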

Petr T

> Very partial patch, only to show the split and new structure :
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> index e3f650e88f82f927f0dcf95748fbd10c14c30cbe..702bceea5dc8c875a80f5e3a92b7bb058f373eda 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> @@ -60,16 +60,22 @@
>  /* #define FRAME_FILTER_DEBUG */
> 
>  struct stmmac_txq_stats {
> -       u64 tx_bytes;
> -       u64 tx_packets;
> -       u64 tx_pkt_n;
> -       u64 tx_normal_irq_n;
> -       u64 napi_poll;
> -       u64 tx_clean;
> -       u64 tx_set_ic_bit;
> -       u64 tx_tso_frames;
> -       u64 tx_tso_nfrags;
> -       struct u64_stats_sync syncp;
> +/* First part, updated from ndo_start_xmit(), protected by tx queue lock */
> +       struct u64_stats_sync syncp_tx;
> +       u64_stats_t tx_bytes;
> +       u64_stats_t tx_packets;
> +       u64_stats_t tx_pkt_n;
> +       u64_stats_t tx_tso_frames;
> +       u64_stats_t tx_tso_nfrags;
> +
> +/* Second part, updated from TX completion (protected by NAPI poll logic) */
> +       struct u64_stats_sync syncp_tx_completion;
> +       u64_stats_t napi_poll;
> +       u64_stats_t tx_clean;
> +       u64_stats_t tx_set_ic_bit;
> +
> +/* Following field is updated from hard irq context... */
> +       atomic64_t tx_normal_irq_n;
>  } ____cacheline_aligned_in_smp;
> 
>  struct stmmac_rxq_stats {
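
And if I follow the last hunk, the counter bumped from hard irq
context then needs no syncp at all, just plain atomic64 ops
(untested):

	/* in the per-queue interrupt handler */
	atomic64_inc(&txq_stats->tx_normal_irq_n);

	/* on the ethtool / readout side */
	irq_n = atomic64_read(&txq_stats->tx_normal_irq_n);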

