Message-ID: <20240108161241.625df103@meshulam.tesarici.cz>
Date: Mon, 8 Jan 2024 16:12:41 +0100
From: Petr Tesařík <petr@...arici.cz>
To: Andrew Lunn <andrew@...n.ch>
Cc: David Laight <David.Laight@...lab.com>, Eric Dumazet
<edumazet@...gle.com>, Alexandre Torgue <alexandre.torgue@...s.st.com>,
Jose Abreu <joabreu@...opsys.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Maxime
Coquelin <mcoquelin.stm32@...il.com>, Chen-Yu Tsai <wens@...e.org>, Jernej
Skrabec <jernej.skrabec@...il.com>, Samuel Holland <samuel@...lland.org>,
"open list:STMMAC ETHERNET DRIVER" <netdev@...r.kernel.org>, "moderated
list:ARM/STM32 ARCHITECTURE" <linux-stm32@...md-mailman.stormreply.com>,
"moderated list:ARM/STM32 ARCHITECTURE"
<linux-arm-kernel@...ts.infradead.org>, open list
<linux-kernel@...r.kernel.org>, "open list:ARM/Allwinner sunXi SoC support"
<linux-sunxi@...ts.linux.dev>, Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH] net: stmmac: protect statistics updates with a spinlock
On Mon, 8 Jan 2024 14:41:10 +0100
Andrew Lunn <andrew@...n.ch> wrote:
> > > You might want to consider per CPU statistics. Since each CPU has its
> > > own structure of statistics, you don't need atomic.
> > >
> > > The code actually using the statistics then needs to sum up the per
> > > CPU statistics, and using syncp should be sufficient for that.
> >
> > Doesn't that consume rather a lot of memory on systems with
> > 'silly numbers' of cpu?
>
> Systems with silly numbers of CPUs tend to also have silly amounts of
> memory. We are talking about maybe a dozen u64 here. So the memory
> usage goes from 144 bytes, to 144K for a 1024CPU system. Is 144K
> excessive for such a system?

I'm not even sure it's worth converting _all_ statistics counters to
per-CPU variables. Most of them are already serialized (either by the
queue lock, or by NAPI scheduling). Only the hard interrupt counter is
not protected by anything, so with a single u64 per CPU it's more like
8k on a 1024-CPU system.
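
To put the 8k figure in concrete terms, here is a rough userspace sketch
of a single per-CPU counter. It is not stmmac code: the sketch_* names
and the fixed CPU count are made up for illustration. In the kernel one
would more likely reach for alloc_percpu(), this_cpu_inc() and
for_each_possible_cpu(), plus u64_stats_sync where 64-bit reads can tear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU count, matching the 1024-CPU example in this thread. */
#define SKETCH_NR_CPUS 1024

/*
 * One slot per CPU; only the owning CPU ever writes to its slot.
 * Note: a flat array like this lets neighbouring CPUs share a cache
 * line. Real kernel per-CPU areas are separate per CPU and avoid that.
 */
struct sketch_irq_stats {
	uint64_t hard_irq_n[SKETCH_NR_CPUS];
};

/* One u64 per CPU: 8 bytes * 1024 CPUs = 8 KiB in total. */
static_assert(sizeof(struct sketch_irq_stats) == 8 * 1024,
	      "expected 8 KiB for 1024 CPUs");

/* Update path: the caller passes the CPU it is running on. No lock, no atomic. */
static inline void sketch_irq_inc(struct sketch_irq_stats *s, unsigned int cpu)
{
	s->hard_irq_n[cpu]++;
}

/* Read path: sum all per-CPU slots; this is the rare, slow operation. */
static inline uint64_t sketch_irq_sum(const struct sketch_irq_stats *s)
{
	uint64_t total = 0;

	for (size_t cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		total += s->hard_irq_n[cpu];
	return total;
}
```

The increment touches only the caller's own slot, and the cost of walking
all slots is paid only when somebody actually reads the statistics.
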
> > Updating an atomic_t is (pretty much) the same as taking a lock.
> > unlock() is also likely to contain an atomic operation.
> > So if you update more than two atomic_t it is likely that a lock
> > will be faster.
>
> True, but all those 1024 CPUs in your silly system get affected by a
> lock or an atomic. They all need to do something with their L1 and L2
> cache when using atomics. Spending an extra 144K of RAM means the
> other 1023 CPUs don't notice anything at all during the increment
> phase, which could be happening 1M times a second. They only get
> involved when something in user space wants the statistics, so maybe
> once per second from the SNMP agent.
>
> Also, stmmac is not used on silly CPU systems. It is used in embedded
> systems. I doubt it's integrated into anything with more than 8 CPUs.

I also doubt it as of today, but hey, it seems that more CPU cores are
the future of embedded. Ten years ago, who would have imagined putting
an 8-core CPU into a smartphone? OTOH who would have imagined a
smartphone with 24G of RAM...

Petr T