Message-ID: <20201127141402.417933f3@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
Date: Fri, 27 Nov 2020 14:14:02 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Andrew Lunn <andrew@...n.ch>
Cc: Vladimir Oltean <olteanv@...il.com>,
George McCollister <george.mccollister@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
"David S . Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
"open list:OPEN FIRMWARE AND..." <devicetree@...r.kernel.org>
Subject: Re: [PATCH net-next v2 2/3] net: dsa: add Arrow SpeedChips XRS700x driver
On Fri, 27 Nov 2020 22:32:44 +0100 Andrew Lunn wrote:
> > > So long as these counters are still in ethtool -S, I guess it does not
> > > matter. Those I do trust to be accurate, and probably consistent across
> > > the counters they return.
> >
> > Not in the NIC designs I'm familiar with.
>
> Many NICs have a way to take a hardware snapshot of all counters.
> You can then read them out as fast or slow as you want, since you
> read the snapshot, not the live counters. As a result you can compare
> counters against each other.
Curious, does Marvell HW do it?
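The snapshot scheme Andrew describes can be sketched roughly as below. This is an illustrative userspace model only, not any real driver: the struct, register bank names, and the capture/read helpers are all invented to show why a latched shadow bank makes counters mutually consistent no matter how slowly they are read out.

```c
#include <stdint.h>
#include <string.h>

#define NUM_COUNTERS 4

/* Hypothetical switch with live counters plus a shadow bank that a
 * "capture" command latches them into. */
struct fake_switch {
	uint64_t live[NUM_COUNTERS];   /* incremented by hardware */
	uint64_t shadow[NUM_COUNTERS]; /* frozen copy from last capture */
};

/* Hypothetical capture command: atomically latch all live counters. */
static void snapshot_capture(struct fake_switch *sw)
{
	memcpy(sw->shadow, sw->live, sizeof(sw->shadow));
}

/* Reads go to the shadow bank, so values can be compared against each
 * other even if the live counters keep moving between register reads. */
static uint64_t snapshot_read(const struct fake_switch *sw, int idx)
{
	return sw->shadow[idx];
}
```

The point of the pattern: one fast capture, then arbitrarily slow readout over the management bus without losing cross-counter consistency.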
> > But anyway - this only matters in some strict testing harness,
> > right? Normal users will look at stats after they noticed issues
> > (so minutes / hours later) or at the very best they'll look at a
> > graph, which will hardly require <1sec accuracy relative to when the
> > error occurred.
>
> As Vladimir has pointed out, polling once per second over an i2c bus
> is expensive. And there is an obvious linear cost with the number of
> ports on these switches. And we need to keep latency down so that PTP
> is accurate. Do we really want to be polling, for something which is
> very unlikely to be used?
IDK I find it very questionable if the system design doesn't take into
account that statistics are retrieved every n seconds. We can perhaps
scale the default period with the speed of the bus?
> I think we should probably take another look at the locking and see
> if it can be modified to allow block, so we can avoid this wasteful
> polling.
It'd be great. Worst case scenario we can have very, very rare polling
+ a synchronous callback? But we shouldn't leave /proc completely
incorrect.
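The "rare polling + synchronous callback" idea could be modeled along these lines. This is a hedged sketch, not a proposed kernel API: the struct, field names, and the fake clock are all hypothetical, meant only to show a cached counter set with a cheap non-blocking read path and an on-demand blocking refresh.

```c
#include <stdint.h>

/* Hypothetical cache: filled by a slow background poll, refreshed
 * synchronously only on paths that can tolerate blocking on the bus. */
struct stats_cache {
	uint64_t rx_errors;
	uint64_t now;           /* fake clock, seconds */
	uint64_t last_refresh;  /* when the cache was last filled */
	unsigned int max_age;   /* background poll period, seconds */
	void (*hw_read)(struct stats_cache *c); /* slow bus access */
};

/* Example slow-bus reader for demonstration only. */
static void demo_hw_read(struct stats_cache *c)
{
	c->rx_errors = 42; /* pretend the hardware reported 42 errors */
}

/* Non-blocking path (e.g. /proc): return the cached, possibly stale
 * value without touching the bus. */
static uint64_t stats_get_cached(const struct stats_cache *c)
{
	return c->rx_errors;
}

/* Blocking path (e.g. ethtool -S): refresh synchronously if stale. */
static uint64_t stats_get_sync(struct stats_cache *c)
{
	if (c->now - c->last_refresh >= c->max_age) {
		c->hw_read(c); /* hits the slow bus */
		c->last_refresh = c->now;
	}
	return c->rx_errors;
}
```

The design trade-off this models: /proc readers never block on I2C but may see numbers up to one poll period old, while readers that opt into the synchronous path pay the bus latency for fresh values.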
Converting /proc to be blocking may be a little risky: there may be
legacy daemons and other software people have running that read /proc
just to get a list of interfaces or something silly, and those would
suddenly start causing latencies across the entire stack.
But perhaps we can try and find out. I'm not completely opposed.