netdev - Re: [PATCH net-next v2 2/3] net: dsa: add Arrow SpeedChips XRS700x driver


Message-ID: <20201127133753.4cf108cb@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
Date:   Fri, 27 Nov 2020 13:37:53 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     George McCollister <george.mccollister@...il.com>
Cc:     Vladimir Oltean <olteanv@...il.com>, Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S . Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        "open list:OPEN FIRMWARE AND..." <devicetree@...r.kernel.org>
Subject: Re: [PATCH net-next v2 2/3] net: dsa: add Arrow SpeedChips XRS700x
 driver

Replying to George's email 'cause I didn't get Vladimir's email from
the ML.

On Fri, 27 Nov 2020 14:58:29 -0600 George McCollister wrote:
> > 100 Kbps = 12.5KB/s.
> > sja1105 has 93 64-bit counters, and during every counter refresh cycle I  

Are these 93 for one port? That sounds like a lot. There are usually
~10 stats (per port) that are relevant to the standard netdev stats.

> Yeah, that's quite big. The xrs700x counters are only 16 bit. They
> need to be polled on an interval anyway or they will roll.

Yup! That's pretty common.
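
To illustrate the rollover point: a minimal sketch of how a periodic poll can
extend a 16-bit hardware counter into a wider software total. This is not the
xrs700x driver code; the class and method names are made up for illustration,
and it assumes the counter advances by fewer than 65536 between two polls.

```python
COUNTER_BITS = 16                     # width of the hardware counter
COUNTER_MASK = (1 << COUNTER_BITS) - 1


class WrappingCounter:
    """Hypothetical helper: accumulates a wrapping HW counter into a wide total."""

    def __init__(self):
        self.last_raw = 0   # raw value seen at the previous poll
        self.total = 0      # accumulated software total (unbounded int)

    def update(self, raw):
        # Modular subtraction gives the right delta even if the raw value
        # wrapped once since the last poll.
        delta = (raw - self.last_raw) & COUNTER_MASK
        self.total += delta
        self.last_raw = raw
        return self.total


counter = WrappingCounter()
counter.update(65530)         # poll 1: close to the 16-bit limit
print(counter.update(4))      # poll 2: raw value wrapped, total keeps growing
```

If the poll interval is too long and the counter wraps more than once between
polls, the extra wraps are silently lost, which is why the interval has to be
sized against the worst-case event rate.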

> > would need to get some counters from the beginning of that range, some
> > from the middle and some from the end. With all the back-and-forth
> > between the sja1105 driver and the SPI controller driver, and the
> > protocol overhead associated with creating a "SPI read" message, it is
> > all in all more efficient to just issue a burst read operation for all
> > the counters, even ones that I'm not going to use. So let's go with
> > that, 93x8 bytes (and ignore protocol overhead) = 744 bytes of SPI I/O
> > per second. At a throughput of 12.5KB/s, that takes 59 ms to complete,
> > and that's just for the raw I/O, that thing which keeps the SPI mutex
> > locked. You know what else I could do during that time? Anything else!
> > Like for example perform PTP timestamp reconstruction, which has a hard
> > deadline at 135 ms after the packet was received, and would appreciate
> > if the SPI mutex was not locked for 59 ms every second.  
> 
> Indeed, however if you need to acquire this data at all it's going to
> burden the system at that time so unless you're able to stretch out
> the reads over a length of time whether or not you're polling every
> second or once a day may not matter if you're never able to miss a
> deadline.

Exactly, either way you gotta prepare for users polling those stats.
A design where stats are read synchronously and a user (an unprivileged
user, BTW) has the ability to disturb the operation of the system
sounds really flaky.
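
For reference, the 59 ms figure quoted above can be reproduced with a few
lines of arithmetic. This is only a sketch of the calculation from the quoted
text; it assumes 1 Kbps = 1000 bit/s and, like the original, ignores protocol
overhead. The variable names are illustrative.

```python
COUNTERS = 93                  # 64-bit counters read in one burst (from the quote)
BYTES_PER_COUNTER = 8
SPI_BITS_PER_SECOND = 100_000  # 100 Kbps, assuming 1 Kbps = 1000 bit/s
PTP_DEADLINE_MS = 135          # timestamp reconstruction deadline (from the quote)

burst_bytes = COUNTERS * BYTES_PER_COUNTER        # bytes per refresh cycle
bytes_per_second = SPI_BITS_PER_SECOND / 8        # 12.5 KB/s
transfer_ms = burst_bytes / bytes_per_second * 1000
deadline_share = transfer_ms / PTP_DEADLINE_MS    # fraction of the deadline spent on I/O

print(f"{burst_bytes} bytes, {transfer_ms:.1f} ms on the bus, "
      f"{deadline_share:.0%} of the PTP deadline")
```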

> > And all of that, for what benefit? Honestly any periodic I/O over the
> > management interface is too much I/O, unless there is any strong reason
> > to have it.  
> 
> True enough.
> 
> > Also, even the simple idea of providing out-of-date counters to user
> > space running in syscall context has me scratching my head. I can only
> > think of all the drivers in selftests that are checking statistics
> > counters before, then they send a packet, then they check the counters
> > after. What do those need to do, put a sleep to make sure the counters
> > were updated?  

Frankly life sounds simpler on the embedded networking planet than it is
on the one I'm living on ;) High speed systems are often eventually
consistent. Either because stats are gathered from HW periodically by
the FW, or RCU grace period has to expire, or workqueue has to run,
etc. etc. I know it's annoying for writing tests but it's manageable. 
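
One way tests can cope with eventually consistent stats is to poll with a
deadline instead of reading once or sleeping for a fixed time. A minimal
sketch, not taken from any existing selftest; `wait_for_counter` and its
parameters are hypothetical names, and `read_counter` stands in for whatever
the test uses to fetch the statistic.

```python
import time


def wait_for_counter(read_counter, at_least, timeout_s=2.0, interval_s=0.05):
    """Poll read_counter() until it returns >= at_least or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        value = read_counter()
        if value >= at_least:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"counter stayed at {value}, expected at least {at_least}")
        time.sleep(interval_s)


# Usage with a fake reader whose value only becomes visible on the third read,
# standing in for a stat that is refreshed by a periodic worker.
fake_reads = iter([0, 0, 1])
print(wait_for_counter(lambda: next(fake_reads), at_least=1, interval_s=0.001))
```

The timeout should be at least one full refresh period of the stats, otherwise
the test can fail on a counter that is merely stale.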

If there is a better alternative I'm all ears but having /proc and
ifconfig return zeros for error counts while ip link doesn't will lead
to too much confusion IMO. Meanwhile, delayed update of stats has been a
fact of life for _years_ now (hence it was baked into the ethtool -C API).