Date:   Mon, 6 Jul 2020 19:40:50 +0000
From:   Saeed Mahameed <saeedm@...lanox.com>
To:     "kuba@...nel.org" <kuba@...nel.org>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        Tariq Toukan <tariqt@...lanox.com>,
        Ron Diskin <rondi@...lanox.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [net 02/11] net/mlx5e: Fix multicast counter not up-to-date in
 "ip -s"

On Fri, 2020-07-03 at 10:59 -0700, Jakub Kicinski wrote:
> On Fri, 3 Jul 2020 06:15:09 +0000 Saeed Mahameed wrote:
> > > > To read the mcast counter we need to execute a FW command, which
> > > > is blocking; we can't block in atomic context in
> > > > .ndo_get_stats64 :( .. so we have to count in SW.
> > > > 
> > > > The previous approach wasn't accurate, as we read the mcast
> > > > counter in a background thread triggered by the previous read, so
> > > > we were off by the interval between two reads.
> > > 
> > > And that's bad enough to cause trouble? What's the worst case time
> > > delta you're seeing?
> > 
> > Depends on how frequently the user reads stats: if you read stats
> > once every 5 minutes, then the mcast stats are off by 5 minutes.
> > 
> > Just thinking out loud, is it ok if we busy loop and wait for the FW
> > response for mcast stats commands?
> > 
> > In ethtool -S, though, they are accurate, since we grab them on the
> > spot from FW.
> 
> I don't really feel too strongly; I'm just trying to get the details,
> because I feel like this situation is going to be increasingly common.
> It'd be quite sad if drivers had to reimplement all stats in SW.
> 

Depends on the HW. Our HW/FW supports providing stats per
vport/function, which means that if a packet got lost between the NIC
and the netdev queue, it would still be counted as an rx/mcast packet.
We have a private ethtool counter to show that drop, but with HW stats
the packet would also be counted in the netdev rx counters.

So this is why I always prefer SW stats for the netdev-reported stats:
all we need to count in SW is {rx,tx} x {packets, bytes}, plus rx mcast
packets.
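
A minimal sketch of that SW counting, with hypothetical names (the real
mlx5e structures differ, and a real driver would also use
u64_stats_sync so the 64-bit counters are read safely on 32-bit
arches): the datapath bumps per-ring counters, and .ndo_get_stats64
just sums them, so it never needs to sleep or talk to FW:

/* Hypothetical per-ring SW counters; the datapath increments these. */
struct ring_sw_stats {
	u64 packets;
	u64 bytes;
	u64 mcast_packets;	/* rx rings only */
};

struct sketch_priv {
	struct ring_sw_stats *rx;	/* one entry per rx ring */
	struct ring_sw_stats *tx;	/* one entry per tx ring */
	int num_rx, num_tx;
};

/* .ndo_get_stats64 may be called in atomic context: no FW commands,
 * no sleeping, just a sum over the per-ring SW counters.
 */
static void sketch_get_stats64(struct net_device *dev,
			       struct rtnl_link_stats64 *s)
{
	struct sketch_priv *priv = netdev_priv(dev);
	int i;

	for (i = 0; i < priv->num_rx; i++) {
		s->rx_packets += priv->rx[i].packets;
		s->rx_bytes   += priv->rx[i].bytes;
		s->multicast  += priv->rx[i].mcast_packets;
	}
	for (i = 0; i < priv->num_tx; i++) {
		s->tx_packets += priv->tx[i].packets;
		s->tx_bytes   += priv->tx[i].bytes;
	}
}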

This gives more flexibility and correctness: any given HW can create
multiple netdevs on the same function, and we need the netdev stats to
reflect only the traffic that went through that particular netdev.

> I thought it would be entirely reasonable for the driver to read the
> stats from a delayed work every 1/2 HZ and cache that. We do have a
> knob in the ethtool IRQ coalescing settings for stats writeback
> frequency.
> 

Some customers didn't like this approach, since for drivers that
implement it, CPU/power utilization is slightly higher at idle.
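
For reference, a rough sketch of the delayed-work approach Jakub
describes (names and the blocking FW query helper are hypothetical, and
init/teardown of the work is omitted):

#define STATS_POLL_INTERVAL	(HZ / 2)

struct sketch_priv {
	struct delayed_work stats_work;
	struct rtnl_link_stats64 cached;	/* protected by stats_lock */
	spinlock_t stats_lock;
};

/* Hypothetical blocking FW stats query. */
static void sketch_fw_query_stats(struct sketch_priv *priv,
				  struct rtnl_link_stats64 *s);

/* Runs in process context, so the blocking FW command is fine here. */
static void sketch_stats_work(struct work_struct *work)
{
	struct sketch_priv *priv = container_of(to_delayed_work(work),
						struct sketch_priv,
						stats_work);
	struct rtnl_link_stats64 tmp = {};

	sketch_fw_query_stats(priv, &tmp);

	spin_lock_bh(&priv->stats_lock);
	priv->cached = tmp;
	spin_unlock_bh(&priv->stats_lock);

	queue_delayed_work(system_wq, &priv->stats_work,
			   STATS_POLL_INTERVAL);
}

/* .ndo_get_stats64 only copies the cache; it never talks to FW. */
static void sketch_get_stats64(struct net_device *dev,
			       struct rtnl_link_stats64 *s)
{
	struct sketch_priv *priv = netdev_priv(dev);

	spin_lock_bh(&priv->stats_lock);
	*s = priv->cached;
	spin_unlock_bh(&priv->stats_lock);
}

The cost is exactly the one mentioned above: the work keeps firing
every HZ/2 jiffies even when nobody reads the stats, which shows up as
extra CPU/power use at idle.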

> I'm not sure what locks procfs actually holds; if it's something that
> could impact reading other files, it'd probably be a bad idea to busy
> wait.

Agreed.
