netdev - Re: [PATCH net-next v2 2/9] libeth: add common queue stats

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240822161718.22a1840e@kernel.org>
Date: Thu, 22 Aug 2024 16:17:18 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: Tony Nguyen <anthony.l.nguyen@...el.com>, <davem@...emloft.net>,
 <pabeni@...hat.com>, <edumazet@...gle.com>, <netdev@...r.kernel.org>,
 <przemyslaw.kitszel@...el.com>, <joshua.a.hay@...el.com>,
 <michal.kubiak@...el.com>, <nex.sw.ncis.osdt.itp.upstreaming@...el.com>
Subject: Re: [PATCH net-next v2 2/9] libeth: add common queue stats

On Thu, 22 Aug 2024 17:13:57 +0200 Alexander Lobakin wrote:
> > BTW for Intel? Or you want this to be part of the core?
> > I thought Intel, but you should tell us if you have broader plans.  
> 
> For now it's done as a lib inside Intel folder, BUT if any other vendor
> would like to use this, that would be great and then we could move it
> level up or some of these APIs can go into the core.
> IOW depends on users.
> 
> libie in contrary contains HW-specific code and will always be
> Intel-specific.

Seems like an odd middle ground. If you believe it's generic finding
another driver shouldn't be hard.

> >> +	const struct libeth_netdev_priv *priv = netdev_priv(dev);	      \
> >> +									      \
> >> +	memset(stats, 0, sizeof(*stats));				      \
> >> +	u64_stats_init(&stats->syncp);					      \
> >> +									      \
> >> +	mutex_init(&priv->base_##pfx##s[qid].lock);			      \  
> > 
> > the mutex is only for writes or for reads of base too?
> > mutex is a bad idea for rtnl stats  
> 
> Base stats are written only on ifdown, read anytime, mutex is used
> everywhere.
> Hmm maybe a bad idea, what would be better, spinlock or just use
> u64_sync as well?

Depends quite a bit on whether driver uses per queues stats to get 
rtnl stats. And whether reading of the stats needs to sleep.

> >> +#define LIBETH_STATS_DEFINE_EXPORT(pfx, gpfx)				      \
> >> +static void								      \
> >> +libeth_get_queue_stats_##gpfx(struct net_device *dev, int idx,		      \
> >> +			      struct netdev_queue_stats_##gpfx *stats)	      \
> >> +{									      \
> >> +	const struct libeth_netdev_priv *priv = netdev_priv(dev);	      \
> >> +	const struct libeth_##pfx##_stats *qs;				      \
> >> +	u64 *raw = (u64 *)stats;					      \
> >> +	u32 start;							      \
> >> +									      \
> >> +	qs = READ_ONCE(priv->live_##pfx##s[idx]);			      \
> >> +	if (!qs)							      \
> >> +		return;							      \
> >> +									      \
> >> +	do {								      \
> >> +		start = u64_stats_fetch_begin(&qs->syncp);		      \
> >> +									      \
> >> +		libeth_stats_foreach_export(pfx, exp)			      \
> >> +			raw[exp->gi] = u64_stats_read(&qs->raw[exp->li]);     \
> >> +	} while (u64_stats_fetch_retry(&qs->syncp, start));		      \
> >> +}									      \  
> > 
> > ugh. Please no  
> 
> So you mean just open-code reads/writes per each field than to compress
> it that way?

Yes. <rant> I don't understand why people try to be clever and
complicate stats reading for minor LoC saving (almost everyone,
including those working on fbnic). Just type the code in -- it 
makes maintaining it, grepping and adding a new stat without
remembering all the details soo much easier. </rant>

> Sure, that would be no problem. Object code doesn't even
> change (my first approach was per-field).
> 
> >> +									      \
> >> +static void								      \
> >> +libeth_get_##pfx##_base_stats(const struct net_device *dev,		      \
> >> +			      struct netdev_queue_stats_##gpfx *stats)	      \
> >> +{									      \
> >> +	const struct libeth_netdev_priv *priv = netdev_priv(dev);	      \
> >> +	u64 *raw = (u64 *)stats;					      \
> >> +									      \
> >> +	memset(stats, 0, sizeof(*(stats)));				      \  
> > 
> > Have you read the docs for any of the recent stats APIs?  
> 
> You mean to leave 0xffs for unsupported fields?

Kinda of. But also I do mean to call out that you haven't read the doc
for the interface over which you're building an abstraction 😵‍💫️

> > Nack. Just implement the APIs in the driver, this does not seem like 
> > a sane starting point _at all_. You're going to waste more time coming
> > up with such abstraction than you'd save implementing it for 10 drivers.  
> 
> I believe this nack is for generic Netlink stats, not the whole, right?
> In general, I wasn't sure about whether it would be better to leave
> Netlink stats per driver or write it in libeth, so I wanted to see
> opinions of others. I'm fine with either way.

We (I?) keep pushing more and more stats into the generic definitions,
mostly as I find clear need for them in Meta's monitoring system.
My main concern is that if you hide the stats collecting in a library
it will make ensuring the consistency of the definition much harder,
and it will propagate the use of old APIs (dreaded ethtool -S) into new
drivers.

If you have useful helpers that can be broadly applicable that's great.
This library as it stands will need a lot of work and a lot of
convincing to go in.