netdev - RE: [PATCH RFC net-next 0/4] Adjust page pool netlink filling to non common case

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID:
 <BL1PR12MB5922812514E3484191A929CBCBD52@BL1PR12MB5922.namprd12.prod.outlook.com>
Date: Tue, 25 Jun 2024 15:37:53 +0000
From: Amit Cohen <amcohen@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
	<edumazet@...gle.com>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"hawk@...nel.org" <hawk@...nel.org>, Ido Schimmel <idosch@...dia.com>, Petr
 Machata <petrm@...dia.com>, mlxsw <mlxsw@...dia.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH RFC net-next 0/4] Adjust page pool netlink filling to non
 common case



> -----Original Message-----
> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Tuesday, 25 June 2024 17:35
> To: Amit Cohen <amcohen@...dia.com>
> Cc: davem@...emloft.net; edumazet@...gle.com; pabeni@...hat.com; hawk@...nel.org; Ido Schimmel <idosch@...dia.com>; Petr
> Machata <petrm@...dia.com>; mlxsw <mlxsw@...dia.com>; netdev@...r.kernel.org
> Subject: Re: [PATCH RFC net-next 0/4] Adjust page pool netlink filling to non common case
> 
> On Tue, 25 Jun 2024 15:08:03 +0300 Amit Cohen wrote:
> > Most network drivers has 1:1 mapping between netdevice and event queues,
> > so then each page pool is used by only one netdevice. This is not the case
> > in mlxsw driver.
> >
> > Currently, the netlink message is filled with 'pool->slow.netdev->ifindex',
> > which should be NULL in case that several netdevices use the same pool.
> > Adjust page pool netlink filling to use the netdevice which the pool is
> > stored in its list. See more info in commit messages.
> >
> > Without this set, mlxsw driver cannot dump all page pools:
> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
> > 	--dump page-pool-stats-get --output-json | jq
> > []
> >
> > With this set, "dump" command prints all the page pools for all the
> > netdevices:
> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
> > 	--dump page-pool-get --output-json | \
> > 	jq -e ".[] | select(.ifindex == 64)" | grep "napi-id" | wc -l
> > 56
> >
> > From driver POV, such queries are supported by associating the pools with
> > an unregistered netdevice (dummy netdevice). The following limitations
> > are caused by such implementation:
> > 1. The get command output specifies the 'ifindex' as 0, which is
> > meaningless. `iproute2` will print this as "*", but there might be other
> > tools which fail in such case.
> > 2. get command does not work when devlink instance is reloaded to namespace
> > which is not the initial one, as the dummy device associated with the pools
> > belongs to the initial namespace.
> > See examples in commit messages.
> >
> > We would like to expose page pool stats and info via the standard
> > interface, but such implementation is not perfect. An additional option
> > is to use debugfs, but we prefer to avoid it, if it is possible. Any
> > suggestions for better implementation in case of pool for several
> > netdevices will be welcomed.
> 
> If I read the code correctly you dump all page pools for all port
> netdevs?

Yes.

 Primary use for page pool stats right now is to measure
> how much memory have netdevs gobbled up. You can't duplicate entries,
> because user space may double count the memory...
> 
> How about we instead add a net pointer and have the page pools listed
> under loopback from the start? That's the best we can do I reckon.
> Or just go with debugfs / ethtool -S, the standard interface is for
> things which are standard. If the device doesn't work in a standard way
> there's no need to shoehorn it in. This series feels a bit like checkbox
> engineering to me, if I'm completely honest..


Thanks for your feedback, I sent it to consult if you have better suggestions for this case.
I agree that if the device is not standard, we should not use the standard interface when the solution is not perfect.
I think that for now we won't expose statistics, if we will find it necessary, we will probably use debugfs for that.
Thanks!