[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240912100738.16567-1-jdamato@fastly.com>
Date: Thu, 12 Sep 2024 10:07:08 +0000
From: Joe Damato <jdamato@...tly.com>
To: netdev@...r.kernel.org
Cc: mkarsten@...terloo.ca,
kuba@...nel.org,
skhawaja@...gle.com,
sdf@...ichev.me,
bjorn@...osinc.com,
amritha.nambiar@...el.com,
sridhar.samudrala@...el.com,
Joe Damato <jdamato@...tly.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Breno Leitao <leitao@...ian.org>,
Daniel Jurgens <danielj@...dia.com>,
David Ahern <dsahern@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Donald Hunter <donald.hunter@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Jiri Pirko <jiri@...nulli.us>,
Johannes Berg <johannes.berg@...el.com>,
Jonathan Corbet <corbet@....net>,
Kory Maincent <kory.maincent@...tlin.com>,
Leon Romanovsky <leon@...nel.org>,
linux-doc@...r.kernel.org (open list:DOCUMENTATION),
linux-kernel@...r.kernel.org (open list),
linux-rdma@...r.kernel.org (open list:MELLANOX MLX4 core VPI driver),
Lorenzo Bianconi <lorenzo@...nel.org>,
Michael Chan <michael.chan@...adcom.com>,
Paolo Abeni <pabeni@...hat.com>,
Saeed Mahameed <saeedm@...dia.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Tariq Toukan <tariqt@...dia.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: [RFC net-next v3 0/9] Add support for per-NAPI config via netlink
Greetings:
Welcome to RFC v3.
This implementation allocates an array of "struct napi_config" in
net_device and each NAPI instance is assigned an index into the config
array.
Per-NAPI settings like:
- NAPI ID
- gro_flush_timeout
- defer_hard_irqs
are persisted in napi_config and restored on napi_disable/napi_enable
respectively.
To help illustrate how this would end up working, I've added patches for
3 drivers, of which I have access to only 1:
- mlx5 which is the basis of the examples below
- mlx4 which has TX only NAPIs, just to highlight that case. I have
only compile tested this patch; I don't have this hardware.
- bnxt which I have only compiled tested. I don't have this
hardware.
NOTE: I only tested this on mlx5; I have no access to the other hardware
for which I provided patches. Hopefully other folks can help test :)
This iteration seems to persist NAPI IDs and settings even when resizing
queues, see below, so I think maybe this is getting close to where we
want to land?
Here's an example of how it works on my mlx5:
# start with 2 queues
$ ethtool -l eth4 | grep Combined | tail -1
Combined: 2
First, output the current NAPI settings:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 344,
'ifindex': 7,
'irq': 327}]
Now, set the global sysfs parameters:
$ sudo bash -c 'echo 20000 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 100 >/sys/class/net/eth4/napi_defer_hard_irqs'
Output current NAPI settings again:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Now set NAPI ID 345, via its NAPI ID to specific values:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set \
--json='{"id": 345,
"defer-hard-irqs": 111,
"gro-flush-timeout": 11111}'
None
Now output current NAPI settings again to ensure only NAPI ID 345
changed:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 111,
'gro-flush-timeout': 11111,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Now, increase gro-flush-timeout only:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 345,
"gro-flush-timeout": 44444}'
None
Now output the current NAPI settings once more:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 111,
'gro-flush-timeout': 44444,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Now set NAPI ID 345 to have gro_flush_timeout of 0:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 345,
"gro-flush-timeout": 0}'
None
Check that NAPI ID 345 has a value of 0:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 111,
'gro-flush-timeout': 0,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Change the queue count, ensuring that NAPI ID 345 retains its settings:
$ sudo ethtool -L eth4 combined 4
Check that the new queues have the system wide settings but that NAPI ID
345 remains unchanged:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 347,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 346,
'ifindex': 7,
'irq': 528},
{'defer-hard-irqs': 111,
'gro-flush-timeout': 0,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Now reduce the queue count below where NAPI ID 345 is indexed:
$ sudo ethtool -L eth4 combined 1
Check the output:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Re-increase the queue count to ensure NAPI ID 345 is re-assigned the same
values:
$ sudo ethtool -L eth4 combined 2
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 111,
'gro-flush-timeout': 0,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Create new queues to ensure the sysfs globals are used for the new NAPIs
but that NAPI ID 345 is unchanged:
$ sudo ethtool -L eth4 comabined 8
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[...]
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 346,
'ifindex': 7,
'irq': 528},
{'defer-hard-irqs': 111,
'gro-flush-timeout': 0,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 344,
'ifindex': 7,
'irq': 327}]
Last, but not least, let's try writing the sysfs parameters to ensure
all NAPIs are rewritten:
$ sudo bash -c 'echo 33333 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 222 >/sys/class/net/eth4/napi_defer_hard_irqs'
Check that worked:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[...]
{'defer-hard-irqs': 222,
'gro-flush-timeout': 33333,
'id': 346,
'ifindex': 7,
'irq': 528},
{'defer-hard-irqs': 222,
'gro-flush-timeout': 33333,
'id': 345,
'ifindex': 7,
'irq': 527},
{'defer-hard-irqs': 222,
'gro-flush-timeout': 33333,
'id': 344,
'ifindex': 7,
'irq': 327}]
Thanks,
Joe
rfcv3:
- Renamed napi_storage to napi_config
- Reordered patches
- Added defer_hard_irqs and gro_flush_timeout to napi_struct
- Attempt to save and restore settings on napi_disable/napi_enable
- Removed weight as a parameter to netif_napi_add_storage
- Updated driver patches to no longer pass in weight
rfcv2:
- Almost total rewrite from v1
Joe Damato (9):
net: napi: Make napi_defer_hard_irqs per-NAPI
netdev-genl: Dump napi_defer_hard_irqs
net: napi: Make gro_flush_timeout per-NAPI
netdev-genl: Dump gro_flush_timeout
net: napi: Add napi_config
netdev-genl: Support setting per-NAPI config values
bnxt: Add support for napi storage
mlx5: Add support for napi storage
mlx4: Add support for napi storage to RX CQs
Documentation/netlink/specs/netdev.yaml | 25 ++++++
.../networking/net_cachelines/net_device.rst | 5 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +-
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 3 +-
.../net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
include/linux/netdevice.h | 40 ++++++++-
include/uapi/linux/netdev.h | 3 +
net/core/dev.c | 90 +++++++++++++++----
net/core/dev.h | 87 ++++++++++++++++++
net/core/net-sysfs.c | 4 +-
net/core/netdev-genl-gen.c | 14 +++
net/core/netdev-genl-gen.h | 1 +
net/core/netdev-genl.c | 55 ++++++++++++
tools/include/uapi/linux/netdev.h | 3 +
14 files changed, 310 insertions(+), 25 deletions(-)
--
2.25.1
Powered by blists - more mailing lists