Message-Id: <20240829131214.169977-1-jdamato@fastly.com>
Date: Thu, 29 Aug 2024 13:11:56 +0000
From: Joe Damato <jdamato@...tly.com>
To: netdev@...r.kernel.org
Cc: edumazet@...gle.com,
amritha.nambiar@...el.com,
sridhar.samudrala@...el.com,
sdf@...ichev.me,
bjorn@...osinc.com,
hch@...radead.org,
willy@...radead.org,
willemdebruijn.kernel@...il.com,
skhawaja@...gle.com,
kuba@...nel.org,
Joe Damato <jdamato@...tly.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Breno Leitao <leitao@...ian.org>,
Daniel Jurgens <danielj@...dia.com>,
"David S. Miller" <davem@...emloft.net>,
Donald Hunter <donald.hunter@...il.com>,
Heiner Kallweit <hkallweit1@...il.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Jiri Pirko <jiri@...nulli.us>,
Johannes Berg <johannes.berg@...el.com>,
linux-kernel@...r.kernel.org (open list),
Lorenzo Bianconi <lorenzo@...nel.org>,
Martin Karsten <mkarsten@...terloo.ca>,
Paolo Abeni <pabeni@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: [PATCH net-next 0/5] Add support for per-NAPI config via netlink
Greetings:
This series makes gro_flush_timeout and napi_defer_hard_irqs configurable
per NAPI instance, in addition to the existing per-device sysfs
parameters. The existing sysfs parameters remain and care was taken to
support them, but doing so introduces an important edge case, described
below.
The netdev netlink spec has been updated to export both parameters when
doing a napi-get operation and a new operation, napi-set, has been added
to set the parameters. The parameters can be set individually or
together. The idea is that user apps might want to update, for example,
gro_flush_timeout dynamically during busy poll, but maybe the app is
happy with the existing defer_hard_irqs value.
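
To make the "individually or together" semantics concrete, here is a
minimal Python sketch of how a partial update behaves. It is only a model
and not the kernel code: the napi_set() helper and the SETTABLE tuple are
hypothetical names, and the attribute names are borrowed from the netlink
spec.

```python
# Hypothetical model of napi-set: only attributes present in the request
# are changed; attributes absent from the request keep their old value.
SETTABLE = ("defer-hard-irqs", "gro-flush-timeout")


def napi_set(napi: dict, request: dict) -> dict:
    """Return a copy of `napi` with only the requested attributes updated."""
    updated = dict(napi)
    for key in SETTABLE:
        if key in request:
            updated[key] = request[key]
    return updated


napi = {"id": 913, "defer-hard-irqs": 111, "gro-flush-timeout": 11111}

# Update gro-flush-timeout only; defer-hard-irqs is left untouched.
after = napi_set(napi, {"id": 913, "gro-flush-timeout": 44444})
print(after)
```

An app that busy polls could issue the second kind of request repeatedly
without ever touching defer-hard-irqs.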
The intention is that if this is accepted, it will be expanded to
support the suspend parameter proposed in a recent series [1].
Important edge case introduced:
In order to keep the existing sysfs parameters working as intended and
also support per NAPI settings an important change was made:
- Writing the sysfs parameters writes both to the net_device level
field and to the per-NAPI fields for every NAPI associated with the
net device. This was done as the intention of writing to sysfs seems
to be that it takes effect globally, for all NAPIs.
- Reading the sysfs parameter reads the net_device level field.
- It is technically possible for a user to do the following:
- Write a value to a sysfs param, which in turn sets all NAPIs to
that value
- Using the netlink API, write a new value to every NAPI on the
system
- Print the sysfs param
The printing of the param will reveal a value that is no longer in use
by any NAPI, but is used for any newly created NAPIs (e.g. if new queues
are created).
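
The behavior described above can be modeled in a few lines of Python.
This is a sketch under stated assumptions, not the implementation: the
NetDevice class and its method names are hypothetical, and only
gro_flush_timeout is modeled.

```python
class NetDevice:
    """Hypothetical model: one device-level value plus one value per NAPI."""

    def __init__(self, num_napis: int):
        self.gro_flush_timeout = 0   # net_device level field
        self.napis = []              # per-NAPI fields
        for _ in range(num_napis):
            self.add_napi()

    def add_napi(self) -> dict:
        # A newly created NAPI starts from the device-level value.
        napi = {"id": len(self.napis) + 1,
                "gro_flush_timeout": self.gro_flush_timeout}
        self.napis.append(napi)
        return napi

    def sysfs_write(self, value: int) -> None:
        # Writes the device-level field and every existing NAPI.
        self.gro_flush_timeout = value
        for napi in self.napis:
            napi["gro_flush_timeout"] = value

    def sysfs_read(self) -> int:
        # Reads only the device-level field.
        return self.gro_flush_timeout

    def netlink_set(self, napi_id: int, value: int) -> None:
        # Changes a single NAPI; the device-level field is untouched.
        for napi in self.napis:
            if napi["id"] == napi_id:
                napi["gro_flush_timeout"] = value


dev = NetDevice(num_napis=2)
dev.sysfs_write(20000)
for n in dev.napis:
    dev.netlink_set(n["id"], 11111)

in_use = {n["gro_flush_timeout"] for n in dev.napis}
shown = dev.sysfs_read()       # value no existing NAPI is using
new_napi = dev.add_napi()      # ...but a new NAPI picks it up
print(shown, in_use, new_napi)
```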
It's tempting to think that the implementation could be something as
simple as (pseudocode):
  if (!napi->gro_flush_timeout)
    return dev->gro_flush_timeout;
  return napi->gro_flush_timeout;
which would avoid the complexity of writing values to every NAPI. But
this approach does not work if the user wants gro_flush_timeout to be 0
for a specific NAPI while having it set to non-zero for the rest of the
system.
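
A small Python sketch of why treating 0 as "unset" breaks down. The
function names are hypothetical and exist only to contrast the two
approaches.

```python
def timeout_with_fallback(napi_value: int, dev_value: int) -> int:
    # The tempting approach: a per-NAPI value of 0 means "not set",
    # so fall back to the device-level value.
    return napi_value if napi_value else dev_value


def timeout_stored_per_napi(napi_value: int, dev_value: int) -> int:
    # The approach described above: the per-NAPI field always holds the
    # effective value, so 0 is a legitimate setting.
    return napi_value


dev_value = 20000   # non-zero for the rest of the system
requested = 0       # user wants 0 for one specific NAPI

fallback_result = timeout_with_fallback(requested, dev_value)
stored_result = timeout_stored_per_napi(requested, dev_value)
print(fallback_result, stored_result)
```

With the fallback, the request for 0 is silently replaced by the
device-level value; storing the value per NAPI honors it.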
Here's a walk through of some common commands to illustrate how one
might use this:
First, output the current NAPI settings:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Now, set the global sysfs parameters:
$ sudo bash -c 'echo 20000 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 100 >/sys/class/net/eth4/napi_defer_hard_irqs'
Output current NAPI settings again:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Now set NAPI ID 913 to specific values:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set \
--json='{"id": 913, "defer-hard-irqs": 111,
"gro-flush-timeout": 11111}'
None
Now output current NAPI settings again to ensure only 913 changed:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 111,
'gro-flush-timeout': 11111,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Now, increase gro-flush-timeout only:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 913, "gro-flush-timeout": 44444}'
None
Now output the current NAPI settings once more:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 111,
'gro-flush-timeout': 44444,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Now set NAPI ID 913 to have gro_flush_timeout of 0:
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 913, "gro-flush-timeout": 0}'
None
Check that NAPI ID 913 has a value of 0:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
'gro-flush-timeout': 20000,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 111,
'gro-flush-timeout': 0,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Last, but not least, let's try writing the sysfs parameters to ensure
all NAPIs are rewritten:
$ sudo bash -c 'echo 33333 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 222 >/sys/class/net/eth4/napi_defer_hard_irqs'
Check that worked:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 222,
'gro-flush-timeout': 33333,
'id': 914,
'ifindex': 7,
'irq': 529},
{'defer-hard-irqs': 222,
'gro-flush-timeout': 33333,
'id': 913,
'ifindex': 7,
'irq': 528},
[...]
Thanks,
Joe
[1]: https://lore.kernel.org/lkml/20240823173103.94978-1-jdamato@fastly.com/
Joe Damato (5):
net: napi: Make napi_defer_hard_irqs per-NAPI
netdev-genl: Dump napi_defer_hard_irqs
net: napi: Make gro_flush_timeout per-NAPI
netdev-genl: Dump gro_flush_timeout
netdev-genl: Support setting per-NAPI config values
Documentation/netlink/specs/netdev.yaml | 23 ++++++++++
include/linux/netdevice.h | 49 ++++++++++++++++++++
include/uapi/linux/netdev.h | 3 ++
net/core/dev.c | 61 ++++++++++++++++++++++---
net/core/net-sysfs.c | 7 ++-
net/core/netdev-genl-gen.c | 14 ++++++
net/core/netdev-genl-gen.h | 1 +
net/core/netdev-genl.c | 56 +++++++++++++++++++++++
tools/include/uapi/linux/netdev.h | 3 ++
9 files changed, 208 insertions(+), 9 deletions(-)
--
2.25.1