Message-Id: <20240829131214.169977-1-jdamato@fastly.com>
Date: Thu, 29 Aug 2024 13:11:56 +0000
From: Joe Damato <jdamato@...tly.com>
To: netdev@...r.kernel.org
Cc: edumazet@...gle.com,
	amritha.nambiar@...el.com,
	sridhar.samudrala@...el.com,
	sdf@...ichev.me,
	bjorn@...osinc.com,
	hch@...radead.org,
	willy@...radead.org,
	willemdebruijn.kernel@...il.com,
	skhawaja@...gle.com,
	kuba@...nel.org,
	Joe Damato <jdamato@...tly.com>,
	Alexander Lobakin <aleksander.lobakin@...el.com>,
	Breno Leitao <leitao@...ian.org>,
	Daniel Jurgens <danielj@...dia.com>,
	"David S. Miller" <davem@...emloft.net>,
	Donald Hunter <donald.hunter@...il.com>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Jiri Pirko <jiri@...nulli.us>,
	Johannes Berg <johannes.berg@...el.com>,
	linux-kernel@...r.kernel.org (open list),
	Lorenzo Bianconi <lorenzo@...nel.org>,
	Martin Karsten <mkarsten@...terloo.ca>,
	Paolo Abeni <pabeni@...hat.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: [PATCH net-next 0/5] Add support for per-NAPI config via netlink

Greetings:

This series makes gro_flush_timeout and napi_defer_hard_irqs configurable
per NAPI instance, in addition to the existing sysfs parameters. The
existing sysfs parameters remain and care was taken to support them, but
doing so introduces an important edge case, described below.

The netdev netlink spec has been updated to export both parameters when
doing a napi-get operation, and a new operation, napi-set, has been added
to set them. The parameters can be set individually or together. The idea
is that a user app might want to update, for example, gro_flush_timeout
dynamically during busy poll while being happy with the existing
napi_defer_hard_irqs value.
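
Since the parameters can be set individually or together, a napi-set
request only needs to carry the attributes being changed. A minimal
Python sketch of building the --json argument follows. The helper name
napi_set_payload is hypothetical; only the attribute names come from the
examples in this cover letter.

```python
import json


def napi_set_payload(napi_id, defer_hard_irqs=None, gro_flush_timeout=None):
    """Build the --json argument for napi-set (hypothetical helper).

    Attributes left as None are omitted from the request entirely.
    """
    payload = {"id": napi_id}
    # Compare against None, not truthiness, so that 0 can still be sent.
    if defer_hard_irqs is not None:
        payload["defer-hard-irqs"] = defer_hard_irqs
    if gro_flush_timeout is not None:
        payload["gro-flush-timeout"] = gro_flush_timeout
    return json.dumps(payload)


print(napi_set_payload(913, gro_flush_timeout=44444))
# {"id": 913, "gro-flush-timeout": 44444}
```

The resulting string can be passed as --json='...' to cli.py with
--do napi-set.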

The intention is that if this is accepted, it will be expanded to
support the suspend parameter proposed in a recent series [1].

Important edge case introduced:

In order to keep the existing sysfs parameters working as intended while
also supporting per-NAPI settings, an important change was made:
  - Writing the sysfs parameters writes both to the net_device level
    field and to the per-NAPI fields for every NAPI associated with the
    net device. This was done as the intention of writing to sysfs seems
    to be that it takes effect globally, for all NAPIs.
  - Reading the sysfs parameter reads the net_device level field.
  - It is technically possible for a user to do the following:
    - Write a value to a sysfs param, which in turn sets all NAPIs to
      that value
    - Using the netlink API, write a new value to every NAPI on the
      system
    - Print the sysfs param

Printing the param will then show a value that is no longer in use by
any NAPI, but that will be used for any newly created NAPIs (e.g. if new
queues are created).
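
The read/write semantics above can be modeled with a small Python
sketch. This is a toy model for illustration only, not the kernel
implementation; the class and method names are hypothetical.

```python
class NetDev:
    """Toy model of the semantics described above; not kernel code."""

    def __init__(self, napi_ids):
        self.gro_flush_timeout = 0                 # net_device level field
        self.napis = {nid: 0 for nid in napi_ids}  # per-NAPI fields

    def sysfs_write(self, value):
        # Writes the device-level field and every NAPI's field.
        self.gro_flush_timeout = value
        for nid in self.napis:
            self.napis[nid] = value

    def sysfs_read(self):
        # Reads only the device-level field.
        return self.gro_flush_timeout

    def netlink_set(self, napi_id, value):
        # Writes a single NAPI's field; the device-level field is unchanged.
        self.napis[napi_id] = value

    def add_napi(self, napi_id):
        # A newly created NAPI starts from the device-level value.
        self.napis[napi_id] = self.gro_flush_timeout


dev = NetDev([913, 914])
dev.sysfs_write(20000)
for nid in (913, 914):
    dev.netlink_set(nid, 11111)

print(dev.sysfs_read(), dev.napis)  # 20000 {913: 11111, 914: 11111}
dev.add_napi(915)
print(dev.napis[915])               # 20000
```

After the netlink writes, the sysfs read still returns 20000 even
though no existing NAPI uses it; only the newly added NAPI picks it up.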

It's tempting to think that the implementation could avoid the
complexity of writing values to every NAPI with something as simple as
(pseudocode):

   if (!napi->gro_flush_timeout)
     return dev->gro_flush_timeout;
   return napi->gro_flush_timeout;

but this approach does not work if the user wants gro_flush_timeout to
be 0 for a specific NAPI while having it set to non-zero for the rest of
the system.
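
To make the problem concrete, here is a small Python sketch comparing
the zero-means-unset fallback with a per-NAPI field that is always
authoritative. Both function names are hypothetical and the code only
illustrates the argument; it is not taken from the patches.

```python
def effective_timeout_fallback(napi_value, dev_value):
    # Rejected approach: treat 0 as "unset" and fall back to the device value.
    if not napi_value:
        return dev_value
    return napi_value


def effective_timeout_per_napi(napi_value, dev_value):
    # Per-NAPI field is authoritative, so 0 is a valid setting.
    return napi_value


dev_value = 20000   # device-wide setting
napi_value = 0      # user wants this one NAPI to use 0

print(effective_timeout_fallback(napi_value, dev_value))  # 20000
print(effective_timeout_per_napi(napi_value, dev_value))  # 0
```

With the fallback, the explicit 0 is indistinguishable from "not set"
and is silently replaced by the device-wide value.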

Here's a walkthrough of some common commands to illustrate how one
might use this:

First, output the current NAPI settings:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 0,
  'gro-flush-timeout': 0,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 0,
  'gro-flush-timeout': 0,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
 [...]

Now, set the global sysfs parameters:

$ sudo bash -c 'echo 20000 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 100 >/sys/class/net/eth4/napi_defer_hard_irqs'

Output current NAPI settings again:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
  'gro-flush-timeout': 20000,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 100,
  'gro-flush-timeout': 20000,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
 [...]

Now set NAPI ID 913 to specific values:

$ sudo ./tools/net/ynl/cli.py \
             --spec Documentation/netlink/specs/netdev.yaml \
             --do napi-set \
             --json='{"id": 913, "defer-hard-irqs": 111,
                      "gro-flush-timeout": 11111}'
None

Now output current NAPI settings again to ensure only 913 changed:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
  'gro-flush-timeout': 20000,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 111,
  'gro-flush-timeout': 11111,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
[...]

Now, increase gro-flush-timeout only:

$ sudo ./tools/net/ynl/cli.py \
       --spec Documentation/netlink/specs/netdev.yaml \
       --do napi-set --json='{"id": 913, "gro-flush-timeout": 44444}'
None

Now output the current NAPI settings once more:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
  'gro-flush-timeout': 20000,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 111,
  'gro-flush-timeout': 44444,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
[...]

Now set NAPI ID 913 to have gro_flush_timeout of 0:

$ sudo ./tools/net/ynl/cli.py \
       --spec Documentation/netlink/specs/netdev.yaml \
       --do napi-set --json='{"id": 913, "gro-flush-timeout": 0}'
None

Check that NAPI ID 913 has a value of 0:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 100,
  'gro-flush-timeout': 20000,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 111,
  'gro-flush-timeout': 0,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
[...]

Last, but not least, let's try writing the sysfs parameters to ensure
all NAPIs are rewritten:

$ sudo bash -c 'echo 33333 >/sys/class/net/eth4/gro_flush_timeout'
$ sudo bash -c 'echo 222 >/sys/class/net/eth4/napi_defer_hard_irqs'

Check that worked:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                         --dump napi-get --json='{"ifindex": 7}'
[{'defer-hard-irqs': 222,
  'gro-flush-timeout': 33333,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 222,
  'gro-flush-timeout': 33333,
  'id': 913,
  'ifindex': 7,
  'irq': 528},
[...]
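
The checks above were done by eye. As a rough sketch, the same check
could be scripted in Python. The output shown in this walkthrough uses
Python literal syntax (single-quoted keys) rather than JSON, so the
sketch parses it with ast.literal_eval. The sample string is trimmed to
the two entries shown above and the helper name all_napis_match is
hypothetical.

```python
import ast

# Sample trimmed to the two NAPIs shown in the walkthrough output.
sample = """
[{'defer-hard-irqs': 222,
  'gro-flush-timeout': 33333,
  'id': 914,
  'ifindex': 7,
  'irq': 529},
 {'defer-hard-irqs': 222,
  'gro-flush-timeout': 33333,
  'id': 913,
  'ifindex': 7,
  'irq': 528}]
"""


def all_napis_match(napis, defer_hard_irqs, gro_flush_timeout):
    """Return True if every NAPI entry carries the expected values."""
    return all(
        n["defer-hard-irqs"] == defer_hard_irqs
        and n["gro-flush-timeout"] == gro_flush_timeout
        for n in napis
    )


napis = ast.literal_eval(sample.strip())
print(all_napis_match(napis, 222, 33333))  # True
```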

Thanks,
Joe

[1]: https://lore.kernel.org/lkml/20240823173103.94978-1-jdamato@fastly.com/

Joe Damato (5):
  net: napi: Make napi_defer_hard_irqs per-NAPI
  netdev-genl: Dump napi_defer_hard_irqs
  net: napi: Make gro_flush_timeout per-NAPI
  netdev-genl: Dump gro_flush_timeout
  netdev-genl: Support setting per-NAPI config values

 Documentation/netlink/specs/netdev.yaml | 23 ++++++++++
 include/linux/netdevice.h               | 49 ++++++++++++++++++++
 include/uapi/linux/netdev.h             |  3 ++
 net/core/dev.c                          | 61 ++++++++++++++++++++++---
 net/core/net-sysfs.c                    |  7 ++-
 net/core/netdev-genl-gen.c              | 14 ++++++
 net/core/netdev-genl-gen.h              |  1 +
 net/core/netdev-genl.c                  | 56 +++++++++++++++++++++++
 tools/include/uapi/linux/netdev.h       |  3 ++
 9 files changed, 208 insertions(+), 9 deletions(-)

-- 
2.25.1

