[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aArFm-TS3Ac0FOic@LQ3V64L9R2>
Date: Thu, 24 Apr 2025 16:13:31 -0700
From: Joe Damato <jdamato@...tly.com>
To: Samiullah Khawaja <skhawaja@...gle.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
"David S . Miller " <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
almasrymina@...gle.com, willemb@...gle.com, mkarsten@...terloo.ca,
netdev@...r.kernel.org
Subject: Re: [PATCH net-next v5] Add support to set napi threaded for
individual napi
On Wed, Apr 23, 2025 at 08:14:13PM +0000, Samiullah Khawaja wrote:
> A net device has a threaded sysctl that can be used to enable threaded
> napi polling on all of the NAPI contexts under that device. Allow
> enabling threaded napi polling at individual napi level using netlink.
>
> Extend the netlink operation `napi-set` and allow setting the threaded
> attribute of a NAPI. This will enable the threaded polling on a napi
> context.
>
> Add a test in `nl_netdev.py` that verifies various cases of threaded
> napi being set at napi and at device level.
>
> Tested
> ./tools/testing/selftests/net/nl_netdev.py
> TAP version 13
> 1..6
> ok 1 nl_netdev.empty_check
> ok 2 nl_netdev.lo_check
> ok 3 nl_netdev.page_pool_check
> ok 4 nl_netdev.napi_list_check
> ok 5 nl_netdev.napi_set_threaded
> ok 6 nl_netdev.nsim_rxq_reset_down
> # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> v5:
> - This patch was part of:
> https://lore.kernel.org/netdev/Z92e2kCYXQ_RsrJh@LQ3V64L9R2/T/
> It is being sent separately for the first time.
Thanks; I think this is a good change on its own separate from the
rest of the series and, IMO, I think it makes it easier to get
reviewed and merged.
Probably a matter of personal preference, which you can feel free to
ignore, but IMO I think splitting this into 3 patches might make it
easier to get Reviewed-bys and make changes ?
- core infrastructure changes
- netlink related changes (and docs)
- selftest
Then if feedback comes in for some parts, but not others, it makes
it easier for reviewers in the future to know what was already
reviewed and what was changed ?
Just my 2 cents.
The above said, my high level feedback is that I don't think this
addresses the concerns we discussed in the v4 about device-wide
setting vs per-NAPI settings.
See below.
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d0563ddff6ca..ea3c8a30bb97 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6888,6 +6888,27 @@ static enum hrtimer_restart napi_watchdog(struct hrtimer *timer)
> return HRTIMER_NORESTART;
> }
>
> +int napi_set_threaded(struct napi_struct *napi, bool threaded)
> +{
> + if (threaded) {
> + if (!napi->thread) {
> + int err = napi_kthread_create(napi);
> +
> + if (err)
> + return err;
> + }
> + }
> +
> + if (napi->config)
> + napi->config->threaded = threaded;
> +
> + /* Make sure kthread is created before THREADED bit is set. */
> + smp_mb__before_atomic();
> + assign_bit(NAPI_STATE_THREADED, &napi->state, threaded);
> +
> + return 0;
> +}
Hm, I wonder if dev_set_threaded can be cleaned up to use this
instead of repeating similar code in two places?
> +
> int dev_set_threaded(struct net_device *dev, bool threaded)
> {
> struct napi_struct *napi;
> @@ -7144,6 +7165,8 @@ static void napi_restore_config(struct napi_struct *n)
> napi_hash_add(n);
> n->config->napi_id = n->napi_id;
> }
> +
> + napi_set_threaded(n, n->config->threaded);
It makes sense to me that when restoring the config, the kthread is
kicked off again (assuming config->thread > 0), but does the
napi_save_config path need to stop the thread?
Not sure if kthread_stop is hit via some other path when
napi_disable is called? Can you clarify?
>From my testing, it seems like setting threaded: 0 using the CLI
leaves the thread running. Should it work that way? From a high
level that behavior seems somewhat unexpected, don't you think?
[...]
> static void napi_save_config(struct napi_struct *n)
> diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
> index b64c614a00c4..f7d000a600cf 100644
> --- a/net/core/netdev-genl.c
> +++ b/net/core/netdev-genl.c
> @@ -184,6 +184,10 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
> if (napi->irq >= 0 && nla_put_u32(rsp, NETDEV_A_NAPI_IRQ, napi->irq))
> goto nla_put_failure;
>
> + if (nla_put_uint(rsp, NETDEV_A_NAPI_THREADED,
> + napi_get_threaded(napi)))
> + goto nla_put_failure;
> +
> if (napi->thread) {
> pid = task_pid_nr(napi->thread);
> if (nla_put_u32(rsp, NETDEV_A_NAPI_PID, pid))
> @@ -322,8 +326,14 @@ netdev_nl_napi_set_config(struct napi_struct *napi, struct genl_info *info)
> {
> u64 irq_suspend_timeout = 0;
> u64 gro_flush_timeout = 0;
> + uint threaded = 0;
I think I'd probably make this a u8 or something? uint is not a
commonly used type, AFAICT, and seems out of place here.
> diff --git a/tools/testing/selftests/net/nl_netdev.py b/tools/testing/selftests/net/nl_netdev.py
> index beaee5e4e2aa..505564818fa8 100755
> --- a/tools/testing/selftests/net/nl_netdev.py
> +++ b/tools/testing/selftests/net/nl_netdev.py
> @@ -2,6 +2,7 @@
> # SPDX-License-Identifier: GPL-2.0
>
> import time
> +from os import system
> from lib.py import ksft_run, ksft_exit, ksft_pr
> from lib.py import ksft_eq, ksft_ge, ksft_busy_wait
> from lib.py import NetdevFamily, NetdevSimDev, ip
> @@ -34,6 +35,70 @@ def napi_list_check(nf) -> None:
> ksft_eq(len(napis), 100,
> comment=f"queue count after reset queue {q} mode {i}")
>
> +def napi_set_threaded(nf) -> None:
> + """
> + Test that verifies various cases of napi threaded
> + set and unset at napi and device level.
> + """
> + with NetdevSimDev(queue_count=2) as nsimdev:
> + nsim = nsimdev.nsims[0]
> +
> + ip(f"link set dev {nsim.ifname} up")
> +
> + napis = nf.napi_get({'ifindex': nsim.ifindex}, dump=True)
> + ksft_eq(len(napis), 2)
> +
> + napi0_id = napis[0]['id']
> + napi1_id = napis[1]['id']
> +
> + # set napi threaded and verify
> + nf.napi_set({'id': napi0_id, 'threaded': 1})
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 1)
> +
> + ip(f"link set dev {nsim.ifname} down")
> + ip(f"link set dev {nsim.ifname} up")
> +
> + # verify if napi threaded is still set
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 1)
I feel like the test needs to be expanded?
It's not just the attribute that matters (although that bit is
important), but I think it probably is important that the kthread is
still running and that should be checked ?
> + # unset napi threaded and verify
> + nf.napi_set({'id': napi0_id, 'threaded': 0})
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 0)
> +
> + # set napi threaded for napi0
> + nf.napi_set({'id': napi0_id, 'threaded': 1})
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 1)
> +
> + # check it is not set for napi1
> + napi1 = nf.napi_get({'id': napi1_id})
> + ksft_eq(napi1['threaded'], 0)
> +
> + # set threaded at device level
> + system(f"echo 1 > /sys/class/net/{nsim.ifname}/threaded")
> +
> + # check napi threaded is set for both napis
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 1)
> + napi1 = nf.napi_get({'id': napi1_id})
> + ksft_eq(napi1['threaded'], 1)
> +
> + # set napi threaded for napi0
> + nf.napi_set({'id': napi0_id, 'threaded': 1})
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 1)
If napi0 is already set to threaded in the previous block for the
device level, why set it explicitly again ?
> +
> + # unset threaded at device level
> + system(f"echo 0 > /sys/class/net/{nsim.ifname}/threaded")
> +
> + # check napi threaded is unset for both napis
> + napi0 = nf.napi_get({'id': napi0_id})
> + ksft_eq(napi0['threaded'], 0)
> + napi1 = nf.napi_get({'id': napi1_id})
> + ksft_eq(napi1['threaded'], 0)
I ran the test and it passes for me.
That said, the test is incomplete or buggy because I've manually
identified 1 case that is incorrect which we discussed in the v4 and
a second case that seems buggy from a user perspective.
Case 1 (we discussed this in the v4, but seems like it was missed
here?):
Threaded set to 1 and then to 0 at the device level
echo 1 > /sys/class/net/eni28160np1/threaded
echo 0 > /sys/class/net/eni28160np1/threaded
Check the setting:
cat /sys/class/net/eni28160np1/threaded
0
Dump the settings for netdevsim, noting that threaded is 0, but pid
is set (again, should it be?):
./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 20}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 612,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15728,
'threaded': 0},
{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 611,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15729,
'threaded': 0}]
Now set NAPI 612 to threaded 1:
./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 612, "threaded": 1}'
Dump the settings:
./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 20}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 612,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15728,
'threaded': 1},
{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 611,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15729,
'threaded': 0}]
So far, so good.
Now set device-wide threaded to 0, which should set NAPI 612 to threaded 0:
echo 0 > /sys/class/net/eni28160np1/threaded
Dump settings:
./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 20}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 612,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15728,
'threaded': 1},
{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 611,
'ifindex': 20,
'irq-suspend-timeout': 0,
'pid': 15729,
'threaded': 0}]
Note that NAPI 612 is still set to threaded 1, but we set the
device-wide setting to 0.
Shouldn't NAPI 612 be threaded = 0 ?
Second case:
- netdevsim is brought up with 2 queues and 2 napis
- Threaded NAPI is enabled for napi1
- The queue count on the device is reduced from 2 to 1 (therefore
napi1 is disabled) via ethtool -L
Then:
- Is the thread still active? Should it be?
IMO: if the napi is disabled the thread should stop because there is
no NAPI for it to poll. When the napi is enabled, the thread can
start back up.
It does not seem that this is currently the case from manual
testing.
Powered by blists - more mailing lists