Message-ID: <a2768226-854e-464d-8e76-240f7c76e987@intel.com>
Date: Wed, 9 Apr 2025 22:23:28 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Jakub Kicinski <kuba@...nel.org>, <davem@...emloft.net>
CC: <netdev@...r.kernel.org>, <edumazet@...gle.com>, <pabeni@...hat.com>,
<andrew+netdev@...n.ch>, <horms@...nel.org>, <sdf@...ichev.me>,
<hramamurthy@...gle.com>, <kuniyu@...zon.com>, <jdamato@...tly.com>
Subject: Re: [PATCH net-next v2 8/8] netdev: depend on netdev->lock for qstats
in ops locked drivers
On 4/8/2025 12:59 PM, Jakub Kicinski wrote:
> We mostly needed rtnl_lock in qstat to make sure the queue count
> is stable while we work. For "ops locked" drivers the instance
> lock protects the queue count, so we don't have to take rtnl_lock.
>
> For currently ops-locked drivers: netdevsim and bnxt need
> the protection from netdev going down while we dump, which
> instance lock provides. gve doesn't care.
>
> Reviewed-by: Joe Damato <jdamato@...tly.com>
> Acked-by: Stanislav Fomichev <sdf@...ichev.me>
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> Documentation/networking/netdevices.rst | 6 +++++
> include/net/netdev_queues.h | 4 +++-
> net/core/netdev-genl.c | 29 +++++++++++++++----------
> 3 files changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
> index 7ae28c5fb835..0ccc7dcf4390 100644
> --- a/Documentation/networking/netdevices.rst
> +++ b/Documentation/networking/netdevices.rst
> @@ -356,6 +356,12 @@ Similarly to ``ndos`` the instance lock is only held for select drivers.
> For "ops locked" drivers all ethtool ops without exceptions should
> be called under the instance lock.
>
> +struct netdev_stat_ops
> +----------------------
> +
> +"qstat" ops are invoked under the instance lock for "ops locked" drivers,
> +and under rtnl_lock for all other drivers.
> +
> struct net_shaper_ops
> ---------------------
>
What determines whether a driver is "ops locked"? Is that defined above
this chunk in the doc? I see, it's when netdev_need_ops_lock() returns
true? Ok.
It sounds like it would be good to start migrating drivers over to this
locking paradigm over time.
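To check my understanding of the rule in the doc hunk above, here is a
minimal userspace sketch of how I read it: one helper picks the instance
lock for "ops locked" devices and the global lock for everything else.
All of the toy_* names are hypothetical stand-ins and the "lock" is just a
flag, so this only models which lock ends up held, not the real kernel
helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded lock: only records whether it is held. */
struct toy_lock {
	bool held;
};

/* Hypothetical stand-in for struct net_device. */
struct toy_netdev {
	bool need_ops_lock;        /* models netdev_need_ops_lock() returning true */
	struct toy_lock instance;  /* models the per-device instance lock */
};

/* Models the global rtnl_lock. */
static struct toy_lock toy_rtnl;

/* Pick the lock that protects qstat ops for this device. */
static struct toy_lock *toy_qstat_lock(struct toy_netdev *dev)
{
	return dev->need_ops_lock ? &dev->instance : &toy_rtnl;
}

static void toy_lock_ops_compat(struct toy_netdev *dev)
{
	struct toy_lock *lock = toy_qstat_lock(dev);

	assert(!lock->held);	/* the toy model does not allow recursion */
	lock->held = true;
}

static void toy_unlock_ops_compat(struct toy_netdev *dev)
{
	struct toy_lock *lock = toy_qstat_lock(dev);

	assert(lock->held);
	lock->held = false;
}
```

With this model, an "ops locked" device never touches toy_rtnl, which is
the point of the change as I understand it.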
> diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> index 825141d675e5..ea709b59d827 100644
> --- a/include/net/netdev_queues.h
> +++ b/include/net/netdev_queues.h
> @@ -85,9 +85,11 @@ struct netdev_queue_stats_tx {
> * for some of the events is not maintained, and reliable "total" cannot
> * be provided).
> *
> + * Ops are called under the instance lock if netdev_need_ops_lock()
> + * returns true, otherwise under rtnl_lock.
> * Device drivers can assume that when collecting total device stats,
> * the @get_base_stats and subsequent per-queue calls are performed
> - * "atomically" (without releasing the rtnl_lock).
> + * "atomically" (without releasing the relevant lock).
> *
> * Device drivers are encouraged to reset the per-queue statistics when
> * number of queues change. This is because the primary use case for
> diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
> index 8c58261de969..b64c614a00c4 100644
> --- a/net/core/netdev-genl.c
> +++ b/net/core/netdev-genl.c
> @@ -795,26 +795,31 @@ int netdev_nl_qstats_get_dumpit(struct sk_buff *skb,
> if (info->attrs[NETDEV_A_QSTATS_IFINDEX])
> ifindex = nla_get_u32(info->attrs[NETDEV_A_QSTATS_IFINDEX]);
>
> - rtnl_lock();
We used to take rtnl_lock here...
> if (ifindex) {
> - netdev = __dev_get_by_index(net, ifindex);
> - if (netdev && netdev->stat_ops) {
> + netdev = netdev_get_by_index_lock_ops_compat(net, ifindex);
> + if (!netdev) {
> + NL_SET_BAD_ATTR(info->extack,
> + info->attrs[NETDEV_A_QSTATS_IFINDEX]);
> + return -ENODEV;
> + }
I guess netdev_get_by_index_lock_ops_compat() returns with the lock held
when it succeeds, and with nothing held when it returns NULL?
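If that guess is right, the contract would look roughly like the sketch
below: the lookup hands back the device with its lock held, the NULL path
returns without unlocking because nothing was taken, and every other path
unlocks exactly once. This is a userspace model with hypothetical names
(toy_dev, toy_get_by_index_lock, toy_dump_by_index), not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device: the "lock" is a flag so the pairing can be checked. */
struct toy_dev {
	int ifindex;
	int locked;
	int has_stat_ops;
};

static struct toy_dev toy_devs[] = {
	{ .ifindex = 1, .has_stat_ops = 1 },
	{ .ifindex = 2, .has_stat_ops = 0 },
};

/* Returns the device with its lock held, or NULL with nothing held. */
static struct toy_dev *toy_get_by_index_lock(int ifindex)
{
	for (size_t i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++) {
		if (toy_devs[i].ifindex == ifindex) {
			assert(!toy_devs[i].locked);
			toy_devs[i].locked = 1;
			return &toy_devs[i];
		}
	}
	return NULL;
}

static void toy_unlock(struct toy_dev *dev)
{
	assert(dev->locked);
	dev->locked = 0;
}

/* Mirrors the shape of the single-ifindex branch in the hunk. */
static int toy_dump_by_index(int ifindex)
{
	struct toy_dev *dev = toy_get_by_index_lock(ifindex);
	int err;

	if (!dev)
		return -ENODEV;	/* lookup failed, so no lock to drop */

	err = dev->has_stat_ops ? 0 : -EOPNOTSUPP;
	toy_unlock(dev);	/* pairs with the lock taken by the lookup */
	return err;
}
```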
> + if (netdev->stat_ops) {
> err = netdev_nl_qstats_get_dump_one(netdev, scope, skb,
> info, ctx);
> } else {
> NL_SET_BAD_ATTR(info->extack,
> info->attrs[NETDEV_A_QSTATS_IFINDEX]);
> - err = netdev ? -EOPNOTSUPP : -ENODEV;
> - }
> - } else {
But there's an else branch here, so now I'm confused about how this
locking works.
> - for_each_netdev_dump(net, netdev, ctx->ifindex) {
> - err = netdev_nl_qstats_get_dump_one(netdev, scope, skb,
> - info, ctx);
> - if (err < 0)
> - break;
> + err = -EOPNOTSUPP;
> }
> + netdev_unlock_ops_compat(netdev);
And we call netdev_unlock_ops_compat() here, but I don't see how this
branch acquired the lock?
> + return err;
> + }
> +
> + for_each_netdev_lock_ops_compat_scoped(net, netdev, ctx->ifindex) {
> + err = netdev_nl_qstats_get_dump_one(netdev, scope, skb,
> + info, ctx);
> + if (err < 0)
> + break;
This looks like it's scope guarded, so it's fine.
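By "scope guarded" I mean the usual cleanup-attribute pattern, where the
unlock runs automatically when the loop variable goes out of scope, even
on a break. A minimal userspace sketch, assuming GCC or Clang (the
cleanup attribute is a compiler extension) and using hypothetical names
(guard_dev, guard_unlock, guard_visit) rather than the real macro:

```c
/* Hypothetical device whose "lock" is a flag we can inspect afterwards. */
struct guard_dev {
	int locked;
};

/* Cleanup handler: receives a pointer to the guarded variable. */
static void guard_unlock(struct guard_dev **devp)
{
	if (*devp)
		(*devp)->locked = 0;
}

/*
 * Visit up to n devices, locking each one for the duration of the loop
 * body. Stops early after index stop_at. Returns how many were visited.
 */
static int guard_visit(struct guard_dev *devs, int n, int stop_at)
{
	int visited = 0;

	for (int i = 0; i < n; i++) {
		/* Unlocked when 'dev' leaves scope, including via break. */
		struct guard_dev *dev
			__attribute__((cleanup(guard_unlock))) = &devs[i];

		dev->locked = 1;
		visited++;
		if (i == stop_at)
			break;
	}
	return visited;
}
```

The early break leaves no device locked, which is why a loop like this
needs no explicit unlock call.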
> }
> - rtnl_unlock();
>
What am I missing?