netdev - Re: [RFC net-next 04/10] devlink: always check if the devlink instance is registered

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230102151630.4aeaef00@kernel.org>
Date:   Mon, 2 Jan 2023 15:16:30 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     jacob.e.keller@...el.com, leon@...nel.org, netdev@...r.kernel.org
Subject: Re: [RFC net-next 04/10] devlink: always check if the devlink
 instance is registered

On Mon, 2 Jan 2023 15:57:24 +0100 Jiri Pirko wrote:
> Sat, Dec 17, 2022 at 02:19:47AM CET, kuba@...nel.org wrote:
> >Always check under the instance lock whether the devlink instance
> >is still / already registered.
> >
> >This is a no-op for the most part, as the unregistration path currently
> >waits for all references. On the init path, however, we may temporarily
> >open up a race with netdev code, if netdevs are registered before the
> >devlink instance. This is temporary, the next change fixes it, and this
> >commit has been split out for the ease of review.
> >
> >Note that in case of iterating over sub-objects which have their
> >own lock (regions and line cards) we assume an implicit dependency
> >between those objects existing and devlink unregistration.

> >diff --git a/net/devlink/basic.c b/net/devlink/basic.c
> >index 5f33d74eef83..6b18e70a39fd 100644
> >--- a/net/devlink/basic.c
> >+++ b/net/devlink/basic.c
> >@@ -2130,6 +2130,9 @@ static int devlink_nl_cmd_linecard_get_dumpit(struct sk_buff *msg,
> > 		int idx = 0;
> > 
> > 		mutex_lock(&devlink->linecards_lock);
> >+		if (!devl_is_alive(devlink))
> >+			goto next_devlink;  
> 
> Thinking about this a bit more, things would be cleaner if reporters and
> linecards are converted to rely on instance lock as well. I don't see a
> good reason for a separate lock in both cases, really.

We had discussion before, I'm pretty sure.
IIRC you said that mlx4's locking prevents us from using the instance
lock for regions.

> Also, we could introduce devlinks_xa_for_each_registered_get_lock()
> iterator that would lock the instance as well right away to avoid
> this devl_is_alive() dance on multiple places when you iterate devlinks.

That's what I started with, but the ability to factor our the
unlock/put on error paths made the callback approach much cleaner.
And after using the callback for all the dumps there's only a couple
places which would use devlinks_xa_for_each_registered_get_lock().

> >@@ -12218,7 +12232,8 @@ void devlink_compat_running_version(struct devlink *devlink,
> > 		return;
> > 
> > 	devl_lock(devlink);  
> 
> How about to have a helper, something like devl_lock_alive() (or
> devl_lock_registered() with the naming scheme I suggest in the other
> thread)? Then you can do:
> 
> 	if (!devl_lock_alive(devlink))
> 		return;
> 	__devlink_compat_running_version(devlink, buf, len);
> 	devl_unlock(devlink);

I guess aesthetic preference.

If I had the cycles I'd make devlink_try_get() return a wrapped type

struct devlink_ref {
	struct devlink *devlink;
};

which one would have to pass to devl_lock_from_ref() or some such:

struct devlink *devl_lock_from_ref(struct devlink_ref dref)
{
	if (!dref.devlink)
		return NULL;
	devl_lock(dref.devlink);
	if (devl_lock_alive(dref.devlink))
		return dref.devlink;
	devl_unlock(dref.devlink);
	return NULL;
}

But the number of calls to devl_is_alive() is quite small after all
the cleanup, so I don't think the extra helpers are justified at this
point. "Normal coders" should not be exposed to any of the lifetime
details, not when coding the drivers, not when adding typical devlink
features/subobjects.