[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240329102308.GM651713@kernel.org>
Date: Fri, 29 Mar 2024 10:23:08 +0000
From: Simon Horman <horms@...nel.org>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
Johannes Berg <johannes@...solutions.net>, netdev@...r.kernel.org
Subject: Re: [PATCH 0/3] using guard/__free in networking
On Tue, Mar 26, 2024 at 11:23:19AM -0400, Willem de Bruijn wrote:
> Jakub Kicinski wrote:
> > On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> > > On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > > > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:
> > > > > Hi,
> > > > >
> > > > > So I started playing with this for wifi, and overall that
> > > > > does look pretty nice, but it's a bit weird if we can do
> > > > >
> > > > > guard(wiphy)(&rdev->wiphy);
> > > > >
> > > > > or so, but still have to manually handle the RTNL in the
> > > > > same code.
> > > >
> > > > Dunno, it locks code instead of data accesses.
> > >
> > > Well, I'm not sure that's a fair complaint. After all, without any more
> > > compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> > > Clearly
> > >
> > > rtnl_lock();
> > > // something
> > > rtnl_unlock();
> > >
> > > also locks the "// something" code, after all., and yeah that might be
> > > doing data accesses, but it might also be a function call or a whole
> > > bunch of other things?
> > >
> > > Or if you look at something like bpf_xdp_link_attach(), I don't think
> > > you can really say that it locks only data. That doesn't even do the
> > > allocation outside the lock (though I did convert that one to
> > > scoped_guard because of that.)
> > >
> > > Or even something simple like unregister_netdev(), it just requires the
> > > RTNL for some data accesses and consistency deep inside
> > > unregister_netdevice(), not for any specific data accessed there.
> > >
> > > So yeah, this is always going to be a trade-off, but all the locking is.
> > > We even make similar trade-offs manually, e.g. look at
> > > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > > still, for no good reason other than simplifying the cleanup path there.
> >
> > At least to me the mental model is different. 99% of the time the guard
> > is covering the entire body. So now we're moving from "I'm touching X
> > so I need to lock" to "This _function_ is safe to touch X".
> >
> > > Anyway, I can live with it either way (unless you tell me you won't pull
> > > wireless code using guard), just thought doing the wireless locking with
> > > guard and the RTNL around it without it (only in a few places do we
> > > still use RTNL though) looked odd.
> > >
> > >
> > > > Forgive the comparison but it feels too much like Java to me :)
> > >
> > > Heh. Haven't used Java in 20 years or so...
> >
> > I only did at uni, but I think they had a decorator for a method, where
> > you can basically say "this method should be under lock X" and runtime
> > will take that lock before entering and drop it after exit,
> > appropriately. I wonder why the sudden love for this concept :S
> > Is it also present in Rust or some such?
> >
> > > > scoped_guard is fine, the guard() not so much.
> > >
> > > I think you can't get scoped_guard() without guard(), so does that mean
> > > you'd accept the first patch in the series?
> >
> > How can we get one without the other.. do you reckon Joe P would let us
> > add a checkpatch check to warn people against pure guard() under net/ ?
> >
> > > > Do you have a piece of code in wireless where the conversion
> > > > made you go "wow, this is so much cleaner"?
> > >
> > > Mostly long and complex error paths. Found a double-unlock bug (in
> > > iwlwifi) too, when converting some locking there.
> > >
> > > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > > lines of unlocking, mostly error handling, code.
> > >
> > > Doing __free() too will probably clean up even more.
> >
> > Not super convinced by that one either:
> > https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> > maybe I'm too conservative..
>
> +1 on the concept (fwiw).
>
> Even the simple examples, such as unregister_netdevice_notifier_net,
> show how it avoids boilerplate and so simplifies control flow.
>
> That benefit multiplies with the number of resources held and number
> of exit paths. Or in our case, gotos and (unlock) labels.
>
> Error paths are notorious for seeing little test coverage and leaking
> resources. This is an easy class of bugs that this RAII squashes.
+1
While I'm ambivalent to the constructs that have been proposed,
I do see leaking resources in error paths as a very common pattern.
Especially in new code. Making that easier to get right seems
worthwhile to me.
>
> Sprinkling guard statements anywhere in the scope itself makes it
> perhaps hard to follow. Perhaps a heuristic would be to require these
> statements at the start of scope (after variable declaration)?
>
> Function level decorators could further inform static analysis.
> But that is somewhat tangential.
>
Powered by blists - more mailing lists