netdev - Re: [PATCH 0/3] using guard/_

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6602e8671ecd0_1408f4294cf@willemb.c.googlers.com.notmuch>
Date: Tue, 26 Mar 2024 11:23:19 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, 
 Johannes Berg <johannes@...solutions.net>
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH 0/3] using guard/__free in networking

Jakub Kicinski wrote:
> On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> > On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:  
> > > > Hi,
> > > > 
> > > > So I started playing with this for wifi, and overall that
> > > > does look pretty nice, but it's a bit weird if we can do
> > > > 
> > > >   guard(wiphy)(&rdev->wiphy);
> > > > 
> > > > or so, but still have to manually handle the RTNL in the
> > > > same code.  
> > > 
> > > Dunno, it locks code instead of data accesses.  
> > 
> > Well, I'm not sure that's a fair complaint. After all, without any more
> > compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> > Clearly
> > 
> > 	rtnl_lock();
> > 	// something
> > 	rtnl_unlock();
> > 
> > also locks the "// something" code, after all., and yeah that might be
> > doing data accesses, but it might also be a function call or a whole
> > bunch of other things?
> > 
> > Or if you look at something like bpf_xdp_link_attach(), I don't think
> > you can really say that it locks only data. That doesn't even do the
> > allocation outside the lock (though I did convert that one to
> > scoped_guard because of that.)
> > 
> > Or even something simple like unregister_netdev(), it just requires the
> > RTNL for some data accesses and consistency deep inside
> > unregister_netdevice(), not for any specific data accessed there.
> > 
> > So yeah, this is always going to be a trade-off, but all the locking is.
> > We even make similar trade-offs manually, e.g. look at
> > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > still, for no good reason other than simplifying the cleanup path there.
> 
> At least to me the mental model is different. 99% of the time the guard
> is covering the entire body. So now we're moving from "I'm touching X
> so I need to lock" to "This _function_ is safe to touch X".
> 
> > Anyway, I can live with it either way (unless you tell me you won't pull
> > wireless code using guard), just thought doing the wireless locking with
> > guard and the RTNL around it without it (only in a few places do we
> > still use RTNL though) looked odd.
> > 
> > 
> > > Forgive the comparison but it feels too much like Java to me :)  
> > 
> > Heh. Haven't used Java in 20 years or so...
> 
> I only did at uni, but I think they had a decorator for a method, where
> you can basically say "this method should be under lock X" and runtime
> will take that lock before entering and drop it after exit,
> appropriately. I wonder why the sudden love for this concept :S
> Is it also present in Rust or some such?
> 
> > > scoped_guard is fine, the guard() not so much.  
> > 
> > I think you can't get scoped_guard() without guard(), so does that mean
> > you'd accept the first patch in the series?
> 
> How can we get one without the other.. do you reckon Joe P would let us
> add a checkpatch check to warn people against pure guard() under net/ ?
> 
> > > Do you have a piece of code in wireless where the conversion
> > > made you go "wow, this is so much cleaner"?  
> > 
> > Mostly long and complex error paths. Found a double-unlock bug (in
> > iwlwifi) too, when converting some locking there.
> > 
> > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > lines of unlocking, mostly error handling, code.
> > 
> > Doing __free() too will probably clean up even more.
> 
> Not super convinced by that one either:
> https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> maybe I'm too conservative..

+1 on the concept (fwiw).

Even the simple examples, such as unregister_netdevice_notifier_net,
show how it avoids boilerplate and so simplifies control flow.

That benefit multiplies with the number of resources held and number
of exit paths. Or in our case, gotos and (unlock) labels.

Error paths are notorious for seeing little test coverage and leaking
resources. This is an easy class of bugs that this RAII squashes.

Sprinkling guard statements anywhere in the scope itself makes it
perhaps hard to follow. Perhaps a heuristic would be to require these
statements at the start of scope (after variable declaration)?

Function level decorators could further inform static analysis.
But that is somewhat tangential.