netdev - Re: [PATCH 0/3] using guard/_

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <0dc633a36a658b96f9ec98165e7db61a176c79e0.camel@sipsolutions.net>
Date: Tue, 26 Mar 2024 16:33:58 +0100
From: Johannes Berg <johannes@...solutions.net>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH 0/3] using guard/__free in networking

> > So yeah, this is always going to be a trade-off, but all the locking is.
> > We even make similar trade-offs manually, e.g. look at
> > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > still, for no good reason other than simplifying the cleanup path there.
> 
> At least to me the mental model is different. 99% of the time the guard
> is covering the entire body. So now we're moving from "I'm touching X
> so I need to lock" to "This _function_ is safe to touch X".

Yeah, maybe. But then we also have a lot of trivial wrappers just do to
locking, like

do_something()
{
	rtnl_lock()
	ret = __do_something()
	rtnl_unlock();
	return ret;
}

because __do_something() has a bunch of error paths and it's just messy
to maintain the locking directly :)

So I guess I don't have that much of a different model, I'd actually say
it's an advantage of sorts in many cases, and in some cases you'd still
have "rtnl_lock()" only in the middle, or scoped_guard(rtnl) {...} for a
small block. But refactoring a function because it's accessing protected
data doesn't seem completely out of the question either.

> > > Forgive the comparison but it feels too much like Java to me :)  
> > 
> > Heh. Haven't used Java in 20 years or so...
> 
> I only did at uni, but I think they had a decorator for a method, where
> you can basically say "this method should be under lock X" and runtime
> will take that lock before entering and drop it after exit,
> appropriately.

Yeah, I vaguely remember that. And yes, you can obviously just slap that
on everything and call it a day, or you could also there refactor the
things that do need the locking into a separate function, and use it
there?

> I wonder why the sudden love for this concept :S

I think it's neither sudden nor love ;-) But all the cleanup paths are
_always_ a mess to maintain, and this is the least bad thing folks came
up with so far?

> Is it also present in Rust or some such?

I have no idea. I _think_ Rust actually ties the data and the locks
together more?

> > > scoped_guard is fine, the guard() not so much.  
> > 
> > I think you can't get scoped_guard() without guard(), so does that mean
> > you'd accept the first patch in the series?
> 
> How can we get one without the other.. do you reckon Joe P would let us
> add a checkpatch check to warn people against pure guard() under net/ ?

Maybe? But I think I do want to use guard() ;-)

There are a ton of functions like say cfg80211_wext_siwrts() or
nl80211_new_interface() that really just want to hold the mutex for the
(remainder of) the function. And really _nl80211_new_interface()
wouldn't even exist if we had that, because nl80211_new_interface()
would just be

nl80211_new_interface()
{
  cfg80211_destroy_ifaces(rdev);

  guard(wiphy)(&rdev->wiphy);

  // exactly existing content of _nl80211_new_interface()
  // with all the returns etc.
}

the only reason for _nl80211_new_interface() is the locking there ...

Actually, we might push that further down into the function, to just
before rdev_add_virtual_intf(), I guess? But it all just adds to the
complexity as long as it's not cleaned up automatically.

> > > Do you have a piece of code in wireless where the conversion
> > > made you go "wow, this is so much cleaner"?  
> > 
> > Mostly long and complex error paths. Found a double-unlock bug (in
> > iwlwifi) too, when converting some locking there.
> > 
> > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > lines of unlocking, mostly error handling, code.
> > 
> > Doing __free() too will probably clean up even more.
> 
> Not super convinced by that one either:
> https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> maybe I'm too conservative..

I mean ... it's not great, and it's all new stuff to learn (especially
those caveats with the cleanup order), but error paths are the things
that are really never tested and tend to be broken, no matter that we
have smatch/sparse/coverity/etc.

So while I don't think it's perfect, and I tend to agree even that it
encourages "over-locking" (though we can refactor and use scope etc.
where it matters), I'm pretty firmly on the "we should be using this to
not worry about error paths all the time" side of the fence, I guess.

johannes