[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58b1501ddd7299164cf71768f10d0f4771b18b4f.camel@sipsolutions.net>
Date: Fri, 19 Jul 2024 09:37:28 -0700
From: Johannes Berg <johannes@...solutions.net>
To: Jay Vosburgh <jv@...sburgh.net>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
syzbot+2120b9a8f96b3fa90bad@...kaller.appspotmail.com, Hillf Danton
<hdanton@...a.com>
Subject: Re: [RFC PATCH 2/2] net: bonding: don't call ethtool methods under
RCU
On Thu, 2024-07-18 at 14:12 -0700, Jay Vosburgh wrote:
>
> We can't do this, as it will hit RTNL every monitor interval,
> which can be many times per second.
Fair.
> The logic is structured to
> specifically avoid acquiring RTNL during the inspection pass.
We also cannot do _that_, however, it's just broken with devices that
want to sleep there. Arguably the common ethtool op that syzbot
complains about is just a distraction, because while that does sleep
(now), it's also equivalent to use_carrier==1, so we could just say "oh
if it's the common ethtool op then use the carrier directly", but while
that'd prevent syzbot from reporting the issue again, it'd not actually
fix the problem with all the USB drivers etc.
> The issue that szybot is seeing only happens if bonding's
> use_carrier option is set to 0, which is not the normal case.
Sure, but like I said above, syzbot doesn't really matter. Don't get too
hung up on it.
> use_carrier is a backwards compatibility option from years ago for
> drivers that do not implement netif_carrier_on/off (and thus calling
> netif_carrier_ok() would be unreliable).
Sure, OK.
> This also came up in [0], and looking now I see there's a patch
> that syzbot tested, although I haven't reviewed it.
+Hillf, I believe that patch is broken because it completely defeats the
purpose of my original patch there, and also addresses only the "syzbot
complains about common ethtool op" issue, not the more general problem.
> Another option is to for the Powers That Be to declare that it's
> safe to assume that network drivers implement netif_carrier_on/off to
> advertise their link state, in which case the use_carrier logic in
> bonding can be removed.
No objection to that, if you don't have proper carrier reporting then a
lot of other things will likely be broken anyway?
> Or we can somehow isolate the "must acquire RTNL instead of RCU"
> to the problematic use_carrier=0 path, but that's a nontrivial change.
>
I guess.
johannes
Powered by blists - more mailing lists