Message-ID: <ZDaj2J/2CR03H/Od@Laptop-X1>
Date: Wed, 12 Apr 2023 20:28:08 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: Jay Vosburgh <jay.vosburgh@...onical.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Jonathan Toppins <jtoppins@...hat.com>,
Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
Liang Li <liali@...hat.com>,
Simon Horman <simon.horman@...igine.com>,
Miroslav Lichvar <mlichvar@...hat.com>
Subject: Re: [PATCHv3 net-next] bonding: add software tx timestamping support
On Tue, Apr 11, 2023 at 11:33:23PM -0700, Jay Vosburgh wrote:
> Jakub Kicinski <kuba@...nel.org> wrote:
>
> >On Mon, 10 Apr 2023 16:23:51 +0800 Hangbin Liu wrote:
> >> @@ -5707,10 +5711,38 @@ static int bond_ethtool_get_ts_info(struct net_device *bond_dev,
> >> ret = ops->get_ts_info(real_dev, info);
> >> goto out;
> >> }
> >> + } else {
> >> + /* Check if all slaves support software rx/tx timestamping */
> >> + rcu_read_lock();
> >> + bond_for_each_slave_rcu(bond, slave, iter) {
> >> + ret = -1;
> >> + ops = slave->dev->ethtool_ops;
> >> + phydev = slave->dev->phydev;
> >> +
> >> + if (phy_has_tsinfo(phydev))
> >> + ret = phy_ts_info(phydev, &ts_info);
> >> + else if (ops->get_ts_info)
> >> + ret = ops->get_ts_info(slave->dev, &ts_info);
> >
> >Do we _really_ need to hold RCU lock over this?
> >Imposing atomic context restrictions on driver callbacks should not be
> >taken lightly. I'm 75% sure .ethtool_get_ts_info can only be called
> >under rtnl lock off the top of my head, is that not the case?
>
> Ok, maybe I didn't look at that carefully enough, and now that I
> do, it's really complicated.
>
> Going through it, I think the call path that's relevant is
> taprio_change -> taprio_parse_clockid -> ethtool_ops->get_ts_info.
> taprio_change is Qdisc_ops.change function, and tc_modify_qdisc should
> come in with RTNL held.
>
> If I'm reading cscope right, the other possible caller of
> Qdisc_ops.change is fifo_set_limit, and it looks like that function is
> only called by functions that are themselves Qdisc_ops.change functions
> (red_change -> __red_change, sfb_change, tbf_change) or Qdisc_ops.init
> functions (red_init -> __red_change, sfb_init, tbf_init).
>
> There's also a qdisc_create_dflt -> Qdisc_ops.init call path,
> but I don't know if literally all calls to qdisc_create_dflt hold RTNL.
>
> There's a lot of them, and I'm not sure how many of those could
> ever end up calling into taprio_change (if, say, a taprio qdisc is
> attached within another qdisc).
>
> qdisc_create also calls Qdisc_ops.init, but that one seems to
> clearly expect to enter with RTNL.
>
> Any tc expert able to state for sure whether it's possible to
> get into any of the above without RTNL? I suspect it isn't, but I'm not
> 100% sure either.
You dug deeper than I did. Maybe we can add an ASSERT_RTNL() check here first?
But since we can't be 100% sure that we are holding the rtnl lock, I think we
can keep the rcu lock to be safe. I saw that rlb_next_rx_slave() does the same...
>
>
> >> + if (!ret && (ts_info.so_timestamping & SOF_TIMESTAMPING_SOFTRXTX) ==
> >> + SOF_TIMESTAMPING_SOFTRXTX) {
> >
> >You could check in this loop if TX is supported...
>
> I see your point below about not wanting to create
> SOFT_TIMESTAMPING_SOFTRXTX, but doesn't the logic need to test all three
> of the flags _TX_SOFTWARE, _RX_SOFTWARE, and _SOFTWARE?
I think Jakub means that bonding already sets _RX_SOFTWARE and _SOFTWARE
regardless of the slaves' flags, so we only need to check each slave's
_TX_SOFTWARE flag.
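A rough user-space sketch of that reading, not the actual patch: the EX_* macros
are local copies of the SOF_TIMESTAMPING_* bit values from linux/net_tstamp.h,
and struct ex_slave and ex_bond_sw_ts_caps() are hypothetical names used only
for illustration. It also assumes that a bond with no slaves should not
advertise software TX timestamping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the SOF_TIMESTAMPING_* bit values from
 * include/uapi/linux/net_tstamp.h, renamed so the sketch builds anywhere.
 */
#define EX_TX_SOFTWARE (1u << 1)
#define EX_RX_SOFTWARE (1u << 3)
#define EX_SOFTWARE    (1u << 4)

/* Hypothetical stand-in for the so_timestamping mask one slave reports. */
struct ex_slave {
	uint32_t so_timestamping;
};

/* Compute the software timestamping mask the bond would advertise. */
uint32_t ex_bond_sw_ts_caps(const struct ex_slave *slaves, size_t n)
{
	/* RX and generic software flags are set regardless of the slaves. */
	uint32_t caps = EX_RX_SOFTWARE | EX_SOFTWARE;
	size_t i;

	assert(slaves != NULL || n == 0);

	/* Assumption: with no slaves there is nothing to vouch for TX. */
	if (n == 0)
		return caps;

	/* Only the TX flag needs a per-slave check; one miss drops it. */
	for (i = 0; i < n; i++)
		if (!(slaves[i].so_timestamping & EX_TX_SOFTWARE))
			return caps;

	return caps | EX_TX_SOFTWARE;
}
```

With this shape the loop only tests one bit per slave, so there is no need for
a combined RX/TX mask.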
Thanks
Hangbin