[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLKKX7+opENOa2oQpH_JmpPYeDPZtb+srYF2O3UgdUT5g@mail.gmail.com>
Date: Wed, 17 Nov 2021 19:34:40 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Hangbin Liu <liuhangbin@...il.com>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jay.vosburgh@...onical.com>,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
Jarod Wilson <jarod@...hat.com>,
Jakub Kicinski <kuba@...nel.org>,
Jiri Pirko <jiri@...nulli.us>, davem@...emloft.net,
Denis Kirjanov <dkirjanov@...e.de>,
David Ahern <dsahern@...il.com>
Subject: Re: [DISCUSS] Bond arp monitor not works with veth due to flag NETIF_F_LLTX.
On Wed, Nov 17, 2021 at 7:11 PM Hangbin Liu <liuhangbin@...il.com> wrote:
>
> Hi,
>
> When I test bond arp monitor with veth interface, the bond link flaps rapidly.
> After checking, in bond_ab_arp_inspect():
>
> trans_start = dev_trans_start(slave->dev);
> if (bond_is_active_slave(slave) &&
> (!bond_time_in_interval(bond, trans_start, bond->params.missed_max) ||
> !bond_time_in_interval(bond, last_rx, bond->params.missed_max))) {
> bond_propose_link_state(slave, BOND_LINK_DOWN);
> commit++;
> }
>
> it checks both slave's trans_start and last_rx. While veth interface's
> trans_start never get updated due to flag "NETIF_F_LLTX". As when NETIF_F_LLTX
> set, in netdev_start_xmit() -> txq_trans_update() the txq->trans_start
> never get updated because txq->xmit_lock_owner is always -1.
>
> If we remove the flag NETIF_F_LLTX, the HARD_TX_LOCK() will acquire the
> spin_lock and update txq->xmit_lock_owner. I expected there may have some
> performance drop. But I tested with xdp_redirect_map and pktgen by forwarding
> a 10G NIC's traffic to veth interface and didn't see much performance drop. e.g.
> With xdpgeneric mode, with the flag, it's 2.18M pps, after removing the flag,
> it's 2.11M pps. Not sure if I missed anything.
A non contended lock is not that expensive.
Have you tried using many threads, instead of mono-thread pktgen ?
This will likely be bad.
>
> So what do you think? Should we remove this flag on veth to fix the issue?
> Some user may want to use bonding active-backup arp monitor mode on netns.
Removing LLTX will have performance impact.
Updating ->trans_start at most once per jiffy should be ok.
Here is a patch against net-next
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 750cea23e9cd02bba139a58553c4b1753956ad10..f706efecf0555974e8df562084d5770cc62126e1
100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -356,7 +356,7 @@ static void
am65_cpsw_nuss_ndo_host_tx_timeout(struct net_device *ndev,
if (netif_tx_queue_stopped(netif_txq)) {
/* try recover if stopped by us */
- txq_trans_update(netif_txq);
+ txq_trans_cond_update(netif_txq);
netif_tx_wake_queue(netif_txq);
}
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4f4a299e92de7ba9f61507ad4df7e334775c07a6..dfd2f38023c75fb6a4181af4b4a511a9955e6a0b
100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4098,12 +4098,6 @@ static inline void __netif_tx_unlock_bh(struct
netdev_queue *txq)
/*
* txq->trans_start can be read locklessly from dev_watchdog()
*/
-static inline void txq_trans_update(struct netdev_queue *txq)
-{
- if (txq->xmit_lock_owner != -1)
- WRITE_ONCE(txq->trans_start, jiffies);
-}
-
static inline void txq_trans_cond_update(struct netdev_queue *txq)
{
unsigned long now = jiffies;
@@ -4626,7 +4620,7 @@ static inline netdev_tx_t
netdev_start_xmit(struct sk_buff *skb, struct net_devi
rc = __netdev_start_xmit(ops, skb, dev, more);
if (rc == NETDEV_TX_OK)
- txq_trans_update(txq);
+ txq_trans_cond_update(txq);
return rc;
}
Powered by blists - more mailing lists