lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 17 Nov 2021 19:34:40 -0800 From: Eric Dumazet <edumazet@...gle.com> To: Hangbin Liu <liuhangbin@...il.com> Cc: netdev@...r.kernel.org, Jay Vosburgh <jay.vosburgh@...onical.com>, Veaceslav Falico <vfalico@...il.com>, Andy Gospodarek <andy@...yhouse.net>, Jarod Wilson <jarod@...hat.com>, Jakub Kicinski <kuba@...nel.org>, Jiri Pirko <jiri@...nulli.us>, davem@...emloft.net, Denis Kirjanov <dkirjanov@...e.de>, David Ahern <dsahern@...il.com> Subject: Re: [DISCUSS] Bond arp monitor not works with veth due to flag NETIF_F_LLTX. On Wed, Nov 17, 2021 at 7:11 PM Hangbin Liu <liuhangbin@...il.com> wrote: > > Hi, > > When I test bond arp monitor with veth interface, the bond link flaps rapidly. > After checking, in bond_ab_arp_inspect(): > > trans_start = dev_trans_start(slave->dev); > if (bond_is_active_slave(slave) && > (!bond_time_in_interval(bond, trans_start, bond->params.missed_max) || > !bond_time_in_interval(bond, last_rx, bond->params.missed_max))) { > bond_propose_link_state(slave, BOND_LINK_DOWN); > commit++; > } > > it checks both slave's trans_start and last_rx. While veth interface's > trans_start never get updated due to flag "NETIF_F_LLTX". As when NETIF_F_LLTX > set, in netdev_start_xmit() -> txq_trans_update() the txq->trans_start > never get updated because txq->xmit_lock_owner is always -1. > > If we remove the flag NETIF_F_LLTX, the HARD_TX_LOCK() will acquire the > spin_lock and update txq->xmit_lock_owner. I expected there may have some > performance drop. But I tested with xdp_redirect_map and pktgen by forwarding > a 10G NIC's traffic to veth interface and didn't see much performance drop. e.g. > With xdpgeneric mode, with the flag, it's 2.18M pps, after removing the flag, > it's 2.11M pps. Not sure if I missed anything. A non contended lock is not that expensive. Have you tried using many threads, instead of mono-thread pktgen ? This will likely be bad. > > So what do you think? Should we remove this flag on veth to fix the issue? > Some user may want to use bonding active-backup arp monitor mode on netns. Removing LLTX will have performance impact. Updating ->trans_start at most once per jiffy should be ok. Here is a patch against net-next diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index 750cea23e9cd02bba139a58553c4b1753956ad10..f706efecf0555974e8df562084d5770cc62126e1 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -356,7 +356,7 @@ static void am65_cpsw_nuss_ndo_host_tx_timeout(struct net_device *ndev, if (netif_tx_queue_stopped(netif_txq)) { /* try recover if stopped by us */ - txq_trans_update(netif_txq); + txq_trans_cond_update(netif_txq); netif_tx_wake_queue(netif_txq); } } diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 4f4a299e92de7ba9f61507ad4df7e334775c07a6..dfd2f38023c75fb6a4181af4b4a511a9955e6a0b 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -4098,12 +4098,6 @@ static inline void __netif_tx_unlock_bh(struct netdev_queue *txq) /* * txq->trans_start can be read locklessly from dev_watchdog() */ -static inline void txq_trans_update(struct netdev_queue *txq) -{ - if (txq->xmit_lock_owner != -1) - WRITE_ONCE(txq->trans_start, jiffies); -} - static inline void txq_trans_cond_update(struct netdev_queue *txq) { unsigned long now = jiffies; @@ -4626,7 +4620,7 @@ static inline netdev_tx_t netdev_start_xmit(struct sk_buff *skb, struct net_devi rc = __netdev_start_xmit(ops, skb, dev, more); if (rc == NETDEV_TX_OK) - txq_trans_update(txq); + txq_trans_cond_update(txq); return rc; }
Powered by blists - more mailing lists