Message-ID: <68106010-f34b-45a8-aaf5-003f5c925c01@linux.dev>
Date: Tue, 10 Jun 2025 17:25:26 -0700
From: Ihor Solodrai <ihor.solodrai@...ux.dev>
To: Jesper Dangaard Brouer <hawk@...nel.org>, netdev@...r.kernel.org,
 Jakub Kicinski <kuba@...nel.org>,
 Bastien Curutchet <bastien.curutchet@...tlin.com>
Cc: bpf@...r.kernel.org, tom@...bertland.com,
 Eric Dumazet <eric.dumazet@...il.com>,
 "David S. Miller" <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>,
 Toke Høiland-Jørgensen <toke@...e.dk>, dsahern@...nel.org,
 makita.toshiaki@....ntt.co.jp, kernel-team@...udflare.com, phil@....cc,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH net-next V7 2/2] veth: apply qdisc backpressure on full ptr_ring to reduce TX drops

On 6/10/25 2:40 PM, Jesper Dangaard Brouer wrote:
>
>
> On 10/06/2025 20.26, Ihor Solodrai wrote:
>> On 6/10/25 8:56 AM, Jesper Dangaard Brouer wrote:
>>>
>>>
>>> On 10/06/2025 13.43, Jesper Dangaard Brouer wrote:
>>>>
>>>> On 10/06/2025 00.09, Ihor Solodrai wrote:
>>> [...]
>>>>
>>>> Can you give me the output from below command (on your compiled
>>>> kernel):
>>>>
>>>>   ./scripts/faddr2line drivers/net/veth.o veth_xdp_rcv.constprop.0+0x6b
>>>>
>>>
>>> Still need above data/info please.
>>
>> root@...vm7589:/ci/workspace# ./scripts/faddr2line ./kout.gcc/drivers/net/veth.o veth_xdp_rcv.constprop.0+0x6b
>> veth_xdp_rcv.constprop.0+0x6b/0x390:
>> netdev_get_tx_queue at /ci/workspace/kout.gcc/../include/linux/netdevice.h:2637
>> (inlined by) veth_xdp_rcv at /ci/workspace/kout.gcc/../drivers/net/veth.c:912
>>
>> Which is:
>>
>> veth.c:912
>>         struct veth_priv *priv = netdev_priv(rq->dev);
>>         int queue_idx = rq->xdp_rxq.queue_index;
>>         struct netdev_queue *peer_txq;
>>         struct net_device *peer_dev;
>>         int i, done = 0, n_xdpf = 0;
>>         void *xdpf[VETH_XDP_BATCH];
>>
>>         /* NAPI functions as RCU section */
>>         peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
>>  --->   peer_txq = netdev_get_tx_queue(peer_dev, queue_idx);
>>
>> netdevice.h:2637
>> static inline
>> struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
>>                                          unsigned int index)
>> {
>>         DEBUG_NET_WARN_ON_ONCE(index >= dev->num_tx_queues);
>>  --->   return &dev->_tx[index];
>> }
>>
>> So the suspect is peer_dev (priv->peer)?
>
> Yes, this is the problem!
>
> So, it seems that peer_dev (priv->peer) can become a NULL pointer.
>
> Managed to reproduce - via manually deleting the peer device:
>   - ip link delete dev veth42
>   - while overloading veth41 via XDP redirecting packets into it.
>
> Managed to trigger concurrent crashes on two CPUs (C0 + C3)
>   - so below output gets interlaced a bit:
>
> [...]
>
> A fix could look like this:
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index e58a0f1b5c5b..a3046142cb8e 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -909,7 +909,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>
>         /* NAPI functions as RCU section */
>         peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
> -       peer_txq = netdev_get_tx_queue(peer_dev, queue_idx);
> +       peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
>
>         for (i = 0; i < budget; i++) {
>                 void *ptr = __ptr_ring_consume(&rq->xdp_ring);
> @@ -959,7 +959,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>         rq->stats.vs.xdp_packets += done;
>         u64_stats_update_end(&rq->stats.syncp);
>
> -       if (unlikely(netif_tx_queue_stopped(peer_txq)))
> +       if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq)))
>                 netif_tx_wake_queue(peer_txq);
>

Great! I presume you will send a patch separately?

>
>
>
> --Jesper
>
>