linux-kernel - Re: [PATCH net-next v2] netdevsim: implement peer queue flow control


Message-ID: <aG5FrObkP+S8cRZh@gmail.com>
Date: Wed, 9 Jul 2025 03:34:20 -0700
From: Breno Leitao <leitao@...ian.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	kernel-team@...a.com, dw@...idwei.uk
Subject: Re: [PATCH net-next v2] netdevsim: implement peer queue flow control

Hello Jakub,

On Tue, Jul 08, 2025 at 06:27:18PM -0700, Jakub Kicinski wrote:
> On Thu, 03 Jul 2025 06:09:31 -0700 Breno Leitao wrote:
> > +static int nsim_napi_rx(struct net_device *dev, struct nsim_rq *rq,
> > +			struct sk_buff *skb)
> >  {
> >  	if (skb_queue_len(&rq->skb_queue) > NSIM_RING_SIZE) {
> > +		nsim_stop_peer_tx_queue(dev, rq, skb_get_queue_mapping(skb));
> >  		dev_kfree_skb_any(skb);
> >  		return NET_RX_DROP;
> >  	}
> 
> we should probably add:
> 
> 	if (skb_queue_len(&rq->skb_queue) > NSIM_RING_SIZE)
> 		nsim_stop_tx_queue(dev, rq, skb_get_queue_mapping(skb));
> 
> after enqueuing the skb, so that we stop the queue before any drops
> happen

Agreed, we can stop the queue right after enqueueing the packet instead.
Since we need to compare the queue counts first, we cannot just stop the
queue straight away, so I think we still need a helper
(nsim_stop_tx_queue). This is what I have in mind:

	static void nsim_stop_tx_queue(struct net_device *tx_dev,
					struct net_device *rx_dev,
					struct nsim_rq *rq,
					u16 idx)
	{
		/* If the queue counts differ, do not stop, since it is not
		 * easy to find which TX queue is mapped here
		 */
		if (rx_dev->real_num_tx_queues != tx_dev->num_rx_queues)
			return;

		/* rq is the queue on the receive side */
		netif_subqueue_try_stop(tx_dev, idx,
					NSIM_RING_SIZE - skb_queue_len(&rq->skb_queue),
					NSIM_RING_SIZE / 2);
	}

	static int nsim_napi_rx(struct net_device *tx_dev, struct net_device *rx_dev,
				struct nsim_rq *rq, struct sk_buff *skb)
	{
		if (skb_queue_len(&rq->skb_queue) > NSIM_RING_SIZE) {
			dev_kfree_skb_any(skb);
			return NET_RX_DROP;
		}

		skb_queue_tail(&rq->skb_queue, skb);

		/* Stop the peer TX queue to avoid dropping packets later */
		if (skb_queue_len(&rq->skb_queue) >= NSIM_RING_SIZE)
			nsim_stop_tx_queue(tx_dev, rx_dev, rq,
					skb_get_queue_mapping(skb));

		return NET_RX_SUCCESS;
	}
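
To sanity-check the ordering above (drop check, enqueue, then stop), here
is a small userspace model. It is only a sketch and not netdevsim code:
the struct, the model_* functions and the RING_SIZE value are all
hypothetical stand-ins, and model_try_stop() only imitates the
stop-then-recheck idea with plain signed integers, without any of the
barriers or locking the real helper needs.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 256	/* stand-in for NSIM_RING_SIZE; the value is an assumption */

/* Hypothetical model of one RX ring and the peer TX queue feeding it. */
struct model_queue {
	int len;		/* packets currently queued on the RX side */
	bool tx_stopped;	/* models the peer TX queue's stopped state */
	int drops;		/* packets dropped because the ring was over-full */
};

/* Stop first, then restart only if at least half of the ring is free. */
static void model_try_stop(struct model_queue *q)
{
	int free_slots = RING_SIZE - q->len;

	q->tx_stopped = true;
	if (free_slots >= RING_SIZE / 2)
		q->tx_stopped = false;
}

/* Same shape as the proposed nsim_napi_rx(): drop check, enqueue, stop. */
static int model_rx(struct model_queue *q)
{
	if (q->len > RING_SIZE) {
		q->drops++;
		return -1;
	}

	q->len++;
	assert(q->len <= RING_SIZE + 1);

	/* Stop the sender as soon as the ring is full, before any drop. */
	if (q->len >= RING_SIZE)
		model_try_stop(q);

	return 0;
}

/* A sender that honours the stopped flag; returns the packets accepted. */
static int model_send_burst(struct model_queue *q, int count)
{
	int sent = 0;
	int i;

	for (i = 0; i < count && !q->tx_stopped; i++)
		if (model_rx(q) == 0)
			sent++;

	return sent;
}
```

In this model, a sender that respects the stopped flag fills the ring up
to RING_SIZE and then stops without taking the drop path. The model does
not cover a sender that races with the stop, nor the wake side.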

> > @@ -51,7 +109,7 @@ static int nsim_napi_rx(struct nsim_rq *rq, struct sk_buff *skb)
> >  static int nsim_forward_skb(struct net_device *dev, struct sk_buff *skb,
> >  			    struct nsim_rq *rq)
> >  {
> > -	return __dev_forward_skb(dev, skb) ?: nsim_napi_rx(rq, skb);
> > +	return __dev_forward_skb(dev, skb) ?: nsim_napi_rx(dev, rq, skb);
> >  }
> >  
> >  static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
> 
> nsim_start_xmit() has both dev and peer_dev, pass them all the way to
> nsim_stop_peer_tx_queue() so that you don't have to try to dereference
> the peer again.

Sure. This is a good idea. I am using it, as you can see in the snippet
above.

> > +	if (dev->real_num_tx_queues != peer_dev->num_rx_queues)
> 
> given that we compare real_num_tx_queues I think we should also kick
> the queues in nsim_set_channels(), like we do in unlink_device_store()

Sure. I suppose something like the following. What do you think?

	nsim_set_channels(struct net_device *dev, struct ethtool_channels *ch)
	{
		struct netdevsim *ns = netdev_priv(dev);
	+	struct netdevsim *peer;
		int err;

		err = netif_set_real_num_queues(dev, ch->combined_count,
	@@ -113,6 +114,14 @@ nsim_set_channels(struct net_device *dev, struct ethtool_channels *ch)
			return err;

		ns->ethtool.channels = ch->combined_count;
	+
	+	synchronize_net();
	+	netif_tx_wake_all_queues(dev);
	+	rcu_read_lock();
	+	peer = rcu_dereference(ns->peer);
	+	if (peer)
	+		netif_tx_wake_all_queues(peer->netdev);
	+	rcu_read_unlock();
	+
		return 0;
	}
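
For illustration, the effect of that hunk can be modelled in userspace.
This is a sketch under assumptions, not the driver code: model_dev,
MODEL_MAX_QUEUES and the model_* functions are hypothetical names, and
RCU and synchronize_net() are left out entirely. One plausible reading
of the request is that a queue stopped while the counts matched should
not stay stopped after the counts change, so both sides get woken.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_QUEUES 8	/* arbitrary bound for this model */

/* Hypothetical stand-in for a netdevsim device and its peer link. */
struct model_dev {
	unsigned int real_num_tx_queues;
	bool tx_stopped[MODEL_MAX_QUEUES];
	struct model_dev *peer;		/* NULL when the device is unlinked */
};

/* Models waking every TX queue of one device. */
static void model_wake_all(struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_QUEUES; i++)
		dev->tx_stopped[i] = false;
}

/* Change the queue count, then wake both sides so no queue stays stuck. */
static int model_set_channels(struct model_dev *dev, unsigned int count)
{
	if (count == 0 || count > MODEL_MAX_QUEUES)
		return -1;

	dev->real_num_tx_queues = count;

	model_wake_all(dev);
	if (dev->peer)
		model_wake_all(dev->peer);

	return 0;
}
```

After model_set_channels() succeeds, no queue on either device is left
in the stopped state, whatever it was before the call.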


Also, with this patch, we will eventually get the following critical
message:

	net_crit_ratelimited("Virtual device %s asks to queue packet!\n", dev->name);

I am wondering whether that alert is still valid, or whether I can
simply remove it.

Thanks for your review!
--breno