netdev - Re: [PATCH v3 3/3] can: esd_usb: Add watermark handling for TX jobs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b539003186430591736552a3922c441da63a336.camel@esd.eu>
Date: Fri, 26 Sep 2025 01:35:07 +0000
From: Stefan Mätje <stefan.maetje@....eu>
To: "kuba@...nel.org" <kuba@...nel.org>
CC: "mailhol@...nel.org" <mailhol@...nel.org>, "socketcan@...tkopp.net"
	<socketcan@...tkopp.net>, "davem@...emloft.net" <davem@...emloft.net>,
	"wg@...ndegger.com" <wg@...ndegger.com>, "linux-can@...r.kernel.org"
	<linux-can@...r.kernel.org>, socketcan <socketcan@....eu>, Frank Jungclaus
	<frank.jungclaus@....eu>, "mkl@...gutronix.de" <mkl@...gutronix.de>,
	"horms@...nel.org" <horms@...nel.org>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>
Subject: Re: [PATCH v3 3/3] can: esd_usb: Add watermark handling for TX jobs

Am Mittwoch, dem 24.09.2025 um 17:27 -0700 schrieb Jakub Kicinski:
> On Wed, 24 Sep 2025 19:30:35 +0200 Stefan Mätje wrote:
> > The driver tried to keep as much CAN frames as possible submitted to the
> > USB device (ESD_USB_MAX_TX_URBS). This has led to occasional "No free
> > context" error messages in high load situations like with
> > "cangen -g 0 -p 10 canX".
> 
> I grepped for "No free context" :) perhaps use the old message from
> before the previous patch, so that users who see those in the logs
> can correlate with this patch better?

I could change the message back to contain "couldn't find free context"
again.

> > Now call netif_stop_queue() already if the number of active jobs
> > reaches ESD_USB_TX_URBS_HI_WM which is < ESD_USB_MAX_TX_URBS. The
> > netif_start_queue() is called in esd_usb_tx_done_msg() only if the
> > number of active jobs is <= ESD_USB_TX_URBS_LO_WM.
> > 
> > This change eliminates the occasional error messages and significantly
> > reduces the number of calls to netif_start_queue() and
> > netif_stop_queue().
> > 
> > The watermark limits have been chosen with the CAN-USB/Micro in mind to
> > not to compromise its TX throughput. This device is running on USB 1.1
> > only with its 1ms USB polling cycle where a ESD_USB_TX_URBS_LO_WM
> > value below 9 decreases the TX throughput.
> 
> > -       netif_wake_queue(netdev);
> > +       if (atomic_read(&priv->active_tx_jobs) <= ESD_USB_TX_URBS_LO_WM)
> > +               netif_wake_queue(netdev);
> >  }
> 
> > -       /* Slow down tx path */
> > -       if (atomic_read(&priv->active_tx_jobs) >= ESD_USB_MAX_TX_URBS)
> > +       /* Slow down TX path */
> > +       if (atomic_read(&priv->active_tx_jobs) >= ESD_USB_TX_URBS_HI_WM)
> >                 netif_stop_queue(netdev);
> >  
> >         err = usb_submit_urb(urb, GFP_ATOMIC);
> 
> I don't know much about USB. Is there some locking that makes this not
> racy? I recommend using the macros from net/netdev_queues.h like
> netif_txq_maybe_stop() the re-checking on one side is key.

When I was changing the code I was under the impression that the atomic_xxx()
functions would do some memory barrier stuff to make other CPUs see the
results. 

This was because some other USB CAN drivers use very similar code snippets
to count their active TX jobs in flight (the grandparent seems to be ems_usb.c,
others: gs_usb.c, usb_8dev.c), but they always seem to call netif_wake_queue()
unconditionally.

The original code called netif_stop_queue() under the condition 
"if (atomic_read(&priv->active_tx_jobs) >= ESD_USB_MAX_TX_URBS)". But this
could lead to the "couldn't find free context". I assume this happened if
the callback handler called netif_wake_queue() via esd_usb_tx_done_msg()
after the condition check with network layer still in the start_xmit()
routine. So the network layer would try to xmit the next skb.

I changed therefore the stop condition to >= ESD_USB_TX_URBS_HI_WM which is
smaller than ESD_USB_MAX_TX_URBS so there is still at least one TX context
free after stopping that will handle an "overshoot" transmit.

On the wake side in esd_usb_tx_done_msg() if we would not see the increment
and are still under ESD_USB_TX_URBS_LO_WM only excess calls to
netif_wake_queue() would be generated. And each call of esd_usb_tx_done_msg()
would decrement the tx_active_jobs bringing this down.

At the moment I can't see clearly the impact of a race condition. What I can
imagine is that the "other side" doesn't see the changes done with atomic_inc()
and atomic_dec() making a reaction on changes of the "other side" too late.

Avoid this I could change the code to something like:

static void esd_usb_tx_done_msg(struct esd_usb_net_priv *priv,
				union esd_usb_msg *msg)
{
:
:
	/* Release context */
	context->echo_index = ESD_USB_MAX_TX_URBS;
	atomic_dec(&priv->active_tx_jobs);
	smp_mb__after_atomic();

	if (!netif_device_present(netdev))
		return;

	if (atomic_read(&priv->active_tx_jobs) <= ESD_USB_TX_URBS_LO_WM)
		netif_wake_queue(netdev);
}


and in the esd_usb_start_xmit() function:

:
:
	atomic_inc(&priv->active_tx_jobs);
	smp_mb__after_atomic();

	/* Slow down TX path */
	if (atomic_read(&priv->active_tx_jobs) >= ESD_USB_TX_URBS_HI_WM)
		netif_stop_queue(netdev);

	err = usb_submit_urb(urb, GFP_ATOMIC);
:
:

Not finding any issues with the published patch may be a side effect of 
testing mainly on x86 where the smp_mb__after_atomic() function is empty.

Using the netif_txq_maybe_stop() / netif_txq_completed_wake() pair seem
not the right thing for me because I don't understand the code completely
and the netif_txq_completed_wake() should only called from NAPI poll
context where I would need to call it from the USB callback handler
possibly on IRQ level.

How to proceed now?

Best regards,
    Stefan Mätje