Message-ID: <CACGkMEss33CcmYvBRa7kyfWEYwnm6xn6_tBo81y9X20yyrPKoQ@mail.gmail.com>
Date: Thu, 14 Aug 2025 11:45:02 +0800
From: Jason Wang <jasowang@...hat.com>
To: Simon Schippers <simon.schippers@...dortmund.de>
Cc: Stephen Hemminger <stephen@...workplumber.org>, willemdebruijn.kernel@...il.com, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Tim Gebauer <tim.gebauer@...dortmund.de>
Subject: Re: [PATCH net v2] TUN/TAP: Improving throughput and latency by
 avoiding SKB drops

On Thu, Aug 14, 2025 at 2:34 AM Simon Schippers
<simon.schippers@...dortmund.de> wrote:
>
> Stephen Hemminger wrote:
> > On Tue, 12 Aug 2025 00:03:48 +0200
> > Simon Schippers <simon.schippers@...dortmund.de> wrote:
> >
> >> This patch is the result of our paper with the title "The NODROP Patch:
> >> Hardening Secure Networking for Real-time Teleoperation by Preventing
> >> Packet Drops in the Linux TUN Driver" [1].
> >> It deals with the tun_net_xmit function, which drops SKBs with the reason
> >> SKB_DROP_REASON_FULL_RING whenever the tx_ring (TUN queue) is full,
> >> resulting in reduced TCP performance and packet loss for bursty video
> >> streams when used over VPNs.
> >>
> >> The abstract reads as follows:
> >> "Throughput-critical teleoperation requires robust and low-latency
> >> communication to ensure safety and performance. Often, these kinds of
> >> applications are implemented in Linux-based operating systems and transmit
> >> over virtual private networks, which ensure encryption and ease of use by
> >> providing a dedicated tunneling interface (TUN) to user space
> >> applications. In this work, we identified a specific behavior in the Linux
> >> TUN driver, which results in significant performance degradation due to
> >> the sender stack silently dropping packets. This design issue drastically
> >> impacts real-time video streaming, inducing up to 29 % packet loss with
> >> noticeable video artifacts when the internal queue of the TUN driver is
> >> reduced to 25 packets to minimize latency. Furthermore, a small queue
> >> length also drastically reduces the throughput of TCP traffic due to many
> >> retransmissions. Instead, with our open-source NODROP Patch, we propose
> >> generating backpressure in case of burst traffic or network congestion.
> >> The patch effectively addresses the packet-dropping behavior, hardening
> >> real-time video streaming and improving TCP throughput by 36 % in high
> >> latency scenarios."
> >>
> >> In addition to the mentioned performance and latency improvements for VPN
> >> applications, this patch also allows the proper usage of qdiscs. For
> >> example, fq_codel cannot control the queuing delay when packets are
> >> already dropped in the TUN driver. This issue is also described in [2].
> >>
> >> The performance evaluation of the paper (see Fig. 4) showed a 4%
> >> performance hit for a single queue TUN with the default TUN queue size of
> >> 500 packets. However, it is important to note that with the proposed
> >> patch no packet drop ever occurred, even with a TUN queue size of 1 packet.
> >> The utilized validation pipeline is available under [3].
> >>
> >> As reducing the TUN queue down to a size of 5 packets showed no
> >> further performance hit in the paper, a reduction of the default TUN queue
> >> size might be desirable to accompany this patch. A reduction would
> >> obviously reduce bufferbloat and memory requirements.
> >>
> >> Implementation details:
> >> - The netdev queue start/stop flow control is utilized.
> >> - Compatible with multi-queue by only stopping/waking the specific
> >> netdevice subqueue.
> >> - No additional locking is used.
> >>
> >> In the tun_net_xmit function:
> >> - Stopping the subqueue is done when the tx_ring gets full after inserting
> >> the SKB into the tx_ring.
> >> - In the unlikely case when the insertion with ptr_ring_produce fails, the
> >> old dropping behavior is used for this SKB.
> >>
> >> In the tun_ring_recv function:
> >> - The subqueue is woken after consuming an SKB from the tx_ring, once
> >> the tx_ring is empty. Waking the subqueue as soon as the tx_ring has any
> >> free space (that is, whenever it is not full) caused crashes in our
> >> testing. We are open to suggestions.
> >> - When the tx_ring is configured to be small (for example, to hold 1 SKB),
> >> queuing might be stopped in the tun_net_xmit function while, at the same
> >> time, ptr_ring_consume is not able to grab an SKB. This prevents
> >> tun_net_xmit from being called again and causes tun_ring_recv to wait
> >> indefinitely for an SKB in the blocking wait queue. Therefore, the netdev
> >> queue is woken in the wait queue path if it has been stopped.
> >> - Because the tun_struct is required to get the tx_queue into the new txq
> >> pointer, the tun_struct is passed in to tun_do_read as well. This is likely
> >> faster than trying to get it via the tun_file tfile, because that path
> >> takes an RCU lock.
> >>
> >> We are open to suggestions regarding the implementation :)
> >> Thank you for your work!
> >>
> >> [1] Link:
> >> https://cni.etit.tu-dortmund.de/storages/cni-etit/r/Research/Publications/2025/Gebauer_2025_VTCFall/Gebauer_VTCFall2025_AuthorsVersion.pdf
> >> [2] Link:
> >> https://unix.stackexchange.com/questions/762935/traffic-shaping-ineffective-on-tun-device
> >> [3] Link: https://github.com/tudo-cni/nodrop
> >>
> >> Co-developed-by: Tim Gebauer <tim.gebauer@...dortmund.de>
> >> Signed-off-by: Tim Gebauer <tim.gebauer@...dortmund.de>
> >> Signed-off-by: Simon Schippers <simon.schippers@...dortmund.de>
> >
> > I wonder if it would be possible to implement BQL in TUN/TAP?
> >
> > https://lwn.net/Articles/454390/
> >
> > BQL provides a feedback mechanism to application when queue fills.
>
> Thank you very much for your reply,
> I also thought about BQL before and like the idea!
>
> However I see the following challenges in the implementation:
> - netdev_tx_sent_queue is no problem, it would just be called in
> tun_net_xmit function.
> - netdev_tx_completed_queue is challenging, because there is no completion
> routine like in a "normal" network driver. tun_ring_recv reads one SKB at
> a time and therefore I am not sure when and with what parameters to call
> the function.

Right, this is similar to virtio_net without TX NAPI. It would be
tricky to implement BQL on top (and TUN also calls skb_orphan during
xmit).

Thanks

> - What to do with the existing TUN queue packet limit (500 packets
> default)? Use it as an upper limit?

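The stop/wake flow control described in the quoted patch notes can be illustrated with a small userspace model. This is only a sketch, not the patch itself: the struct model_queue and the model_* functions are hypothetical stand-ins for the tx_ring, tun_net_xmit and tun_ring_recv, the "stopped" flag stands in for the netdev subqueue state, and the single-threaded driver loop assumes the sender never transmits while the queue is stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_MAX 512

/* Hypothetical model of the TUN tx_ring plus the subqueue stop flag. */
struct model_queue {
    int slots[RING_MAX];
    size_t size;            /* configured ring capacity in packets */
    size_t head;            /* index of the next packet to consume */
    size_t count;           /* packets currently queued */
    bool stopped;           /* models a stopped netdev subqueue */
    unsigned long dropped;  /* packets lost to a full ring */
};

static void model_init(struct model_queue *q, size_t size)
{
    assert(size > 0 && size <= RING_MAX);
    q->size = size;
    q->head = 0;
    q->count = 0;
    q->stopped = false;
    q->dropped = 0;
}

/* Models the transmit side: queue the packet, stop the queue once full. */
static bool model_xmit(struct model_queue *q, int pkt)
{
    if (q->count == q->size) {
        /* Insertion failed: fall back to the old dropping behavior. */
        q->dropped++;
        return false;
    }
    q->slots[(q->head + q->count) % q->size] = pkt;
    q->count++;
    if (q->count == q->size)
        q->stopped = true;  /* ring became full after this insertion */
    return true;
}

/* Models the receive side: consume one packet, wake the queue when empty. */
static bool model_recv(struct model_queue *q, int *pkt)
{
    if (q->count == 0) {
        /* Nothing to read: wake a stopped queue so the sender can progress. */
        q->stopped = false;
        return false;
    }
    *pkt = q->slots[q->head];
    q->head = (q->head + 1) % q->size;
    q->count--;
    if (q->count == 0)
        q->stopped = false; /* ring drained: wake the sender */
    return true;
}

/* Sends npkts packets through the queue; returns how many were delivered. */
static unsigned long model_burst(struct model_queue *q, unsigned long npkts)
{
    unsigned long sent = 0, delivered = 0;
    int pkt;

    while (sent < npkts || q->count > 0) {
        /* The sender only transmits while the queue is not stopped. */
        while (sent < npkts && !q->stopped) {
            model_xmit(q, (int)sent);
            sent++;
        }
        /* The reader drains the ring, which wakes the queue again. */
        while (model_recv(q, &pkt))
            delivered++;
    }
    return delivered;
}
```

In this model, a sender that respects the stopped flag never hits the drop path, even with a ring size of 1, because the queue is stopped by the same insertion that fills the ring. Calling model_xmit directly on a full ring still increments the drop counter, mirroring the fallback for a failed insertion.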
