Message-ID: <7d221a07-8358-4c0b-a09c-3b029c052245@smile.fr>
Date: Fri, 17 Nov 2023 15:06:31 +0100
From: Yoann Congal <yoann.congal@...le.fr>
To: netdev@...r.kernel.org
Cc: Florent CARLI <fcarli@...il.com>
Subject: hsr drop frames when received out-of-order (?)
Hi netdev,
We are looking into the hsr module to create a fault-tolerant, "soft real-time" network.
But when experimenting, we noticed that enabling HSRv1 actually *increased* frame drops under high load.
For example, with iperf at 800Mbit/s on Gigabit links:
* Without hsr, no drop
* With hsr, 0.05% drop
I'm using a recent 6.5-rt Debian kernel.
I created a script[0] to reproduce this, using iperf3 and veth pairs, inspired by the in-tree hsr selftest.
* First iperf3 with in-order frames : no drop
* Second iperf3 with .1% out-of-order frames : ~2% drops
After investigation, it looks like the hsr module (HSRv1) drops frames it receives out of order.
Here is my understanding of how the current hsr module works in HSRv1:
To avoid creating frame loops, the hsr module only forwards frames it has not seen before.
This is implemented in net/hsr/hsr_forward.c:hsr_forward_do().
The "not seen before" check is in net/hsr/hsr_framereg.c:hsr_register_frame_out(): it stores only the highest sequence number it has seen pass through the interface.
Here is a simplified example of what I've observed:
* 2 hosts ("local" and "remote") are connected through an HSR pair to form the simplest loop
* Local host has a hsr0 interface based on hsr0A and hsr0B interfaces.
* Remote sends Frame1 then Frame2 to hsr0A and hsr0B.
* For reasons I'll list later, local host sees Frame2 on hsr0A first.
* hsr_forward will forward Frame2 to hsr0 (toward userland) and to hsr0B (to make the HSR ring)
* Then, when Frame1 is received on either hsr0A or hsr0B, it is forwarded nowhere because a more recent frame (Frame2) has already been seen on those interfaces. It is effectively dropped and never seen by userland.
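The scenario above can be reproduced with a toy model. This is only a sketch of the behaviour as described in this mail, not the kernel code: the class name HighestSeqFilter, the should_drop() method and the port name "hsr0" are made up for illustration, and the 16-bit wrap-around comparison is an assumption about how sequence numbers are compared.

```python
SEQ_MOD = 1 << 16  # assumption: HSR sequence numbers are 16-bit and wrap


def seq_before_or_eq(a, b):
    """True if a is at or before b in wrapping 16-bit sequence space."""
    return ((b - a) % SEQ_MOD) < SEQ_MOD // 2


class HighestSeqFilter:
    """Hypothetical model: remember only the highest sequence number
    already forwarded to each output port."""

    def __init__(self):
        self.seq_out = {}  # output port name -> highest seq forwarded

    def should_drop(self, port, seq):
        last = self.seq_out.get(port)
        if last is not None and seq_before_or_eq(seq, last):
            return True  # treated as "already seen", even if it never was
        self.seq_out[port] = seq
        return False


node = HighestSeqFilter()
frame2_dropped = node.should_drop("hsr0", 2)  # Frame2 arrives first
frame1_dropped = node.should_drop("hsr0", 1)  # Frame1 arrives late
print(frame2_dropped, frame1_dropped)
```

In this model the late Frame1 is indistinguishable from a duplicate, which matches the drops observed with iperf3.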
Out-of-order frames may seem a rare event on direct Ethernet connections, but what I've seen is that the re-ordering can happen on the host:
* MSI-X: it "load-balances" the IRQs: an early frame received on a busy CPU may be seen after a later frame received on a not-busy CPU.
* Interrupt Throttling: in conjunction with some frame drop, Frame2 is seen on hsr0B before Frame1 is seen on hsr0A (because hsr0A was IRQ-throttled)
BTW, I did not investigate as much, but we also tried the hsr module in PRP mode and also noticed drops.
Can anyone here confirm this analysis?
My idea to fix this would be to improve hsr_register_frame_out() to register the N most recent frames instead of just the last one.
* Do you think that would work and could be merged?
* How hard would it be to implement? (I guess parallelism will be an issue since we do not want to take a mutex/lock on this path)
* Is anyone secretly working on a fix with a better approach?
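To make the "recent N frames" idea concrete, here is a minimal single-threaded sketch using a sliding bitmap window, similar in spirit to an anti-replay window. Everything in it is an assumption for discussion: the class name WindowSeqFilter, the default window of 64, and the choice to drop frames older than the window. It deliberately ignores the locking question raised above.

```python
SEQ_MOD = 1 << 16  # assumption: 16-bit wrapping sequence numbers


class WindowSeqFilter:
    """Hypothetical model: remember which of the last `window`
    sequence numbers were already forwarded."""

    def __init__(self, window=64):
        self.window = window
        self.highest = None  # highest sequence number seen so far
        self.seen = 0        # bit i set => (highest - i) was seen

    def should_drop(self, seq):
        if self.highest is None:
            self.highest, self.seen = seq, 1
            return False
        ahead = (seq - self.highest) % SEQ_MOD
        if 0 < ahead < SEQ_MOD // 2:
            # Newer frame: slide the window forward and mark it seen.
            mask = (1 << self.window) - 1
            self.seen = ((self.seen << ahead) | 1) & mask
            self.highest = seq
            return False
        behind = (self.highest - seq) % SEQ_MOD
        if behind >= self.window:
            return True  # older than the window: cannot tell, so drop
        bit = 1 << behind
        if self.seen & bit:
            return True  # genuine duplicate
        self.seen |= bit  # late but new: accept it
        return False


node = WindowSeqFilter()
results = [node.should_drop(s) for s in (2, 1, 1, 2)]
print(results)
```

With this filter the late Frame1 is accepted once, while the duplicate copies of Frame1 and Frame2 are still dropped. The window size trades memory per node against how much re-ordering is tolerated.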
Thanks for reading!
[0]: https://gist.github.com/ycongal-smile/7b42472669e83025f106c185c5159ca3#file-hsr_loop_test-sh
Regards,
--
Yoann Congal
Smile ECS - Tech Expert