netdev - Re: [PATCH net-next v2 6/9] hsr: Implement more robust duplicate discard for HSR

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aXuIAFJ_m6gMUFFO@thinkpad>
Date: Thu, 29 Jan 2026 17:17:04 +0100
From: Felix Maurer <fmaurer@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
	jkarrenpalo@...il.com, tglx@...utronix.de, mingo@...nel.org,
	allison.henderson@...cle.com, petrm@...dia.com, antonio@...nvpn.net,
	Yoann Congal <yoann.congal@...le.fr>
Subject: Re: [PATCH net-next v2 6/9] hsr: Implement more robust duplicate
 discard for HSR

On Thu, Jan 29, 2026 at 03:43:48PM +0100, Sebastian Andrzej Siewior wrote:
> On 2026-01-22 15:57:01 [+0100], Felix Maurer wrote:
> …
> > As the problem (we accidentally skip over a sequence number that has not
> > been received but will be received in the future) is similar to PRP, we can
> > apply a similar solution. The duplicate discard algorithm based on the
> > "sparse bitmap" works well for HSR if it is extended to track one bitmap
> > for each port (A, B, master, interlink). To do this, change the sequence
> > number blocks to contain a flexible array member as the last member that
> > can keep chunks for as many bitmaps as we need. This design makes it easy
> > to reuse the same algorithm in a potential PRP RedBox implementation.
>
> I know you just "copy" the logic based on what we have now but…
> Why do we have to track the sequence number for A, B and interlink? The
> 'master' port is what we feed into the stack so this needs to be
> de-duplicated. I am not sure how 'interlink' works so I keep quiet here.
> But A and B? There shouldn't be any duplicates on A and B unless the
> destination node forwards the node. Or do I miss something?
> I'm bringing this up because limiting to one (or two since I am unsure
> about interlink) would save some memory and avoid needless updates. And
> if you have HW-offloading enabled then you shouldn't see any packets
> which are not directed to _this_ node.

About the interlink: that's the interface where you attach devices that
know nothing about HSR, i.e., when we are a RedBox. I consider it very
similar to the master port, it's our responsibility to de-duplicate what
we send out there.

I was thinking about exactly this while working on the patch as well and
I came to each conclusion (A,B are needed vs. are not needed) at least
once. In the end, I think we will need it. It's right that a well
behaving node in the ring should not forward "frames for which the node
is the unique destination" (5.3.2.1). But there could be frames that
have no unique destination in the ring at all: multicast frames or
frames addressed to a non-existing MAC address. We should not forward
such frames either to prevent them from looping forever.

Now, such frames should probably only reach back to us, if we sent them
(either from our stack or from the interlink port). We could track
sequence numbers sent to master and interlink (for de-duplication) and
sent by master and interlink (for loop prevention) for more clarity, but
then we're back at four bitmaps again.

I agree that we can and should optimize the HW-offloaded case. I'd
suggest to do that in a separate patchset, though, partly because I
don't have access to a hardware with HSR offload at the moment.

> …
> > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> > index 70b3819e9298..c9bd25e6a7cc 100644
> > --- a/net/hsr/hsr_framereg.c
> > +++ b/net/hsr/hsr_framereg.c
> > @@ -805,6 +774,39 @@ void *hsr_get_next_node(struct hsr_priv *hsr, void *_pos,
> >  	return NULL;
> >  }
> >
> > +/* Fill the last sequence number that has been received from node on if1 by
> > + * finding the last sequence number sent on port B; accordingly get the last
> > + * received sequence number for if2 using sent sequence numbers on port A.
> > + */
> > +static void fill_last_seq_nrs(struct hsr_node *node, u16 *if1_seq, u16 *if2_seq)
> > +{
> > +	struct hsr_seq_block *block;
> > +	unsigned int block_off;
> > +	size_t block_sz;
> > +	u16 seq_bit;
> > +
> > +	spin_lock_bh(&node->seq_out_lock);
> > +
> > +	// Get last inserted block
> > +	block_off = (node->next_block - 1) & (HSR_MAX_SEQ_BLOCKS - 1);
> > +	block_sz = hsr_seq_block_size(node);
> > +	block = node->block_buf + block_off * block_sz;
> > +
> > +	if (!bitmap_empty(block->seq_nrs[HSR_PT_SLAVE_B - 1],
> > +			  HSR_SEQ_BLOCK_SIZE)) {
> > +		seq_bit = find_last_bit(block->seq_nrs[HSR_PT_SLAVE_B - 1],
> > +					HSR_SEQ_BLOCK_SIZE);
> > +		*if1_seq = (block->block_idx << HSR_SEQ_BLOCK_SHIFT) | seq_bit;
> > +	}
> > +	if (!bitmap_empty(block->seq_nrs[HSR_PT_SLAVE_A - 1],
> > +			  HSR_SEQ_BLOCK_SIZE)) {
> > +		seq_bit = find_last_bit(block->seq_nrs[HSR_PT_SLAVE_A - 1],
> > +					HSR_SEQ_BLOCK_SIZE);
> > +		*if2_seq = (block->block_idx << HSR_SEQ_BLOCK_SHIFT) | seq_bit;
> > +	}
> > +	spin_unlock_bh(&node->seq_out_lock);
>
> I know you preserve what is already here but what is this even used for?
> | ip -d link show dev hsr0
>
> does not show these numbers. It shows the sequence number of the hsr0
> interface which I understand.
> But then it is also possible to show the last received sequence number
> of any node on either of the two interfaces?

This is only exposed through the HSR_C_GET_NODE_STATUS netlink command
for the "HSR" (sic!) netlink family. I'm not aware of a userspace tool
that actually shows this data. It's unclear if it is of any real use.

I assume that the two sequence numbers are reported because they would
in theory allow to detect a broken ring: if one of the numbers stops
increasing or the numbers generally diverge, some link in the ring is
broken. If we have traffic from multiple nodes, we could even detect
where the ring is broken. That's likely also the reason for the weird
cross-assinment of B->1, A->2. What we send out on B has to have arrived
on port A resp. 1 before.

We can discuss removing this stuff in the future, but I'd prefer not to
touch userspace at this time.

> > +}
> > +
> >  int hsr_get_node_data(struct hsr_priv *hsr,
> >  		      const unsigned char *addr,
> >  		      unsigned char addr_b[ETH_ALEN],
> …
> > --- a/net/hsr/hsr_framereg.h
> > +++ b/net/hsr/hsr_framereg.h
> > @@ -82,15 +82,18 @@ int prp_register_frame_out(struct hsr_port *port, struct hsr_frame_info *frame);
> >  #define hsr_seq_block_index(sequence_nr) ((sequence_nr) >> HSR_SEQ_BLOCK_SHIFT)
> >  #define hsr_seq_block_bit(sequence_nr) ((sequence_nr) & HSR_SEQ_BLOCK_MASK)
> >
> > +#define DECLARE_BITMAP_FLEX_ARRAY(name, bits) \
> > +	unsigned long name[][BITS_TO_LONGS(bits)]
> > +
> >  struct hsr_seq_block {
> >  	unsigned long		time;
> >  	u16			block_idx;
> > -	DECLARE_BITMAP(seq_nrs, HSR_SEQ_BLOCK_SIZE);
> > +	DECLARE_BITMAP_FLEX_ARRAY(seq_nrs, HSR_SEQ_BLOCK_SIZE);
>
> is there a story behind DECLARE_BITMAP_FLEX_ARRAY()? We have just this
> one user.

I've added it this way to make it obvious that almost all of it is the
same as DECLARE_BITMAP(). But probably it's better to get rid of the
macro, add the definition directly to the struct, and maybe add a
comment that it's supposed to be similar to what DECLARE_BITMAP()
produces. What do you think?

Thanks for your comments,
   Felix


> >  };
>
> Sebastian