Message-ID: <aXt9GIlea_AUvHX9@thinkpad>
Date: Thu, 29 Jan 2026 16:30:32 +0100
From: Felix Maurer <fmaurer@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
jkarrenpalo@...il.com, tglx@...utronix.de, mingo@...nel.org,
allison.henderson@...cle.com, petrm@...dia.com, antonio@...nvpn.net,
Steffen Lindner <steffen.lindner@...abb.com>
Subject: Re: [PATCH net-next v2 4/9] hsr: Implement more robust duplicate
discard for PRP
On Thu, Jan 29, 2026 at 02:29:40PM +0100, Sebastian Andrzej Siewior wrote:
> On 2026-01-22 15:56:59 [+0100], Felix Maurer wrote:
> > The PRP duplicate discard algorithm does not work reliably with certain
> …
> > The idea of the new algorithm is to store the seen sequence numbers in a
> > bitmap. To keep the size of the bitmap in control, we store it as a "sparse
> > bitmap" where the bitmap is split into blocks and not all blocks exist at
> > the same time. The sparse bitmap is implemented using an xarray that keeps
> > the references to the individual blocks and a backing ring buffer that
> …
>
> My idea was to keep track of say the last 64 sequence numbers but I had
> no idea how to efficiently implement it with all the "forget" parts and
> so on. What you did here is quite nice.
Thank you!
> > @@ -280,6 +301,59 @@ struct hsr_node *hsr_get_node(struct hsr_port *port, struct list_head *node_db,
> …
> > +/* Get the currently active sequence number block. If there is no block yet, or
> > + * the existing one is expired, a new block is created. The idea is to maintain
> > + * a "sparse bitmap" where a bitmap for the whole sequence number space is
> > + * split into blocks and not all blocks exist all the time. The blocks can
> > + * expire after time (in low traffic situations) or when they are replaced in
> > + * the backing fixed size buffer (in high traffic situations).
> HSR_MAX_SEQ_BLOCKS,
> > + */
> > +static struct hsr_seq_block *hsr_get_seq_block(struct hsr_node *node,
> > + u16 block_idx)
> > +{
> > + struct hsr_seq_block *block, *res;
> > +
> > + block = xa_load(&node->seq_blocks, block_idx);
> > +
> > + if (block && hsr_seq_block_is_old(block)) {
> > + hsr_forget_seq_block(node, block);
> > + block = NULL;
> > + }
> > +
> > + if (!block) {
> > + block = &node->block_buf[node->next_block];
> > + hsr_forget_seq_block(node, block);
> > +
> > + memset(block, 0, sizeof(*block));
> > + block->time = jiffies;
>
> So we assign ->time here while the block is created and it expires after
> HSR_ENTRY_FORGET_TIME (400ms). *Maybe* it should be updated in
> hsr_check_duplicate() if we set a bit. The difference would be that the
> last sequence in the block would get its whole lifetime while now the
> lifetime starts with the first entry in the block. The downside of
> course would be that the first entry gets more than the initially
> expected lifetime.
> Not sure if this matters in reality, just wanted to mention.
I considered that as well. But given that the standard states that
"since the 16-bit SeqNr wraps around after 65 536 frames [...], entries
older than a specified time EntryForgetTime are purged" (IEC
62439-3:2021, 4.1.10.3) and later more explicitly that EntryForgetTime
is the "maximum time an entry may reside in the duplicate table" (4.4),
I decided that we should rather go the safe way here and use the
lifetime of the first entry.
One can definitely construct a scenario where using the lifetime of the
last entry is problematic: let's say we have a block with a single
entry. If the sequence number wraps around and we reach the same block
again after, let's say, 350ms but hit a different bit in the block, the
new lifetime would apply to both entries, which would almost double the
lifetime of the initial entry.
The large span of 128 entries per timestamp could be mitigated by
changing the size (and maybe the number) of blocks. The 64 blocks with
128 entries each are a somewhat arbitrary pick, but they make a good
tradeoff in my opinion. I'm making that tradeoff between memory usage,
timestamp inaccuracy, and the link delay we can deal with (with
64 * 128 = 2^13 entries out of the 2^16 sequence number space, i.e.,
1/8th of the space, that's 5-6ms on a 1Gbit/s network where sequence
numbers could repeat every 44ms according to the standard).
> > + block->block_idx = block_idx;
> > +
> > + res = xa_store(&node->seq_blocks, block_idx, block, GFP_ATOMIC);
> > + if (xa_is_err(res))
> > + return NULL;
> > +
> > + node->next_block =
> > + (node->next_block + 1) & (HSR_MAX_SEQ_BLOCKS - 1);
> > + }
> > +
> > + return block;
> > +}
> > +
> > /* Use the Supervision frame's info about an eventual macaddress_B for merging
> > * nodes that has previously had their macaddress_B registered as a separate
> > * node.
> > @@ -545,47 +631,26 @@ int prp_register_frame_out(struct hsr_port *port, struct hsr_frame_info *frame)
> …
> > +
> > + seq_bit = hsr_seq_block_bit(sequence_nr);
> > + if (test_and_set_bit(seq_bit, block->seq_nrs))
>
> the access happens under ::seq_out_lock so you don't need to be atomic
> here. You could save a cycle and use __test_and_set_bit() instead.
Good point, I tried to get rid of the lock at some point, but then
decided that it would be material for another patchset. I'll change it
to __test_and_set_bit().
Thanks,
Felix
> > + goto out_seen;
> >
> > node->time_out[port->type] = jiffies;
> > node->seq_out[port->type] = sequence_nr;
> > +out_new:
> > spin_unlock_bh(&node->seq_out_lock);
> > return 0;
> > +
> > +out_seen:
> > + spin_unlock_bh(&node->seq_out_lock);
> > + return 1;
> > }
> >
> > #if IS_MODULE(CONFIG_PRP_DUP_DISCARD_KUNIT_TEST)
>
> Sebastian