Message-ID: <CANn89iLQxH0H_cPcZnxO9ni73ncmbbhx3knzRB2swTsx=J-Fmg@mail.gmail.com>
Date: Thu, 5 Sep 2024 15:26:27 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org, eric.dumazet@...il.com,
syzbot <syzkaller@...glegroups.com>
Subject: Re: [PATCH net] net: hsr: remove seqnr_lock
On Thu, Sep 5, 2024 at 3:18 PM Sebastian Andrzej Siewior
<bigeasy@...utronix.de> wrote:
>
> On 2024-09-05 14:26:30 [+0200], Eric Dumazet wrote:
> > On Thu, Sep 5, 2024 at 2:17 PM Sebastian Andrzej Siewior
> > <bigeasy@...utronix.de> wrote:
> > >
> > > On 2024-09-04 13:37:25 [+0000], Eric Dumazet wrote:
> > > > syzbot found a new splat [1].
> > > >
> > > > Instead of adding yet another spin_lock_bh(&hsr->seqnr_lock) /
> > > > spin_unlock_bh(&hsr->seqnr_lock) pair, remove seqnr_lock
> > > > and use atomic_t for hsr->sequence_nr and hsr->sup_sequence_nr.
> > > >
> > > > This also avoids a race in hsr_fill_info().
> > >
> > > You obtain the sequence nr without locking so two CPUs could submit skbs
> > > at the same time. Wouldn't this allow the race I described in commit
> > > 06afd2c31d338 ("hsr: Synchronize sending frames to have always incremented outgoing seq nr.")
> > >
> > > to happen again? Then one skb would be dropped while sending because it
> > > has lower sequence nr but in fact it was not yet sent.
> > >
> >
> > A network protocol unable to cope with reorders can not live really.
> >
> > If this is an issue, this should be fixed at the receiving side.
>
> The standard/network protocol just says there has to be a seq nr to avoid
> processing a packet multiple times. The Linux implementation increments
> the counter and assumes everything lower than the current counter has
> already been seen. Therefore the sequence lock is held during the entire
> sending process.
> This is not a protocol but implementation issue ;)

These packets are sent on a physical network, reorders are inevitable.

> I am aware of an FPGA implementation of HSR which tracks the last 20
> sequence numbers instead. This would help because it would allow
> reorders to happen.
>
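
To make the receive-side idea concrete, here is a minimal userspace sketch of duplicate detection that tolerates reordering by remembering a window of recent sequence numbers. It is not taken from net/hsr: `struct seq_window`, `seq_window_accept()` and the 32-entry window are hypothetical names and values chosen for illustration, and the 16-bit counter is assumed to wrap.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_WINDOW 32 /* hypothetical window size, not from the HSR code */

struct seq_window {
	uint16_t top;     /* highest sequence number accepted so far */
	uint32_t seen;    /* bit i set => (top - i) was already accepted */
	bool     started; /* false until the first frame arrives */
};

/*
 * Returns true if seq has not been seen and should be processed,
 * false if it is a duplicate or too old to be tracked.
 */
static bool seq_window_accept(struct seq_window *w, uint16_t seq)
{
	uint16_t ahead, behind;

	if (!w->started) {
		w->started = true;
		w->top = seq;
		w->seen = 1;
		return true;
	}

	/* 16-bit modular distance: small positive value means "newer". */
	ahead = (uint16_t)(seq - w->top);
	if (ahead != 0 && ahead < 0x8000) {
		/* Slide the window forward; avoid shifting by >= 32 bits. */
		w->seen = (ahead >= SEQ_WINDOW) ? 0 : (w->seen << ahead);
		w->seen |= 1;
		w->top = seq;
		return true;
	}

	behind = (uint16_t)(w->top - seq);
	if (behind >= SEQ_WINDOW)
		return false; /* older than the window can track */
	if (w->seen & (UINT32_C(1) << behind))
		return false; /* duplicate */

	/* Late but new: a reordered frame inside the window is accepted. */
	w->seen |= UINT32_C(1) << behind;
	return true;
}
```

With this scheme a frame carrying sequence number 11 that arrives after 12 is still accepted once, whereas a "drop everything lower than the current counter" check would discard it.
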
> Looking at it, the code chain never held the lock while I was playing
> with it and I did not see this. So this might be just a consequence of
> using gro here. I don't remember disabling it so it must have been off by
> default or syzbot found a way to enable it (or has better hardware).
>
> Would it make sense to disable this for HSR interfaces?

This has nothing to do with GRO.

Look at this alternative patch, perhaps you will see the problem?

diff --git a/net/hsr/hsr_slave.c b/net/hsr/hsr_slave.c
index af6cf64a00e081c777db5f7786e8a27ea6f62e14..3971dbc0644ab8d32c04c262dbba7b1c950ebea9 100644
--- a/net/hsr/hsr_slave.c
+++ b/net/hsr/hsr_slave.c
@@ -67,7 +67,9 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb)
 	skb_set_network_header(skb, ETH_HLEN + HSR_HLEN);
 	skb_reset_mac_len(skb);
 
+	spin_lock_bh(&hsr->seqnr_lock);
 	hsr_forward_skb(skb, port);
+	spin_unlock_bh(&hsr->seqnr_lock);
 
 finish_consume:
 	return RX_HANDLER_CONSUMED;

I am surprised we even have a discussion considering HSR has Orphan
status in MAINTAINERS...

I do not know how to test HSR, I am not sure the alternative patch is correct.

Removing the seqnr_lock seems the safest to me.
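
As a rough userspace analogue of replacing the lock with an atomic counter, the sketch below hands out sequence numbers with C11 atomics. It is an illustration under assumptions, not the kernel patch: `struct seq_counter` and `seq_counter_next()` are hypothetical names, and the kernel's atomic_t API is not used here.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-device sequence counter. */
struct seq_counter {
	atomic_uint next;
};

/*
 * Hand out the next 16-bit sequence number without taking a lock.
 * Each caller gets a unique value; the order in which the resulting
 * frames reach the wire is NOT guaranteed to match the numbering.
 */
static uint16_t seq_counter_next(struct seq_counter *c)
{
	unsigned int v = atomic_fetch_add_explicit(&c->next, 1,
						   memory_order_relaxed);

	return (uint16_t)v; /* truncation gives the 16-bit wraparound */
}
```

This keeps allocation race-free, but two CPUs can still transmit their frames in either order, so a receiver has to cope with reordering.
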