Message-ID: <9cc66694-6fcd-4460-9bce-cdbcb0153a89@gmail.com>
Date: Mon, 6 Oct 2025 11:43:02 -0400
From: "Huang, Joseph" <joseph.huang.at.garmin@...il.com>
To: Linus Lüssing <linus.luessing@...3.blue>,
Ido Schimmel <idosch@...dia.com>
Cc: Joseph Huang <Joseph.Huang@...min.com>, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, Andrew Lunn <andrew+netdev@...n.ch>,
Nikolay Aleksandrov <razor@...ckwall.org>, David Ahern <dsahern@...nel.org>,
Stanislav Fomichev <sdf@...ichev.me>, Kuniyuki Iwashima <kuniyu@...gle.com>,
Ahmed Zaki <ahmed.zaki@...el.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
linux-kernel@...r.kernel.org, bridge@...ts.linux.dev
Subject: Re: [PATCH net] net: bridge: Trigger host query on v6 addr valid
On 10/4/2025 10:27 AM, Linus Lüssing wrote:
> On Wed, Sep 17, 2025 at 02:30:51PM +0300, Ido Schimmel wrote:
>> But before making changes, I want to better understand the problem you
>> are seeing. Is it specific to the offloaded data path? I believe the
>> problem was fixed in the software data path by this commit:
>
> Two issues I noticed recently, even without any hardware switch
> offloading, on plain soft bridges:
>
> 1) (Probably not the issue here? But just to avoid that this
> causes additional confusion:) we don't seem to properly converge to
> the lowest MAC address, which is a bug, a violation of the RFCs.
>
> If we received an IGMP/MLD query from a foreign host with an
> address like fe80::2 and selected it and then enable our own
> multicast querier with a lower address like fe80::1 on our bridge
> interface for example then we won't send our queries, won't reelect
> ourself. If I recall correctly. (Not too critical though, as at least we
> have a querier on the link. But I find the election code a bit
> confusing and I wouldn't dare to touch it without adding some tests.)
>
I agree that there might be some corner cases which the current election
code does not handle very well (one of them is outlined below).
> 2) Without Ido's suggested workaround when the bridge multicast snooping
> + querier is enabled before the IPv6 DAD has taken place then our
> first IGMP/MLD query will fizzle, not be transmitted.
This (#2) is what this patch is trying to address. With DAD enabled, the
first MLD Query is never transmitted. That essentially means that the
Robustness Variable is 1 (which is not very robust).
> However (at least for a non-hardware-offloaded) bridge as far as I
> recall this shouldn't create any multicast packet loss and should
> operate as "normal" with flooding multicast data packets first,
> with multicast snooping activating on multicast data
> after another IGMP/MLD querier interval has elapsed (default:
> 125 sec.)?
>
Some systems cannot afford to have multicast traffic flooded to them.
Think of resource-constrained low-power sensors connected to a network
carrying high-volume multicast video traffic, for example. The flooded
traffic could easily choke the sensors and is essentially a DDoS attack.
> Which indeed could be optimized and is confusing, this delay could
> be avoided. Is that the issue you mean, Joseph?
> (I'd consider it more an optimization, so for net-next, not
> net though.)
>
I'm not sure this should be categorized as an optimization. If we never
intended to send Startup Queries, that would be a different story. But if
we intend to send them and fail to, I think that should be treated as a bug.
>> In current implementation, :: always wins the election
>
> That would be news to me.
>
> RFC2710, section 5:
>
> To be valid, the Query message MUST come from a link-
> local IPv6 Source Address
>
> RFC3810, section 5.1.14, is even more explicit:
>
> 5.1.14. Source Addresses for Queries
>
> All MLDv2 Queries MUST be sent with a valid IPv6 link-local source
> address. If a node (router or host) receives a Query message with
> the IPv6 Source Address set to the unspecified address (::), or any
> other address that is not a valid IPv6 link-local address, it MUST
> silently discard the message and SHOULD log a warning.
>
> So :: can't be used as a source address for an MLD query.
> And since 2014 with "bridge: multicast: add sanity check for query source addresses"
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6565b9eeef194afbb3beec80d6dd2447f4091f8c)
> we should be adhering to that requirement? Let me know if I'm missing
> something.
>
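For reference, the source-address rule from RFC 3810, section 5.1.14 quoted above can be sketched in user-space C. This is only an illustration, not the kernel's implementation; the function names is_link_local() and mld_query_source_ok() are made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* True when addr lies in fe80::/10, the IPv6 link-local unicast prefix. */
bool is_link_local(const struct in6_addr *addr)
{
	return addr->s6_addr[0] == 0xfe && (addr->s6_addr[1] & 0xc0) == 0x80;
}

/*
 * Hypothetical helper: would a Query with this textual source address
 * pass the link-local rule? Unparsable input is treated as rejected.
 */
bool mld_query_source_ok(const char *text)
{
	struct in6_addr src;

	if (inet_pton(AF_INET6, text, &src) != 1)
		return false;
	return is_link_local(&src);
}
```

Under this rule a source of fe80::1 is accepted, while :: and a global address such as 2001:db8::1 are rejected.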
This is what I meant by ":: always wins":
In br_multicast_select_querier(),

	if (ipv6_addr_cmp(&saddr->src.ip6, &querier->addr.src.ip6) <= 0)
		goto update;

If querier->addr.src.ip6 is 0, nothing can be less than that, so "::
always wins".
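A minimal user-space sketch of just that comparison, to make the point concrete. It assumes ipv6_addr_cmp() behaves like a bytewise memcmp() of the two addresses, and it models only the quoted check, not the rest of the election code. The names addr_cmp() and candidate_replaces_querier() are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for ipv6_addr_cmp(): bytewise compare of two 128-bit addresses. */
int addr_cmp(const struct in6_addr *a, const struct in6_addr *b)
{
	return memcmp(a, b, sizeof(*a));
}

/*
 * Hypothetical helper: true when the quoted check would "goto update",
 * i.e. the candidate source address is <= the stored querier address.
 * Unparsable input is treated as "does not replace".
 */
bool candidate_replaces_querier(const char *candidate, const char *stored)
{
	struct in6_addr c, s;

	if (inet_pton(AF_INET6, candidate, &c) != 1 ||
	    inet_pton(AF_INET6, stored, &s) != 1)
		return false;
	return addr_cmp(&c, &s) <= 0;
}
```

With the stored address at ::, candidate_replaces_querier("fe80::1", "::") is false, whereas against a stored fe80::2 the same candidate is accepted.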
However,
1. querier->addr.src.ip6 is (un)initialized(?) to 0 (I couldn't find the
place where ip6_querier.addr is initialized)
2. Querier election cannot take place due to the comparison above, until
the bridge selects itself first via br_multicast_select_own_querier()
3. the bridge only selects itself after the first successful Query is
sent to the host
4. br_ip6_multicast_alloc_query() will fail if v6 address is not valid
So, without this patch a system would have to wait for
31.25 seconds (for the second Query to the host, after which the bridge
selects itself) +
~125 seconds (for the next Query from the real Querier to arrive)
in order to receive multicast traffic. For some embedded devices that's
a very long time (imagine turning on a TV and having to wait two and a
half minutes before it starts working).
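The arithmetic behind that estimate, as a small sketch. It assumes a 125-second query interval and a startup query interval of one quarter of that, which matches the 31.25 and ~125 second figures above; the macro and function names are made up for this example.

```c
/* Assumed defaults, in seconds (not read from any running bridge). */
#define QUERY_INTERVAL_SEC         125.0
#define STARTUP_QUERY_INTERVAL_SEC (QUERY_INTERVAL_SEC / 4.0)

/*
 * Approximate wait before multicast is delivered when the first
 * startup Query is lost: one startup interval until the second Query
 * (after which the bridge selects itself), plus one full query
 * interval until the real Querier's next Query arrives.
 */
double worst_case_wait_sec(void)
{
	return STARTUP_QUERY_INTERVAL_SEC + QUERY_INTERVAL_SEC;
}
```

That comes to 156.25 seconds, a little over two and a half minutes.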
Thanks,
Joseph
> For IPv4 and 0.0.0.0 this is a different story though... I'm not
> aware of a requirement in RFCs to avoid 0.0.0.0 in IGMP
> queries. And "intuitively" one would prefer 0.0.0.0 to be the
> least preferred querier address. But when taking the IGMP RFCs
> literally then 0.0.0.0 would be the lowest one and always win... And RFC4541
> unfortunately does not clarify the use of 0.0.0.0 for IGMP queries.
> Not quite sure what the common practice among other layer 2 multicast
> snooping implementations across other vendors is.
>
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0888d5f3c0f183ea6177355752ada433d370ac89
>>
>> And Linus is working [1][2] on reflecting it to device drivers so that
>> the hardware data path will act like the software data path and flood
>> unregistered multicast traffic to all the ports as long as no querier
>> was detected.
>
> Right, for hardware offloading bridges/switches I'm on it, next
> revision shouldn't take much longer...
>
> Regards, Linus