Message-ID: <3dceb664-0dd5-d46b-2431-b235cbd7752f@quietfountain.com>
Date: Tue, 11 Jul 2023 15:22:26 -0500
From: Harry Coin <hcoin@...etfountain.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: netdev@...r.kernel.org
Subject: Re: llc needs namespace awareness asap, was Re: Patch fixing STP if
bridge in non-default namespace.
On 7/11/23 13:32, Kuniyuki Iwashima wrote:
> From: Harry Coin <hcoin@...etfountain.com>
> Date: Tue, 11 Jul 2023 12:08:15 -0500
>> On 7/10/23 22:22, Kuniyuki Iwashima wrote:
>>> From: Harry Coin <hcoin@...etfountain.com>
>>> Date: Mon, 10 Jul 2023 08:35:08 -0500
>>>> Note that without access to link-level multicast address 01:80:C2:00:00:00,
>>>> the STP loop-avoidance feature of bridges fails silently, leading to
>>>> packet storms if loops exist in the related L2. The Linux kernel's
>>>> latest code silently drops BPDU STP packets if the bridge is in a
>>>> non-default namespace.
>>>>
>>>> The current llc_rcv(), around line 166 in net/llc/llc_input.c, has
>>>>
>>>>     if (!net_eq(dev_net(dev), &init_net))
>>>>         goto drop;
>>>>
>>>> Which, when commented out, fixes this bug. A search on &init_net may
>>>> reveal many similar artifacts left over from the early days of namespace
>>>> implementation.
>>> I think just removing the part is not sufficient and will introduce a bug
>>> in another place.
>>>
>>> As you found, llc has the same test in another place. For example, when
>>> you create an AF_LLC socket, it has to be in the root netns. But if you
>>> remove the test in llc_rcv() only, it seems llc_rcv() would put an skb for
>>> a child netns into the recv queue of an sk that is in the default netns.
>>>
>>>   - llc_rcv
>>>     - if (!net_eq(dev_net(dev), &init_net))
>>>       - goto drop
>>>     - sap_handler / llc_sap_handler
>>>       - sk = llc_lookup_dgram
>>>       - llc_sap_rcv
>>>         - llc_sap_state_process
>>>           - sock_queue_rcv_skb
>>>
>>> So, we need to namespacify the whole llc infra.
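To make the concern in the call chain above concrete, here is a small userspace model, offered only as a sketch and not as kernel code. Every name in it (`net_model`, `sock_model`, `frame_model`, `lookup_sap_only`, `lookup_sap_and_net`) is a hypothetical stand-in and does not correspond to the real llc data structures. It assumes a socket is found by matching the destination SAP, and contrasts a lookup that ignores the namespace with one that also compares it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
struct net_model   { int id; };
struct sock_model  { const struct net_model *net; unsigned char sap; };
struct frame_model { const struct net_model *dev_net; unsigned char dsap; };

/* Same idea as net_eq(): namespaces are equal if they are the same object. */
bool net_eq_model(const struct net_model *a, const struct net_model *b)
{
    return a == b;
}

/* SAP-only lookup: roughly what delivery would look like if only the
 * init_net test were removed and nothing else changed. */
const struct sock_model *lookup_sap_only(const struct sock_model *tbl,
                                         size_t n,
                                         const struct frame_model *f)
{
    assert(tbl != NULL && f != NULL);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].sap == f->dsap)
            return &tbl[i];
    return NULL;
}

/* Namespace-aware lookup: the socket must also live in the namespace of
 * the device the frame arrived on. */
const struct sock_model *lookup_sap_and_net(const struct sock_model *tbl,
                                            size_t n,
                                            const struct frame_model *f)
{
    assert(tbl != NULL && f != NULL);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].sap == f->dsap && net_eq_model(tbl[i].net, f->dev_net))
            return &tbl[i];
    return NULL;
}
```

With one socket bound in the default namespace and a frame arriving on a device in a child namespace, the SAP-only lookup returns the default-namespace socket (the cross-namespace leak described above), while the namespace-aware lookup returns NULL.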
>> Agreed. Probably sooner rather than later, since IPv4 and IPv6 multicast,
>> GARP and more, as well as STP, depend on llc multicast delivery. I
>> suspect the authors who added the 'drop unless default namespace' code
>> commented out above knew this, and were just buying some time. Well,
>> the time has come.
>>
>> Now every bridge in a non-default namespace will always -- and silently --
>> think of itself as the 'root bridge', as it can't get the packets
>> informing it otherwise. This leads to packet storms at line-rate speeds,
>> bringing whole infrastructures down in a self-inflicted event worse than
>> a DDoS attack.
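The 'everyone thinks it is root' effect can be illustrated with a heavily simplified model. This is a sketch under stated assumptions, not the bridge code: real 802.1D compares a full priority vector (root ID, path cost, bridge ID, port ID), while this model compares only a single numeric ID where lower wins. All names (`bridge_model`, `bridge_init`, `bridge_rx_bpdu`, `bridge_is_root`, `exchange_bpdus`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of STP root election. */
struct bridge_model {
    uint64_t bridge_id; /* this bridge's own ID; lower wins */
    uint64_t root_id;   /* the bridge this one currently believes is root */
};

void bridge_init(struct bridge_model *b, uint64_t id)
{
    assert(b != NULL);
    b->bridge_id = id;
    b->root_id = id; /* every bridge starts out claiming to be root */
}

/* Process a received BPDU; accept it only if it advertises a better
 * (numerically lower) root than the one currently believed. */
void bridge_rx_bpdu(struct bridge_model *b, uint64_t advertised_root)
{
    if (advertised_root < b->root_id)
        b->root_id = advertised_root;
}

bool bridge_is_root(const struct bridge_model *b)
{
    return b->root_id == b->bridge_id;
}

/* One round of BPDU exchange between two bridges. `delivered` models
 * whether the receive path hands BPDUs to the bridge or drops them. */
void exchange_bpdus(struct bridge_model *a, struct bridge_model *b,
                    bool delivered)
{
    if (!delivered)
        return; /* BPDUs dropped before the bridge sees them */
    uint64_t from_a = a->root_id;
    uint64_t from_b = b->root_id;
    bridge_rx_bpdu(a, from_b);
    bridge_rx_bpdu(b, from_a);
}
```

In this model, two bridges whose BPDUs are never delivered both keep reporting themselves as root, so neither would block a port on a looped segment. Once BPDUs are delivered, only the bridge with the lower ID remains root.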
>>
>> I think whoever does 'advisories' ought to warn the community that IPv6
>> NDP (if using multicast), IPv4 ARP (if using multicast), bridges with
>> STP, LLDP, GARP, IPv6 multicast and IPv4 multicast for sockets in a
>> non-default namespace will not get RX traffic, as it gets dropped in the
>> kernel before other modules or user code have a chance to see it.
>> Outcomes range from what looks like a local disconnection to
>> kernel-induced, site-crippling packet storms.
>>
>> Is there a way to track this llc namespace awareness effort? I'm new to
>> this particular dev community. It's on a critical path for my project.
> AFAIK, there is no ongoing work for this. I can spend some cycles on
> this, but note that the patches might not be backported to stable as
> it would be invasive.
Thank you! When you offer your patches and hear worries about them
being 'invasive', it's worth asking 'compared to what' -- since the
'status quo' is that every bridge with STP in a non-default namespace,
with a loop in it somewhere, will freeze every connected system more
solid than ice in Antarctica.