Message-ID: <20230711183206.54744-1-kuniyu@amazon.com>
Date: Tue, 11 Jul 2023 11:32:06 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <hcoin@...etfountain.com>
CC: <kuniyu@...zon.com>, <netdev@...r.kernel.org>
Subject: Re: llc needs namespace awareness asap, was Re: Patch fixing STP if bridge in non-default namespace.
From: Harry Coin <hcoin@...etfountain.com>
Date: Tue, 11 Jul 2023 12:08:15 -0500
> On 7/10/23 22:22, Kuniyuki Iwashima wrote:
> > From: Harry Coin <hcoin@...etfountain.com>
> > Date: Mon, 10 Jul 2023 08:35:08 -0500
> >> Notice without access to link-level multicast address 01:80:C2:00:00:00,
> >> the STP loop-avoidance feature of bridges fails silently, leading to
> >> packet storms if loops exist in the related L2. The Linux kernel's
> >> latest code silently drops BPDU STP packets if the bridge is in a
> >> non-default namespace.
> >>
> >> The current llc_rcv() around line 166 in net/llc/llc_input.c has
> >>
> >>     if (!net_eq(dev_net(dev), &init_net))
> >>         goto drop;
> >>
> >> Which, when commented out, fixes this bug. A search on &init_net may
> >> reveal many similar artifacts left over from the early days of namespace
> >> implementation.
> > I think just removing the part is not sufficient and will introduce a bug
> > in another place.
> >
> > As you found, llc has the same test in another place. For example, when
> > you create an AF_LLC socket, it has to be in the root netns. But if you
> > remove the test in llc_rcv() only, it seems llc_rcv() would put a skb for
> > a child netns into sk's recv queue that is in the default netns.
> >
> > - llc_rcv
> >   - if (!net_eq(dev_net(dev), &init_net))
> >     - goto drop
> >   - sap_handler / llc_sap_handler
> >     - sk = llc_lookup_dgram
> >     - llc_sap_rcv
> >       - llc_sap_state_process
> >         - sock_queue_rcv_skb
> >
> > So, we need to namespacify the whole llc infra.
>
> Agreed. Probably sooner rather than later, since IPv4 and IPv6 multicast,
> GARP and more, as well as STP, depend on llc multicast delivery. I
> suspect the authors who added the 'drop unless default namespace' code
> commented out above knew this, and were just buying some time. Well,
> the time has come.
>
> Now every bridge in a non-default namespace will always -- and silently --
> think of itself as the 'root bridge', as it can't get packets informing it
> otherwise. This leads to packet storms at line-rate speeds, bringing
> whole infrastructures down in a self-inflicted event worse than a DDoS
> attack.
>
> I think whoever does 'advisories' ought to warn the community that IPv6
> NDP (if using multicast), IPv4 ARP (if using multicast), bridges with
> STP, LLDP, GARP, IPv6 multicast and IPv4 multicast for sockets in a
> non-default namespace will not get RX traffic, as it gets dropped in the
> kernel before other modules or user code have a chance to see it.
> Outcomes range from seemingly local disconnection to kernel-induced,
> site-crippling packet storms.
>
> Is there a way to track this llc namespace awareness effort? I'm new to
> this particular dev community. It's on a critical path for my project.
AFAIK, there is no ongoing work on this. I can spend some cycles on
it, but note that the patches might not be backported to stable, as
the change would be invasive.
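
To make the concern quoted above a bit more concrete, here is a rough userspace sketch, not kernel code. All toy_* names are made up for illustration, and the real llc_lookup_dgram() signature and hashing are not reproduced. It only shows why a lookup keyed on SAP and MAC alone can hand a frame from a child netns to a socket in the root netns, while a lookup that also compares the namespace cannot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct net and an LLC socket. */
struct toy_net {
	int id;
};

struct toy_sock {
	const struct toy_net *net;	/* netns the socket was created in */
	unsigned char sap;		/* bound LLC SAP */
	unsigned char mac[6];		/* bound MAC address */
};

/* Lookup keyed on (sap, mac) only: the receiving device's netns is ignored. */
struct toy_sock *toy_lookup_any_ns(struct toy_sock *tbl, size_t n,
				   unsigned char sap, const unsigned char *mac)
{
	assert(tbl != NULL && mac != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].sap == sap && memcmp(tbl[i].mac, mac, 6) == 0)
			return &tbl[i];
	}
	return NULL;
}

/* Same lookup, but the socket must also live in the netns of the device. */
struct toy_sock *toy_lookup_in_ns(struct toy_sock *tbl, size_t n,
				  const struct toy_net *net,
				  unsigned char sap, const unsigned char *mac)
{
	assert(tbl != NULL && net != NULL && mac != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].net == net && tbl[i].sap == sap &&
		    memcmp(tbl[i].mac, mac, 6) == 0)
			return &tbl[i];
	}
	return NULL;
}
```

With a single socket bound in the root netns, toy_lookup_any_ns() still returns it for a frame that arrived on a device in a child netns, which is the cross-namespace delivery described above. toy_lookup_in_ns() returns NULL in that case. In the kernel, the equivalent change would presumably have to touch every lookup and hash in the llc code, which is why dropping the check in llc_rcv() alone is not enough.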