[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f4f1b9b4-abc9-842a-205a-35588916115d@quietfountain.com>
Date: Thu, 13 Jul 2023 08:50:44 -0500
From: Harry Coin <hcoin@...etfountain.com>
To: Nikolay Aleksandrov <razor@...ckwall.org>,
Ido Schimmel <idosch@...sch.org>, Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: Roopa Prabhu <roopa@...dia.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, "Eric W. Biederman"
<ebiederm@...ssion.com>, Kuniyuki Iwashima <kuni1840@...il.com>,
netdev@...r.kernel.org, bridge@...ts.linux-foundation.org
Subject: Re: [PATCH v1 net] bridge: Return an error when enabling STP in
netns.
On 7/12/23 09:52, Nikolay Aleksandrov wrote:
> On 12/07/2023 17:48, Ido Schimmel wrote:
>> On Tue, Jul 11, 2023 at 04:54:15PM -0700, Kuniyuki Iwashima wrote:
>>> When we create an L2 loop on a bridge in netns, we will see packets storm
>>> even if STP is enabled.
>>>
>>> # unshare -n
>>> # ip link add br0 type bridge
>>> # ip link add veth0 type veth peer name veth1
>>> # ip link set veth0 master br0 up
>>> # ip link set veth1 master br0 up
>>> # ip link set br0 type bridge stp_state 1
>>> # ip link set br0 up
>>> # sleep 30
>>> # ip -s link show br0
>>> 2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>>> link/ether b6:61:98:1c:1c:b5 brd ff:ff:ff:ff:ff:ff
>>> RX: bytes packets errors dropped missed mcast
>>> 956553768 12861249 0 0 0 12861249 <-. Keep
>>> TX: bytes packets errors dropped carrier collsns | increasing
>>> 1027834 11951 0 0 0 0 <-' rapidly
>>>
>>> This is because llc_rcv() drops all packets in non-root netns and BPDU
>>> is dropped.
>>>
>>> Let's show an error when enabling STP in netns.
>>>
>>> # unshare -n
>>> # ip link add br0 type bridge
>>> # ip link set br0 type bridge stp_state 1
>>> Error: bridge: STP can't be enabled in non-root netns.
>>>
>>> Note this commit will be reverted later when we namespacify the whole LLC
>>> infra.
>>>
>>> Fixes: e730c15519d0 ("[NET]: Make packet reception network namespace safe")
>>> Suggested-by: Harry Coin <hcoin@...etfountain.com>
>> I'm not sure that's accurate. I read his response in the link below and
>> he says "I'd rather be warned than blocked" and "But better warned and
>> awaiting a fix than blocked", which I agree with. The patch has the
>> potential to cause a lot of regressions, but without actually fixing the
>> problem.
>>
>> How about simply removing the error [1]? Since iproute2 commit
>> 844c37b42373 ("libnetlink: Handle extack messages for non-error case"),
>> it can print extack warnings and not only errors. With the diff below:
>>
>> # unshare -n
>> # ip link add name br0 type bridge
>> # ip link set dev br0 type bridge stp_state 1
>> Warning: bridge: STP can't be enabled in non-root netns.
>> # echo $?
>> 0
>>
>> [1]
>> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
>> index a807996ac56b..b5143de37938 100644
>> --- a/net/bridge/br_stp_if.c
>> +++ b/net/bridge/br_stp_if.c
>> @@ -201,10 +201,8 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
>> {
>> ASSERT_RTNL();
>>
>> - if (!net_eq(dev_net(br->dev), &init_net)) {
>> + if (!net_eq(dev_net(br->dev), &init_net))
>> NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled in non-root netns");
>> - return -EINVAL;
>> - }
>>
>> if (br_mrp_enabled(br)) {
>> NL_SET_ERR_MSG_MOD(extack,
>>
> I'd prefer this approach to changing user-visible behaviour and potential regressions.
> Just change the warning message.
>
> Thanks,
> Nik
Remember, the only way it's honest to 'warn but not block' STP in netns
is trust in the Kuniyuki's assertion that the llc will be
'namespacified' in a near term frame. As STP is not only
non-functional in a netns, but will in fact bring down every connected
system in a packet storm should a L2 loop exist the situation is much
worse than a merely inaccessible extra feature. This as the only reason
STP exists is to avoid crashing sites owing to packet storms arising
from L2 loops. I think as this bug is a potential 'site killer'
(which in fact happened to me!) The Linux dev community has an
obligation to either hard block this or commit to a fix time frame and
merely warn.
Thanks
Harry Coin
Powered by blists - more mailing lists