Message-ID: <c512e765-f411-9305-013b-471a07e7f3ff@blackwall.org>
Date: Wed, 30 Mar 2022 19:16:42 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Jakub Kicinski <kuba@...nel.org>,
Alexandra Winter <wintera@...ux.ibm.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>,
Hangbin Liu <liuhangbin@...il.com>, netdev@...r.kernel.org,
linux-s390@...r.kernel.org, Heiko Carstens <hca@...ux.ibm.com>,
Roopa Prabhu <roopa@...dia.com>,
bridge@...ts.linux-foundation.org,
Ido Schimmel <idosch@...dia.com>, Jiri Pirko <jiri@...dia.com>,
Jay Vosburgh <j.vosburgh@...il.com>
Subject: Re: [PATCH net-next v2] veth: Support bonding events
On 30/03/2022 18:51, Jakub Kicinski wrote:
> On Wed, 30 Mar 2022 13:14:12 +0200 Alexandra Winter wrote:
>>>> This patch in no way addresses (2). But then again, if we put
>>>> a macvlan on top of a bridge master it will shotgun its GARPs all
>>>> the same. So it's not like veth would be special in that regard.
>>>>
>>>> Nik, what am I missing?
>>>
>>> If we're talking about macvlan -> bridge -> bond then the bond flap's
>>> notify peers shouldn't reach the macvlan.
>
> Hm, right. I'm missing a step in my understanding. As you say, the bridge
> does not seem to be re-broadcasting the event to its master. So how
> does Alexandra catch this kind of event? :S
>
> 	case NETDEV_NOTIFY_PEERS:
> 		/* propagate to peer of a bridge attached veth */
> 		if (netif_is_bridge_master(dev)) {
>
> IIUC bond will notify with dev == bond netdev. Where is the event with
> dev == br generated?
>
Good question. :)
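For reference, the only generator of this event in the failover path that
I'm aware of is bonding itself, and it notifies with its own netdev.
Roughly, paraphrasing bond_mii_monitor() in
drivers/net/bonding/bond_main.c (not the exact code):

	/* after a failover, bonding emits the peer notifications
	 * itself with dev == bond->dev; nothing in this path fires
	 * with dev == the bridge master
	 */
	if (should_notify_peers) {
		bond->send_peer_notif--;
		call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev);
	}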
>>> Generally broadcast traffic
>>> is quite expensive for the bridge; I have patches that improve the
>>> technical side (considering only ports in the same bcast domain), but you
>>> also wouldn't want unnecessary bcast packets being sent around. :)
>>> There are setups with tens of bond devices, and propagating that to all of
>>> them would be very expensive, but above all unnecessary. It would also hurt
>>> setups with a lot of vlan devices on the bridge. There are setups with
>>> hundreds of vlans and hundreds of macvlans on top; propagating it up would
>>> send it to all of them, which wouldn't scale at all, and these mostly have
>>> IP addresses too.
>
> Ack.
>
>>> Perhaps we can enable propagation on a per-port or per-bridge basis; then
>>> we can avoid these walks. That is, make it opt-in.
>
> Maybe opt-out? But assuming the event is only generated on
> active/backup switchover - when would it be okay to ignore
> the notification?
>
Let me just clarify, so I'm sure I've not misunderstood you. Do you mean opt-out
as in make it default on? IMO that would be a problem: large-scale setups would
suddenly start propagating it to upper devices, which would cause a lot of
unnecessary bcast. I meant enabling it only if needed, and only on specific ports
(the second part is not strictly necessary, it could be global; I think it's ok
either way). I don't think any setup that has many upper vlans/macvlans would
ever enable this.
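To make it concrete, something along these lines in the bridge's notifier
handling is what I have in mind; BR_PROPAGATE_NOTIFY_PEERS is a hypothetical
per-port flag (nothing like it exists today), default off:

	struct net_bridge_port *p;

	list_for_each_entry(p, &br->port_list, list) {
		/* hypothetical opt-in flag, default off, so large
		 * vlan/macvlan setups never pay for this walk
		 */
		if (!(p->flags & BR_PROPAGATE_NOTIFY_PEERS))
			continue;
		call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, p->dev);
	}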
>>>>> It also seems difficult to avoid re-bouncing the notifier.
>>>>
>>>> syzbot will make short work of this patch; I think the potential
>>>> for infinite loops has to be addressed somehow. IIUC this is the
>>>> first instance of forwarding those notifiers to a peer rather
>>>> than within an upper <> lower device hierarchy, which is a DAG.
>>
>> My concern was about Hangbin's alternative proposal to notify all
>> bridge ports. I hope that in my proposal I was able to avoid infinite loops.
>
> Possibly I'm confused as to where the notification for the bridge
> master gets sent...
IIUC it bypasses the bridge and sends a NETDEV_NOTIFY_PEERS for the veth peer,
so it would generate a gratuitous ARP (inetdev_event -> NETDEV_NOTIFY_PEERS).
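i.e. roughly something like this in veth's notifier block (my rough sketch of
the idea, not the patch verbatim; veth_dev stands for the veth port attached
to the bridge):

	case NETDEV_NOTIFY_PEERS:
		/* propagate to peer of a bridge attached veth */
		if (netif_is_bridge_master(dev)) {
			struct veth_priv *priv = netdev_priv(veth_dev);
			struct net_device *peer = rtnl_dereference(priv->peer);

			/* inetdev_event() handles NETDEV_NOTIFY_PEERS on
			 * the peer and sends the gratuitous ARP there
			 */
			if (peer)
				call_netdevice_notifiers(NETDEV_NOTIFY_PEERS,
							 peer);
		}
		break;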