Message-ID: <538B19A9.4050607@mojatatu.com>
Date: Sun, 01 Jun 2014 08:16:41 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: davem@...emloft.net, stephen@...workplumber.org
CC: netdev@...r.kernel.org, vyasevic@...hat.com,
sfeldma@...ulusnetworks.com, john.r.fastabend@...el.com,
roopa@...ulusnetworks.com
Subject: Re: [net-next PATCH 2/2] bridge netlink dump interface at par with
brctl Actually better than brctl showmacs because we can filter by bridge
port in the kernel

This is mostly for you, Vlad, since you brought it up earlier.
I ended up using ifm (struct ifinfomsg) instead of ndm (struct ndmsg).
Currently there is a lack of symmetry: we send requests with ifm and
get responses with ndm. Unfortunately, after spending 2-3 hours on it,
I came to the conclusion that I can't change it without breaking old
iproute2 versions that expect this behavior. What we have here is
an order of magnitude better filtering, but we could have done slightly
better if we were able to use an ndm. A little acrobatics later on to
filter by VLAN may work.
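
To make the asymmetry concrete, here is a rough userspace sketch of the two layouts. It is not the actual iproute2 code. The struct name fdb_dump_req and both helper functions are made up for illustration. The assumption is an RTM_GETNEIGH dump on PF_BRIDGE whose fixed header is a struct ifinfomsg, with the bridge ifindex carried in an IFLA_MASTER attribute as the quoted patch parses it, while each reply is read as a struct ndmsg.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>
#include <linux/neighbour.h>

/* Hypothetical request buffer: netlink header, ifinfomsg, and room
 * for exactly one u32 attribute (IFLA_MASTER). */
struct fdb_dump_req {
	struct nlmsghdr  nlh;
	struct ifinfomsg ifm;
	char             attrs[RTA_SPACE(sizeof(uint32_t))];
};

/* Fill in an fdb dump request.
 * brport_ifindex == 0 means "any port";
 * br_ifindex == 0 means "no bridge filter" (attribute is omitted). */
static void fdb_dump_req_init(struct fdb_dump_req *req,
			      int brport_ifindex, uint32_t br_ifindex)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));

	req->nlh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	req->nlh.nlmsg_type  = RTM_GETNEIGH;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;

	/* The request side speaks ifinfomsg ... */
	req->ifm.ifi_family = PF_BRIDGE;
	req->ifm.ifi_index  = brport_ifindex;

	if (br_ifindex) {
		struct rtattr *rta = (struct rtattr *)req->attrs;

		rta->rta_type = IFLA_MASTER;
		rta->rta_len  = RTA_LENGTH(sizeof(br_ifindex));
		memcpy(RTA_DATA(rta), &br_ifindex, sizeof(br_ifindex));
		req->nlh.nlmsg_len = NLMSG_ALIGN(req->nlh.nlmsg_len) +
				     RTA_ALIGN(rta->rta_len);
	}
}

/* ... while the reply side speaks ndmsg: the payload of each
 * RTM_NEWNEIGH message starts with a struct ndmsg. */
static int fdb_reply_ifindex(const struct nlmsghdr *nlh)
{
	const struct ndmsg *ndm = (const struct ndmsg *)NLMSG_DATA(nlh);

	return ndm->ndm_ifindex;
}
```

Under these assumptions, "bridge fdb show brport eth1 br br0" would map to fdb_dump_req_init(&req, <ifindex of eth1>, <ifindex of br0>), and every reply would be decoded with the ndmsg helper.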
cheers,
jamal
On 06/01/14 07:56, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@...atatu.com>
>
> The current bridge netlink interface doesn't scale when you have many bridges,
> each with large fdbs, or even bridges with many bridge ports.
>
> Example usage:
>
> Let's start with two bridges, each with a port...
>
> root@...a-mojo:bridge# ./bridge link
> 8: eth1 state DOWN : <BROADCAST,MULTICAST> mtu 1500 master br0 state disabled priority 32 cost 19
> 17: sw1-p1 state DOWN : <BROADCAST,NOARP> mtu 1500 master sw1 state disabled priority 32 cost 100
>
> show all...
> root@...a-mojo:bridge# ./bridge fdb show
> 33:33:00:00:00:01 dev bond0 self permanent
> 33:33:00:00:00:01 dev dummy0 self permanent
> 33:33:00:00:00:01 dev ifb0 self permanent
> 33:33:00:00:00:01 dev ifb1 self permanent
> 33:33:00:00:00:01 dev eth0 self permanent
> 01:00:5e:00:00:01 dev eth0 self permanent
> 33:33:ff:22:01:01 dev eth0 self permanent
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
> 33:33:00:00:00:01 dev gretap0 self permanent
> 33:33:00:00:00:01 dev br0 self permanent
> 33:33:00:00:00:01 dev sw1 self permanent
> a2:fb:21:4c:47:25 dev sw1-p1 vlan 0 master sw1 permanent
> 33:33:00:00:00:01 dev sw1-p1 self permanent
>
> Let's see a port that is not attached to a bridge
> root@...a-mojo:bridge# ./bridge fdb show brport eth0
> 33:33:00:00:00:01 self permanent
> 01:00:5e:00:00:01 self permanent
> 33:33:ff:22:01:01 self permanent
>
> Let's see a port that is attached to a bridge
> root@...a-mojo:bridge# ./bridge fdb show brport eth1
> 02:00:00:12:01:02 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 self permanent
> 33:33:00:00:00:01 self permanent
>
> Specify the correct bridge and you get good stuff
> root@...a-mojo:bridge# ./bridge fdb show brport eth1 br br0
> 02:00:00:12:01:02 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 self permanent
> 33:33:00:00:00:01 self permanent
>
> Specify the wrong bridge and you get good nada
> root@...a-mojo:bridge# ./bridge fdb show brport eth1 br sw1
>
> dump only br0
> root@...a-mojo:bridge# ./bridge fdb show br br0
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
>
> Let's move a port from one bridge to another for shits-and-giggles
> (as they say in New Brunswick)
> root@...a-mojo:bridge# ip link set sw1-p1 master br0
>
> Now dump again br0
> root@...a-mojo:bridge# ./bridge fdb show br br0
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
> a2:fb:21:4c:47:25 dev sw1-p1 vlan 0 master br0 permanent
> 33:33:00:00:00:01 dev sw1-p1 self permanent
>
> Signed-off-by: Jamal Hadi Salim <jhs@...atatu.com>
> ---
> net/core/rtnetlink.c | 68 +++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 56 insertions(+), 12 deletions(-)
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 064418e..71e6bc8 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2508,26 +2508,70 @@ EXPORT_SYMBOL(ndo_dflt_fdb_dump);
>
> static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
> {
> - int idx = 0;
> - struct net *net = sock_net(skb->sk);
> struct net_device *dev;
> + struct net_device *br_dev;
> + struct nlattr *tb[IFLA_MAX+1];
> + const struct net_device_ops *ops;
> + struct ifinfomsg *ifm = nlmsg_data(cb->nlh);
> + struct net *net = sock_net(skb->sk);
> + int brport_idx = 0;
> + int br_idx = 0;
> + int idx = 0;
> +
> + if (nlmsg_parse(cb->nlh, sizeof(struct ifinfomsg), tb, IFLA_MAX,
> + ifla_policy) == 0) {
> + if (tb[IFLA_MASTER])
> + br_idx = nla_get_u32(tb[IFLA_MASTER]);
> + }
> +
> + brport_idx = ifm->ifi_index;
>
> rcu_read_lock();
> for_each_netdev_rcu(net, dev) {
> - if (dev->priv_flags & IFF_BRIDGE_PORT) {
> - struct net_device *br_dev;
> - const struct net_device_ops *ops;
>
> - br_dev = netdev_master_upper_dev_get(dev);
> + if (brport_idx && (dev->ifindex != brport_idx))
> + continue;
> +
> + if (!br_idx) {
> + if (dev->priv_flags & IFF_BRIDGE_PORT) {
> + br_dev = netdev_master_upper_dev_get(dev);
> + ops = br_dev->netdev_ops;
> + if (ops->ndo_fdb_dump)
> + idx = ops->ndo_fdb_dump(skb, cb, br_dev,
> + dev, idx);
> + }
> +
> + /* all of bridge fdb entries are dumped via brports fdb
> + * therefore only allow for selfies for bridges
> + */
> + if (!(dev->priv_flags & IFF_EBRIDGE) &&
> + dev->netdev_ops->ndo_fdb_dump)
> + idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev,
> + NULL, idx);
> + else
> + idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> +
> + } else {
> + if (!(dev->priv_flags & IFF_BRIDGE_PORT))
> + continue;
> +
> + br_dev = __dev_get_by_index(net, br_idx);
> + if (!br_dev)
> + return -ENODEV;
> +
> + if (br_dev != netdev_master_upper_dev_get(dev))
> + continue;
> +
> ops = br_dev->netdev_ops;
> if (ops->ndo_fdb_dump)
> - idx = ops->ndo_fdb_dump(skb, cb, dev, NULL, idx);
> - }
> + idx = ops->ndo_fdb_dump(skb, cb, br_dev, dev, idx);
>
> - if (dev->netdev_ops->ndo_fdb_dump)
> - idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev, NULL, idx);
> - else
> - idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> + if (dev->netdev_ops->ndo_fdb_dump)
> + idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev,
> + NULL, idx);
> + else
> + idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> + }
> }
> rcu_read_unlock();
>
>
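
For readers who want the selection rule of the patched loop without the kernel plumbing, here is a small userspace model. It is only a sketch: struct toy_dev and fdb_dump_selects() are hypothetical names, the struct stands in for the few net_device fields the loop looks at, and the function only answers whether a device would contribute any entries to the dump. It leaves out the early -ENODEV return the patch takes when the bridge index does not resolve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the net_device fields the dump loop inspects. */
struct toy_dev {
	int  ifindex;
	int  master_ifindex;  /* 0 when the device has no master */
	bool is_bridge_port;  /* models IFF_BRIDGE_PORT */
};

/* Returns true when the loop would dump fdb entries for dev.
 * brport_idx models ifm->ifi_index, br_idx models IFLA_MASTER;
 * 0 means "no filter" for either one. */
static bool fdb_dump_selects(const struct toy_dev *dev,
			     int brport_idx, int br_idx)
{
	assert(dev != NULL);

	/* Port filter: skip every device except the requested one. */
	if (brport_idx && dev->ifindex != brport_idx)
		return false;

	/* No bridge filter: every remaining device dumps something,
	 * either through its own ndo_fdb_dump or the default dump. */
	if (!br_idx)
		return true;

	/* Bridge filter: only ports enslaved to that bridge qualify. */
	if (!dev->is_bridge_port)
		return false;

	return dev->master_ifindex == br_idx;
}
```

With eth1 (ifindex 8) enslaved to br0, a request for "brport eth1 br br0" selects eth1, while "brport eth1 br sw1" selects nothing, which matches the empty output in the example above. The bridge ifindexes themselves are not shown in the example, so any values plugged in for them are placeholders.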