lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201109003028.melbgstk4pilxksl@skbuf>
Date:   Mon, 9 Nov 2020 02:30:28 +0200
From:   Vladimir Oltean <olteanv@...il.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     DENG Qingfang <dqfext@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Jakub Kicinski <kuba@...nel.org>,
        netdev <netdev@...r.kernel.org>, linux-kernel@...r.kernel.org,
        Tobias Waldekranz <tobias@...dekranz.com>,
        Marek Behun <marek.behun@....cz>,
        Russell King - ARM Linux admin <linux@...linux.org.uk>
Subject: Re: [RFC PATCH net-next 3/3] net: dsa: listen for
 SWITCHDEV_{FDB,DEL}_ADD_TO_DEVICE on foreign bridge neighbors

On Mon, Nov 09, 2020 at 12:59:39AM +0100, Andrew Lunn wrote:
> On Sun, Nov 08, 2020 at 07:23:55PM +0200, Vladimir Oltean wrote:
> > On Sun, Nov 08, 2020 at 10:09:25PM +0800, DENG Qingfang wrote:
> > > Can it be turned off for switches that support SA learning from CPU?
> > 
> > Is there a good reason I would add another property per switch and not
> > just do it unconditionally?
> 
> Just throwing out ideas, i've no idea if they are relevant. I wonder
> if we can get into issues with fast ageing with a topology change?  We
> don't have too much control over the hardware. I think some devices
> just flush everything, or maybe just one port. So we have different
> life times for CPU port database entries and user port database
> entries?

A quick scan for "port_fast_age" did not find any implementers who do
not act upon the "port" argument.

> We might also run into bugs with flushing removing static database
> entries which should not be. But that would be a bug.

I can imagine that happening, when there are multiple bridges spanning a
DSA switch, each bridge also contains a "foreign" interface, and the 2
bridging domains service 2 stations that have the same MAC address. In
that case, since the fdb_add and fdb_del are not reference-counted on
the shared DSA CPU port, we would indeed trigger this bug.

I was on the fence on whether to include the reference counting patch I
have for host MDBs, and to make these addresses refcounted as well.
What do you think?

> Also, dumping the database might run into bugs since we have not had
> entries for the CPU port before.

I don't see what conditions can make this happen.

> We also need to make sure the static entries get removed correctly
> when a host moves. The mv88e6xxx will not replace a static entry with
> a dynamically learned one. It will probably rise an ATU violation
> interrupt that frames have come in the wrong port.

This is a good one. Currently every implementer of .port_fdb_add assumes
a static entry is what we want, but that is not the case here. We want
an entry that can expire or the switch can move it to a different port
when there is evidence that it's wrong. Should we add more arguments to
the API?

> What about switches which do not implement port_fdb_add? Do these
> patches at least do something sensible?

dsa_slave_switchdev_event
-> dsa_slave_switchdev_event_work
   -> dsa_port_fdb_add
      -> dsa_port_notify(DSA_NOTIFIER_FDB_ADD)
         -> dsa_switch_fdb_add
            -> if (!ds->ops->port_fdb_add) return -EOPNOTSUPP;
   -> an error is printed with dev_dbg, and
      dsa_fdb_offload_notify(switchdev_work) is not called.

On dsa_port_fdb_del error, there is also an attempt to call dev_close()
on error, but only on user ports, which the CPU port is not.

So, we do something almost sensible, but mostly by mistake it seems.

I think the simplest would be to simply avoid all this nonsense right
away in dsa_slave_switchdev_event:

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -2188,6 +2188,9 @@ static int dsa_slave_switchdev_event(struct notifier_block *unused,
 			dp = p->dp->cpu_dp;
 		}
 
+		if (!dp->ds->ops->port_fdb_add || !dp->ds->ops->port_fdb_del)
+			return NOTIFY_DONE;
+
 		switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
 		if (!switchdev_work)
 			return NOTIFY_BAD;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ