Date:   Wed, 23 Mar 2022 12:16:43 +0200
From:   Vladimir Oltean <olteanv@...il.com>
To:     Hans Schultz <schultz.hans@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>, davem@...emloft.net, kuba@...nel.org,
        netdev@...r.kernel.org, Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Ivan Vecera <ivecera@...hat.com>,
        Roopa Prabhu <roopa@...dia.com>,
        Nikolay Aleksandrov <razor@...ckwall.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Ido Schimmel <idosch@...dia.com>, linux-kernel@...r.kernel.org,
        bridge@...ts.linux-foundation.org
Subject: Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB
 implementation

On Wed, Mar 23, 2022 at 11:13:51AM +0100, Hans Schultz wrote:
> On Tue, Mar 22, 2022 at 13:08, Vladimir Oltean <olteanv@...il.com> wrote:
> > On Tue, Mar 22, 2022 at 12:01:13PM +0100, Hans Schultz wrote:
> >> On Fri, Mar 18, 2022 at 15:19, Vladimir Oltean <olteanv@...il.com> wrote:
> >> > On Fri, Mar 18, 2022 at 02:10:26PM +0100, Hans Schultz wrote:
> >> >> In the offloaded case there is no difference between static and dynamic
> >> >> flags, which I see as a general issue. (The resulting ATU entry is static
> >> >> in either case.)
> >> >
> >> > It _is_ a problem. We had the same problem with the is_local bit.
> >> > Independently of this series, you can add the dynamic bit to struct
> >> > switchdev_notifier_fdb_info and make drivers reject it.
> >> >
> >> >> These FDB entries are removed when link goes down (soft or hard). The
> >> >> zero DPV entries that the new code introduces age out after 5 minutes,
> >> >> while the locked flagged FDB entries are removed by link down (thus the
> >> >> FDB and the ATU are not in sync in this case).
> >> >
> >> > Ok, so don't let them disappear from hardware, refresh them from the
> >> > driver, since user space and the bridge driver expect that they are
> >> > still there.
> >> 
> >> I have now tested with two extra unmanaged switches (each connected to a
> >> separate port on our managed switch). When migrating from one port to
> >> another, there are member violations, but as the initial entry ages out,
> >> a new miss violation occurs and the new port adds the locked entry. In
> >> this case I only see one locked entry, either on the initial port or
> >> later on the port the host migrated to (via switch).
> >> 
> >> If I refresh the ATU entries indefinitely, then this migration will for
> >> sure not work, and with the member violation suppressed, it will be
> >> silent about it.
> >
> > Manual says that migrations should trigger miss violations if configured
> > adequately, is this not the case?
> >
> >> So I don't think it is a good idea to refresh the ATU entries
> >> indefinitely.
> >> 
> >> Another issue I see, is that there is a deadlock or similar issue when
> >> receiving violations and running 'bridge fdb show' (it seemed that
> >> member violations also caused this, but not sure yet...), as the unit
> >> freezes, not to return...
> >
> > Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
> > like that?
> 
> I have now determined that it is the rtnl_lock() that causes the
> "deadlock". The doit() in rtnetlink.c is under rtnl_lock() and is what
> takes care of getting the fdb entries when running 'bridge fdb show'. In
> principle there should be no problem with this, but I don't know if some
> interrupt queue is getting jammed as they are blocked from rtnetlink.c?

Sorry, I forgot to respond to this yesterday.
Do you by any chance have an AB/BA lock inversion, where from the
ATU interrupt handler you do mv88e6xxx_reg_lock() -> rtnl_lock(), while
from the port_fdb_dump() handler you do rtnl_lock() -> mv88e6xxx_reg_lock()?
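
To illustrate the pattern I am asking about, here is a minimal userspace
sketch of an order tracker, loosely in the spirit of what lockdep reports.
It is not driver code: LOCK_REG and LOCK_RTNL are hypothetical stand-ins for
mv88e6xxx_reg_lock() and rtnl_lock(), and track_acquire(), run_path() and
count_inversions() are names made up for this example. It only records which
lock was held when another one was taken, and flags a path that reverses an
order seen earlier.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two locks discussed above. */
enum lock_id { LOCK_REG, LOCK_RTNL, NLOCKS };

/* order_seen[a][b] is true once some path took b while already holding a. */
static bool order_seen[NLOCKS][NLOCKS];

/* Per-path record of which locks are currently held. */
struct path_ctx {
	bool held[NLOCKS];
};

/* Record the acquisition; return true if it reverses an order seen earlier. */
static bool track_acquire(struct path_ctx *ctx, enum lock_id lock)
{
	bool inverted = false;

	for (int h = 0; h < NLOCKS; h++) {
		if (!ctx->held[h])
			continue;
		if (order_seen[lock][h])
			inverted = true;	/* lock -> h was seen, now h -> lock */
		order_seen[h][lock] = true;
	}
	ctx->held[lock] = true;
	return inverted;
}

static void track_release(struct path_ctx *ctx, enum lock_id lock)
{
	ctx->held[lock] = false;
}

/* Take 'first' then 'second', and release them in reverse order. */
static bool run_path(enum lock_id first, enum lock_id second)
{
	struct path_ctx ctx = { { false } };
	bool inverted;

	inverted = track_acquire(&ctx, first);
	inverted |= track_acquire(&ctx, second);
	track_release(&ctx, second);
	track_release(&ctx, first);
	return inverted;
}

/*
 * Model the two paths: the interrupt path always takes reg -> rtnl. The dump
 * path takes rtnl -> reg, unless 'consistent' forces it into the same order
 * as the interrupt path. Returns the number of inversions observed.
 */
static int count_inversions(bool consistent)
{
	int n = 0;

	memset(order_seen, 0, sizeof(order_seen));
	n += run_path(LOCK_REG, LOCK_RTNL);		/* interrupt path */
	if (consistent)
		n += run_path(LOCK_REG, LOCK_RTNL);	/* same order */
	else
		n += run_path(LOCK_RTNL, LOCK_REG);	/* reversed order */
	return n;
}
```

With the reversed dump path the tracker flags one inversion; with both paths
using the same order it flags none. In the real driver the two paths run
concurrently, so the reversed order is what would let each side hold one lock
while waiting forever for the other.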
