Message-ID: <20220323115419.svxnbcqqd7pyargn@skbuf>
Date: Wed, 23 Mar 2022 13:54:19 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Hans Schultz <schultz.hans@...il.com>
Cc: Andrew Lunn <andrew@...n.ch>, davem@...emloft.net, kuba@...nel.org,
netdev@...r.kernel.org, Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
Ivan Vecera <ivecera@...hat.com>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Daniel Borkmann <daniel@...earbox.net>,
Ido Schimmel <idosch@...dia.com>, linux-kernel@...r.kernel.org,
bridge@...ts.linux-foundation.org
Subject: Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation

On Wed, Mar 23, 2022 at 12:43:03PM +0100, Hans Schultz wrote:
> On Wed, Mar 23, 2022 at 13:21, Vladimir Oltean <olteanv@...il.com> wrote:
> > On Wed, Mar 23, 2022 at 11:57:16AM +0100, Hans Schultz wrote:
> >> >> >> Another issue I see, is that there is a deadlock or similar issue when
> >> >> >> receiving violations and running 'bridge fdb show' (it seemed that
> >> >> >> member violations also caused this, but not sure yet...), as the unit
> >> >> >> freezes, not to return...
> >> >> >
> >> >> > Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
> >> >> > like that?
> >> >>
> >> >> I have now determined that it is the rtnl_lock() that causes the
> >> >> "deadlock". The doit() in rtnetlink.c is under rtnl_lock() and is what
> >> >> takes care of getting the fdb entries when running 'bridge fdb show'. In
> >> >> principle there should be no problem with this, but I don't know if some
> >> >> interrupt queue is getting jammed as they are blocked from rtnetlink.c?
> >> >
> >> > Sorry, I forgot to respond yesterday to this.
> >> > By any chance do you maybe have an AB/BA lock inversion, where from the
> >> > ATU interrupt handler you do mv88e6xxx_reg_lock() -> rtnl_lock(), while
> >> > from the port_fdb_dump() handler you do rtnl_lock() -> mv88e6xxx_reg_lock()?
> >>
> >> If I release the mv88e6xxx_reg_lock() before calling the handler, I need
> >> to get it again for the mv88e6xxx_g1_atu_loadpurge() call at least. But
> >> maybe the vtu_walk also needs the mv88e6xxx_reg_lock()?
> >> I could also just release the mv88e6xxx_reg_lock() before the
> >> call_switchdev_notifiers() call and reacquire it immediately after?
> >
> > The cleanest way to go about this would be to have the call_switchdev_notifiers()
> > portion of the ATU interrupt handling at the very end of mv88e6xxx_g1_atu_prob_irq_thread_fn(),
> > with no hardware access needed, and therefore no reg_lock() held.
>
> So something like?
> mv88e6xxx_reg_unlock(chip);
> rtnl_lock();
> err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, brport, &info.info, NULL);
> rtnl_unlock();
> mv88e6xxx_reg_lock(chip);

No, call_switchdev_notifiers() should come at the very end, with no
reg_lock() taken afterwards. Do all the hardware handling you need,
populate some variables to record whether you need to notify switchdev,
and if you do, lock the rtnetlink mutex and make the call.
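
To make the ordering concrete, here is a rough user-space sketch of what I
mean. It is not driver code: the pthread mutexes stand in for
mv88e6xxx_reg_lock() and rtnl_lock(), and reg_lock(), reg_unlock(),
struct pending_fdb, notify_bridge() and atu_violation_thread_fn() are
hypothetical names made up for illustration. The MAC address and VID are
placeholder values, not something read from an ATU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for mv88e6xxx_reg_lock() and rtnl_lock(). */
static pthread_mutex_t reg_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rtnl_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Single-threaded bookkeeping for this sketch only. */
static bool reg_held;	/* true while the handler holds reg_mutex */
static int notified;	/* number of notifications delivered */

static void reg_lock(void)
{
	pthread_mutex_lock(&reg_mutex);
	reg_held = true;
}

static void reg_unlock(void)
{
	reg_held = false;
	pthread_mutex_unlock(&reg_mutex);
}

/* What the hardware phase leaves behind for the notification phase. */
struct pending_fdb {
	bool notify;
	unsigned char mac[6];
	unsigned short vid;
};

/* Stand-in for call_switchdev_notifiers(); must run without reg_mutex. */
static int notify_bridge(const struct pending_fdb *p)
{
	assert(!reg_held);
	(void)p;
	notified++;
	return 0;
}

/* Stand-in for the ATU violation thread function. */
static int atu_violation_thread_fn(bool miss_violation)
{
	struct pending_fdb pending = { 0 };
	int err = 0;

	reg_lock();
	/* All register access happens here; only record the outcome. */
	if (miss_violation) {
		static const unsigned char mac[6] = {
			0x00, 0x11, 0x22, 0x33, 0x44, 0x55
		};

		memcpy(pending.mac, mac, sizeof(pending.mac));
		pending.vid = 1;
		pending.notify = true;
	}
	reg_unlock();

	/* Very last step: notify under rtnl only, reg lock is not retaken. */
	if (pending.notify) {
		pthread_mutex_lock(&rtnl_mutex);
		err = notify_bridge(&pending);
		pthread_mutex_unlock(&rtnl_mutex);
	}

	return err;
}
```

With this shape the handler never holds both locks at once, so it cannot
form an AB/BA pair with a dump path that takes rtnl_lock() first and
reg_lock() second.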