Message-ID: <YSHzLKpixhCrrgJ0@shredder>
Date:   Sun, 22 Aug 2021 09:48:12 +0300
From:   Ido Schimmel <idosch@...sch.org>
To:     Nikolay Aleksandrov <nikolay@...dia.com>
Cc:     Vladimir Oltean <olteanv@...il.com>,
        Vladimir Oltean <vladimir.oltean@....com>,
        netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Roopa Prabhu <roopa@...dia.com>, Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Vadym Kochan <vkochan@...vell.com>,
        Taras Chornyi <tchornyi@...vell.com>,
        Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
        UNGLinuxDriver@...rochip.com,
        Grygorii Strashko <grygorii.strashko@...com>,
        Marek Behun <kabel@...ckhole.sk>,
        DENG Qingfang <dqfext@...il.com>,
        Kurt Kanzenbach <kurt@...utronix.de>,
        Hauke Mehrtens <hauke@...ke-m.de>,
        Woojung Huh <woojung.huh@...rochip.com>,
        Sean Wang <sean.wang@...iatek.com>,
        Landen Chao <Landen.Chao@...iatek.com>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        George McCollister <george.mccollister@...il.com>,
        Ioana Ciornei <ioana.ciornei@....com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Leon Romanovsky <leon@...nel.org>,
        Lars Povlsen <lars.povlsen@...rochip.com>,
        Steen Hegelund <Steen.Hegelund@...rochip.com>,
        Julian Wiedmann <jwi@...ux.ibm.com>,
        Karsten Graul <kgraul@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Ivan Vecera <ivecera@...hat.com>,
        Vlad Buslov <vladbu@...dia.com>,
        Jianbo Liu <jianbol@...dia.com>,
        Mark Bloch <mbloch@...dia.com>, Roi Dayan <roid@...dia.com>,
        Tobias Waldekranz <tobias@...dekranz.com>,
        Vignesh Raghavendra <vigneshr@...com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>
Subject: Re: [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
 blocking

On Sat, Aug 21, 2021 at 02:36:26AM +0300, Nikolay Aleksandrov wrote:
> On 20/08/2021 20:06, Vladimir Oltean wrote:
> > On Fri, Aug 20, 2021 at 07:09:18PM +0300, Ido Schimmel wrote:
> >> On Fri, Aug 20, 2021 at 12:37:23PM +0300, Vladimir Oltean wrote:
> >>> On Fri, Aug 20, 2021 at 12:16:10PM +0300, Ido Schimmel wrote:
> >>>> On Thu, Aug 19, 2021 at 07:07:18PM +0300, Vladimir Oltean wrote:
> >>>>> Problem statement:
> >>>>>
> >>>>> Any time a driver needs to create a private association between a bridge
> >>>>> upper interface and use that association within its
> >>>>> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
> >>>>> entries deleted by the bridge when the port leaves. The issue is that
> >>>>> all switchdev drivers schedule a work item to have sleepable context,
> >>>>> and that work item can be actually scheduled after the port has left the
> >>>>> bridge, which means the association might have already been broken by
> >>>>> the time the scheduled FDB work item attempts to use it.
> >>>>
> >>>> This is handled in mlxsw by telling the device to flush the FDB entries
> >>>> pointing to the {port, FID} when the VLAN is deleted (synchronously).
> >>>
> >>> Again, central solution vs mlxsw solution.
> >>
> >> Again, a solution is forced on everyone regardless if it benefits them
> >> or not. List is bombarded with version after version until patches are
> >> applied. *EXHAUSTING*.
> > 
> > So if I replace "bombarded" with a more neutral word, isn't that how
> > it's done though? What would you do if you wanted to achieve something
> > but the framework stood in your way? Would you work around it to avoid
> > bombarding the list?
> > 
> >> With these patches, except DSA, everyone gets another queue_work() for
> >> each FDB entry. In some cases, it completely misses the purpose of the
> >> patchset.
> > 
> > I also fail to see the point. Patch 3 will have to make things worse
> > before they get better. It is like that in DSA too, and made more
> > reasonable only in the last patch from the series.
> > 
> > If I saw any middle-ground way, like keeping the notifiers on the atomic
> > chain for unconverted drivers, I would have done it. But what do you do
> > if more than one driver listens for one event, one driver wants it
> > blocking, the other wants it atomic. Do you make the bridge emit it
> > twice? That's even worse than having one useless queue_work() in some
> > drivers.
> > 
> > So if you think I can avoid that please tell me how.
> > 
> 
> Hi,
> I don't like the double-queuing for each fdb for everyone either, it's forcing them
> to rework it asap due to inefficiency even though that shouldn't be necessary. In the
> long run I hope everyone would migrate to such scheme, but perhaps we can do it gradually.

The fundamental problem is that these operations need to be deferred in
the first place. It would have been much better if user space could get
synchronous feedback.

It all stems from the fact that control plane operations need to be done
under a spin lock because the shared databases (e.g., FDB, MDB) or
states (e.g., STP) that they are updating can also be updated from the
data plane in softIRQ.

I don't have a clean solution to this problem without doing surgery in
the bridge driver: deferring updates from the data plane using a work
queue and converting the spin locks to mutexes. This will also allow us
to emit netlink notifications from process context and convert
GFP_ATOMIC to GFP_KERNEL.

Is that something you consider acceptable? Does anybody have a better
idea?
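
To make the idea above concrete, here is a rough user-space sketch. It is
not bridge code: every name in it (fdb_lock, queue_lock, dataplane_learn,
learn_work, control_fdb_add, toy_driver_fdb_add) is hypothetical, pthread
mutexes stand in for kernel locks, and the "driver" is a stub that rejects
ports >= 4. It only shows the shape: the data plane queues updates, a work
item applies them in process context, and the control plane calls the
driver synchronously and hands the error back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FDB_SIZE   8	/* toy table size (assumption) */
#define QUEUE_CAP 16	/* toy pending-learn queue size (assumption) */

struct fdb_entry { uint64_t mac; int port; bool used; };

/* Shared database, protected by a sleepable lock. */
static pthread_mutex_t fdb_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fdb_entry fdb[FDB_SIZE];

/* Pending learns; this short lock stands in for the softIRQ-side lock. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fdb_entry queue[QUEUE_CAP];
static int queue_len;

/* Hypothetical driver hook: may sleep, returns 0 or a negative errno. */
static int toy_driver_fdb_add(uint64_t mac, int port)
{
	(void)mac;
	return port < 4 ? 0 : -ENOSPC;
}

/* Caller must hold fdb_lock. Updates an existing entry or takes a free slot. */
static int fdb_insert_locked(uint64_t mac, int port)
{
	int i;

	for (i = 0; i < FDB_SIZE; i++)
		if (fdb[i].used && fdb[i].mac == mac) {
			fdb[i].port = port;
			return 0;
		}
	for (i = 0; i < FDB_SIZE; i++)
		if (!fdb[i].used) {
			fdb[i] = (struct fdb_entry){ mac, port, true };
			return 0;
		}
	return -ENOMEM;
}

/* Data plane: never touches fdb[] directly, it only queues the update. */
bool dataplane_learn(uint64_t mac, int port)
{
	bool queued = false;

	pthread_mutex_lock(&queue_lock);
	if (queue_len < QUEUE_CAP) {
		queue[queue_len++] = (struct fdb_entry){ mac, port, true };
		queued = true;
	}
	pthread_mutex_unlock(&queue_lock);
	return queued;
}

/* Work item: drains the queue in process context; returns entries applied. */
int learn_work(void)
{
	struct fdb_entry batch[QUEUE_CAP];
	int n, i, applied = 0;

	pthread_mutex_lock(&queue_lock);
	n = queue_len;
	assert(n <= QUEUE_CAP);
	memcpy(batch, queue, (size_t)n * sizeof(batch[0]));
	queue_len = 0;
	pthread_mutex_unlock(&queue_lock);

	pthread_mutex_lock(&fdb_lock);
	for (i = 0; i < n; i++)
		if (fdb_insert_locked(batch[i].mac, batch[i].port) == 0)
			applied++;
	pthread_mutex_unlock(&fdb_lock);
	return applied;
}

/* Control plane: the driver is called synchronously under the mutex, so
 * its error reaches the caller instead of being lost in deferred work. */
int control_fdb_add(uint64_t mac, int port)
{
	int err;

	pthread_mutex_lock(&fdb_lock);
	err = toy_driver_fdb_add(mac, port);
	if (!err)
		err = fdb_insert_locked(mac, port);
	pthread_mutex_unlock(&fdb_lock);
	return err;
}

/* Returns the port for @mac, or -ENOENT if it is not in the table. */
int fdb_lookup(uint64_t mac)
{
	int i, port = -ENOENT;

	pthread_mutex_lock(&fdb_lock);
	for (i = 0; i < FDB_SIZE; i++)
		if (fdb[i].used && fdb[i].mac == mac)
			port = fdb[i].port;
	pthread_mutex_unlock(&fdb_lock);
	return port;
}
```

In this toy model, control_fdb_add() for a port the stub rejects returns
-ENOSPC directly and leaves the table untouched, while an address passed
to dataplane_learn() only becomes visible to fdb_lookup() after
learn_work() has run. Notifying the driver about learned entries is
deliberately left out to keep the sketch short.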

> For most drivers this is introducing more work (as in processing) rather than helping
> them right now, give them the option to convert to it on their own accord or bite
> the bullet and convert everyone so the change won't affect them, it holds rtnl, it is blocking
> I don't see why not convert everyone to just execute their otherwise queued work.
> I'm sure driver maintainers would appreciate such help and would test and review it. You're
> halfway there already..
> 
> Cheers,
>  Nik
> 
