Message-ID: <fbfce8d7-5bd8-723a-8ab3-0ce5bc6b073a@nvidia.com>
Date: Sun, 22 Aug 2021 12:12:02 +0300
From: Nikolay Aleksandrov <nikolay@...dia.com>
To: Ido Schimmel <idosch@...sch.org>
Cc: Vladimir Oltean <olteanv@...il.com>,
Vladimir Oltean <vladimir.oltean@....com>,
netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Roopa Prabhu <roopa@...dia.com>, Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Vadym Kochan <vkochan@...vell.com>,
Taras Chornyi <tchornyi@...vell.com>,
Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
UNGLinuxDriver@...rochip.com,
Grygorii Strashko <grygorii.strashko@...com>,
Marek Behun <kabel@...ckhole.sk>,
DENG Qingfang <dqfext@...il.com>,
Kurt Kanzenbach <kurt@...utronix.de>,
Hauke Mehrtens <hauke@...ke-m.de>,
Woojung Huh <woojung.huh@...rochip.com>,
Sean Wang <sean.wang@...iatek.com>,
Landen Chao <Landen.Chao@...iatek.com>,
Claudiu Manoil <claudiu.manoil@....com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
George McCollister <george.mccollister@...il.com>,
Ioana Ciornei <ioana.ciornei@....com>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>,
Lars Povlsen <lars.povlsen@...rochip.com>,
Steen Hegelund <Steen.Hegelund@...rochip.com>,
Julian Wiedmann <jwi@...ux.ibm.com>,
Karsten Graul <kgraul@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Ivan Vecera <ivecera@...hat.com>,
Vlad Buslov <vladbu@...dia.com>,
Jianbo Liu <jianbol@...dia.com>,
Mark Bloch <mbloch@...dia.com>, Roi Dayan <roid@...dia.com>,
Tobias Waldekranz <tobias@...dekranz.com>,
Vignesh Raghavendra <vigneshr@...com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>
Subject: Re: [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking

On 22/08/2021 09:48, Ido Schimmel wrote:
> On Sat, Aug 21, 2021 at 02:36:26AM +0300, Nikolay Aleksandrov wrote:
>> On 20/08/2021 20:06, Vladimir Oltean wrote:
>>> On Fri, Aug 20, 2021 at 07:09:18PM +0300, Ido Schimmel wrote:
>>>> On Fri, Aug 20, 2021 at 12:37:23PM +0300, Vladimir Oltean wrote:
>>>>> On Fri, Aug 20, 2021 at 12:16:10PM +0300, Ido Schimmel wrote:
>>>>>> On Thu, Aug 19, 2021 at 07:07:18PM +0300, Vladimir Oltean wrote:
>>>>>>> Problem statement:
>>>>>>>
>>>>>>> Any time a driver needs to create a private association between a bridge
>>>>>>> upper interface and use that association within its
>>>>>>> SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
>>>>>>> entries deleted by the bridge when the port leaves. The issue is that
>>>>>>> all switchdev drivers schedule a work item to have sleepable context,
>>>>>>> and that work item can be actually scheduled after the port has left the
>>>>>>> bridge, which means the association might have already been broken by
>>>>>>> the time the scheduled FDB work item attempts to use it.
>>>>>>
>>>>>> This is handled in mlxsw by telling the device to flush the FDB entries
>>>>>> pointing to the {port, FID} when the VLAN is deleted (synchronously).
>>>>>
>>>>> Again, central solution vs mlxsw solution.
>>>>
>>>> Again, a solution is forced on everyone regardless if it benefits them
>>>> or not. List is bombarded with version after version until patches are
>>>> applied. *EXHAUSTING*.
>>>
>>> So if I replace "bombarded" with a more neutral word, isn't that how
>>> it's done though? What would you do if you wanted to achieve something
>>> but the framework stood in your way? Would you work around it to avoid
>>> bombarding the list?
>>>
>>>> With these patches, except DSA, everyone gets another queue_work() for
>>>> each FDB entry. In some cases, it completely misses the purpose of the
>>>> patchset.
>>>
>>> I also fail to see the point. Patch 3 will have to make things worse
>>> before they get better. It is like that in DSA too, and made more
>>> reasonable only in the last patch from the series.
>>>
>>> If I saw any middle-ground way, like keeping the notifiers on the atomic
>>> chain for unconverted drivers, I would have done it. But what do you do
>>> if more than one driver listens for one event, one driver wants it
>>> blocking, the other wants it atomic. Do you make the bridge emit it
>>> twice? That's even worse than having one useless queue_work() in some
>>> drivers.
>>>
>>> So if you think I can avoid that please tell me how.
>>>
>>
>> Hi,
>> I don't like the double-queuing for each fdb for everyone either. It forces them
>> to rework it asap due to inefficiency, even though that shouldn't be necessary. In the
>> long run I hope everyone will migrate to such a scheme, but perhaps we can do it gradually.
>
> The fundamental problem is that these operations need to be deferred in
> the first place. It would have been much better if user space could get
> a synchronous feedback.
>
> It all stems from the fact that control plane operations need to be done
> under a spin lock because the shared databases (e.g., FDB, MDB) or
> states (e.g., STP) that they are updating can also be updated from the
> data plane in softIRQ.
>
Right, but changing that, as you've noted below, would require moving
the deferral into the bridge, and I'd like to avoid that.

> I don't have a clean solution to this problem without doing surgery in
> the bridge driver: deferring updates from the data plane using a work
> queue and converting the spin locks to mutexes. This will also allow us
> to emit netlink notifications from process context and convert
> GFP_ATOMIC to GFP_KERNEL.
>
> Is that something you consider as acceptable? Does anybody have a better
> idea?
>
Moving the deferral into the bridge for this purpose does not sound like a good solution.
I'd prefer the deferral to be done by the interested third party, as in this case, rather
than by the bridge. If there's a solution that avoids deferral and doesn't hurt the software
fast-path, then of course I'll be ok with that.
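
To make the trade-off concrete, here is a toy model of the race from the
problem statement quoted at the top of the thread. It is only a sketch in
userspace Python, not kernel code: the Driver class, its "assoc" table, the
deque standing in for a workqueue, and the port/bridge names are all
hypothetical and chosen for illustration. It assumes the deferred work runs
only after the port has already left the bridge, which is the bad ordering
described in the problem statement.

```python
from collections import deque


class Driver:
    """Hypothetical driver that keeps a private port -> bridge association."""

    def __init__(self, deferred):
        self.deferred = deferred  # True: queue work; False: handle event inline
        self.assoc = {}           # port name -> bridge name
        self.work = deque()       # stand-in for a kernel workqueue
        self.log = []

    def port_join(self, port, bridge):
        self.assoc[port] = bridge

    def port_leave(self, port):
        # The private association is torn down when the port leaves.
        self.assoc.pop(port, None)

    def fdb_del(self, port, mac):
        # Models the FDB delete event reaching the driver.
        if self.deferred:
            self.work.append((port, mac))
        else:
            self._do_fdb_del(port, mac)

    def _do_fdb_del(self, port, mac):
        bridge = self.assoc.get(port)
        if bridge is None:
            self.log.append(f"{mac}: association gone")
        else:
            self.log.append(f"{mac}: removed from {bridge}")

    def run_work(self):
        while self.work:
            self._do_fdb_del(*self.work.popleft())


def simulate(deferred):
    drv = Driver(deferred)
    drv.port_join("swp0", "br0")
    # The bridge deletes the port's FDB entries as the port leaves...
    drv.fdb_del("swp0", "00:11:22:33:44:55")
    drv.port_leave("swp0")
    # ...and any queued work only gets to run afterwards.
    drv.run_work()
    return drv.log


if __name__ == "__main__":
    print("deferred:", simulate(deferred=True))
    print("inline:  ", simulate(deferred=False))
```

With deferred=True the handler finds the association already gone; with
deferred=False it runs before the port leaves and still sees "br0". The model
says nothing about the cost of the extra queue_work() discussed above.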
>> For most drivers this introduces more work (as in processing) rather than helping
>> them right now. Either give them the option to convert to it on their own accord, or bite
>> the bullet and convert everyone so the change won't affect them. It holds rtnl and it is
>> blocking, so I don't see why not convert everyone to just execute their otherwise queued work.
>> I'm sure driver maintainers would appreciate such help and would test and review it. You're
>> halfway there already.
>>
>> Cheers,
>> Nik