Message-ID: <YQqM8XUyHKVaj1WF@unreal>
Date: Wed, 4 Aug 2021 15:49:53 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Cong Wang <xiyou.wangcong@...il.com>,
David Miller <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
"Cong Wang ." <cong.wang@...edance.com>,
Peilin Ye <peilin.ye@...edance.com>,
Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH net-next] Revert "netdevsim: Add multi-queue support"
On Wed, Aug 04, 2021 at 04:52:47AM -0700, Jakub Kicinski wrote:
> On Wed, 4 Aug 2021 10:14:36 +0300 Leon Romanovsky wrote:
> > On Tue, Aug 03, 2021 at 02:51:24PM -0700, Jakub Kicinski wrote:
> > > On Tue, 3 Aug 2021 14:32:19 -0700 Cong Wang wrote:
> > > > On Tue, Aug 3, 2021 at 2:18 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > <...>
> >
> > > > Please remove all those not covered by upstream tests just to be fair??
> > >
> > > I'd love to remove all test harnesses upstream which are not used by
> > > upstream tests, sure :)
> >
> > Jakub,
> >
> > Something related and unrelated at the same time.
> >
> > I need to get rid of devlink_reload_enable()/_disable() to fix some
> > panics in the devlink reload flow.
> >
> > Such a change is relatively easy for the HW drivers, but not so for
> > netdevsim due to its attempt to synchronize sysfs with devlink.
> >
> >     mutex_lock(&nsim_bus_dev->nsim_bus_reload_lock);
> >     devlink_reload_disable(devlink);
> >     ret = nsim_dev_port_add(nsim_bus_dev, NSIM_DEV_PORT_TYPE_PF, port_index);
> >     devlink_reload_enable(devlink);
> >     mutex_unlock(&nsim_bus_dev->nsim_bus_reload_lock);
> >
> > Are these sysfs files declared as UAPI? Or can I update upstream test
> > suite and delete them safely?
>
> You can change netdevsim in whatever way is appropriate.
>
> What's your plan, tho? Jiri changed the spawning from rtnetlink
> to sysfs - may be good to consult with him before typing too much
> code.

It is something preliminary. I have POC code that works, but it is still
far from being actual patches.

The problem is that "devlink reload" in its current form causes us
(mlx5) a lot of grief. We see deadlocks caused by combinations of internal
flows with external ones. Without going into too much detail, loops of
module removal combined with health recovery and devlink reload do not
work properly :).

The same problem exists in all drivers that implement "devlink reload";
mlx5 is just the most complicated one and, it seems, the most tested as well.

My idea (for now) is pretty simple:
1. Move devlink ops callbacks from the devlink_alloc() phase to devlink_register().
2. Ensure that devlink_register() is the last command in the probe sequence.
3. Delete devlink_reload_enable()/_disable(). They are not needed if the
   proper ops are used.
4. Add reference counting to struct devlink to make sure that we
   properly account for netlink users, so we will be able to drop the big
   devlink_lock.
5. Convert the linked list of devlink instances to an xarray. It gives us
   an option to work relatively lockless.
....
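
To illustrate steps 1, 2 and 4 above, here is a minimal user-space sketch.
It is not kernel code and not the POC: every toy_* name is hypothetical, and
C11 atomics stand in for whatever refcount and locking primitives the real
patches would use. It only shows the intended ordering: ops become visible
at register time, and each user holds a reference while calling into them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a devlink-like instance; not kernel API. */
struct toy_ops {
	int (*reload)(void *priv);
};

struct toy_devlink {
	atomic_int refcount;        /* step 4: counts users of the instance */
	atomic_bool registered;     /* set only once probe has finished */
	const struct toy_ops *ops;  /* step 1: attached at register time */
	void *priv;
};

/* Allocation phase: no ops yet, the owner holds the only reference. */
static void toy_init(struct toy_devlink *dl, void *priv)
{
	atomic_init(&dl->refcount, 1);
	atomic_init(&dl->registered, false);
	dl->ops = NULL;
	dl->priv = priv;
}

/* Steps 1 and 2: publish ops as the last action of the probe sequence. */
static void toy_register(struct toy_devlink *dl, const struct toy_ops *ops)
{
	dl->ops = ops;
	atomic_store(&dl->registered, true);
}

static void toy_unregister(struct toy_devlink *dl)
{
	atomic_store(&dl->registered, false);
}

/* Take a reference unless the count has already dropped to zero. */
static bool toy_try_get(struct toy_devlink *dl)
{
	int old = atomic_load(&dl->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&dl->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last one is gone. */
static bool toy_put(struct toy_devlink *dl)
{
	return atomic_fetch_sub(&dl->refcount, 1) == 1;
}

/* A user request: runs the op only on a registered, referenced instance. */
static int toy_reload(struct toy_devlink *dl)
{
	int ret = -1;

	if (!toy_try_get(dl))
		return -1;
	if (atomic_load(&dl->registered) && dl->ops && dl->ops->reload)
		ret = dl->ops->reload(dl->priv);
	toy_put(dl);
	return ret;
}

/* Sample callback for illustration: counts how often reload was called. */
static int toy_count_reload(void *priv)
{
	++*(int *)priv;
	return 0;
}

static const struct toy_ops toy_sample_ops = { .reload = toy_count_reload };
```

In this model a reload request issued before toy_register() or after
toy_unregister() is simply rejected, which is the role the explicit
enable/disable calls play today.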

Every step solves some bug; even the first one solves a current bug where
devlink reload statistics are presented despite devlink_reload_disable().

Of course, we can try to patch devlink with a specific fix for each specific
bug, but it is better to make it robust against such errors from the beginning.

So I'm trying to get a sense of what can and what can't be done in netdev.
And netdevsim's combination of devlink and sysfs knobs adds challenges. :)
Thanks