[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20211029084527.57d38f09@hermes.local>
Date: Fri, 29 Oct 2021 08:45:27 -0700
From: Stephen Hemminger <stephen@...workplumber.org>
To: Antoine Tenart <atenart@...nel.org>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
gregkh@...uxfoundation.org, ebiederm@...ssion.com,
herbert@...dor.apana.org.au, juri.lelli@...hat.com,
netdev@...r.kernel.org, mhocko@...e.com
Subject: Re: [RFC PATCH net-next 0/9] Userspace spinning on net-sysfs access
On Fri, 29 Oct 2021 16:33:05 +0200
Antoine Tenart <atenart@...nel.org> wrote:
> With the approach taken in this thread not going too well[1], what next?
> I think we should discuss other possibilities and gather some ideas.
> Below are some early thoughts, that might not be acceptable.
>
> 1) Making an rtnl_lock helper that returns when needed
>
> The idea would be to replace rtnl_trylock/restart_syscall by this
> helper, which would try to grab the rtnl lock and return when needed.
> Something like the following:
>
> static rtnl_lock_ifalive(const struct net_device *dev)
> {
> while (!rtnl_trylock()) {
> if (!dev_isalive(dev))
> return -EINVAL;
>
> /* Special case for queue files */
> if (dev->drain_sysfs_queues)
> return restart_syscall();
>
> /* something not to have the CPU spinning */
> }
> }
>
> One way not to have the CPU spinning is to sleep, let's say with
> `usleep_range(500, 1000);` (range to be defined properly). The
> disadvantage is on net device unregistration as we might need to wait
> for all those loops to return first. (It's a trade-off though, we're not
> restarting syscalls over and over in other situations). Or would there
> be something better?
>
> Possible improvements:
> - Add an overall timeout and restart the syscall if we hit it, to have
> an upper bound.
> - Make it interruptible, check for need_resched, etc.
>
> Note that this approach could work for sysctl files as well; looping
> over all devices in a netns to make the checks.
>
> 2) Interrupt rtnl_lock when in the unregistration paths
>
> I'm wondering if using mutex_lock_interruptible in problematic areas
> (sysfs, sysctl), keeping track of their tasks and interrupting them in
> the unregistration paths would work and be acceptable. On paper this
> looks like a solution with not much overhead and not too invasive to
> implement. But is it acceptable? (Are there some side effects we really
> don't want?).
>
> Note that this would need some thinking to make it safe against sysfs
> accesses between the tasks interruption and the sysfs files draining
> (refcount? another lock?).
>
> 3) Other ideas?
>
> Also, I'm not sure about the -rt implications of all the above.
>
> Thanks,
> Antoine
>
> [1] https://lore.kernel.org/netdev/163549826664.3523.4140191764737040064@kwain/
As the originator of net-sysfs and the trylock, I appreciate your attempts
to do something better. The best solution may actually not be just in the
networking code but go all the way back up to restart_syscall() itself.
Spinning a CPU itself is not a bad thing, see spin locks.
The problem is if the spin causes the active CPU to not make progress.
Powered by blists - more mailing lists