[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87induxd3u.fsf@notabene.neil.brown.name>
Date: Wed, 29 Nov 2017 09:17:09 +1100
From: NeilBrown <neilb@...e.com>
To: paulmck@...ux.vnet.ibm.com, Florian Weimer <fweimer@...hat.com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Josh Triplett <josh@...htriplett.org>
Subject: Re: [PATCH] VFS: use synchronize_rcu_expedited() in namespace_unlock()
On Mon, Nov 27 2017, Paul E. McKenney wrote:
> On Mon, Nov 27, 2017 at 12:27:04PM +0100, Florian Weimer wrote:
>> On 10/26/2017 02:27 PM, Paul E. McKenney wrote:
>> >But just for completeness, one way to make this work across the board
>> >might be to instead use call_rcu(), with the callback function kicking
>> >off a workqueue handler to do the rest of the unmount. Of course,
>> >in saying that, I am ignoring any mutexes that you might be holding
>> >across this whole thing, and also ignoring any problems that might arise
>> >when returning to userspace with some portion of the unmount operation
>> >still pending. (For example, someone unmounting a filesystem and then
>> >immediately remounting that same filesystem.)
>>
>> You really need to complete all side effects of deallocating a
>> resource before returning to user space. Otherwise, it will never
>> be possible to allocate and deallocate resources in a tight loop
>> because you either get spurious failures because too many
>> unaccounted deallocations are stuck somewhere in the system (and the
>> user can't tell that this is due to a race), or you get an OOM
>> because the user manages to queue up too much state.
>>
>> We already have this problem with RLIMIT_NPROC, where waitpid etc.
>> return before the process is completely gone. On some
>> kernels/configurations, the resulting race is so wide that parallel
>> make no longer works reliable because it runs into fork failures.
>
> Or alternatively, use rcu_barrier() occasionally to wait for all
> preceding deferred deallocations. And there are quite a few other
> ways to take on this problem.
So, supposing we could package up everything that has to happen after
the current synchronize_rcu() and put it in an call_rcu() call back,
then instead of calling synchronize_rcu_expedited() at the end of
namespace_unlock(), we could possibly call call_rcu() there and
rcu_barrier() at the start of namespace_lock().....
That would mean a single unmount would have low impact, but it would
still slow down a sequence of 1000 consecutive unmounts.
Maybe we would only need the rcu_barrier() before select
namespace_lock() calls. I would need to study the code closely to
form an opinion. Interesting idea though.
Hopefully the _expedited() patch will be accepted - I haven't had a
"nak" yet...
thanks,
NeilBrown
Download attachment "signature.asc" of type "application/pgp-signature" (833 bytes)
Powered by blists - more mailing lists