Message-Id: <53902B2E-10E5-4CC1-B05B-D962D3C69FC5@linuxhacker.ru>
Date: Tue, 5 Jul 2016 14:12:37 -0400
From: Oleg Drokin <green@...uxhacker.ru>
To: Al Viro <viro@...IV.linux.org.uk>
Cc: Mailing List <linux-kernel@...r.kernel.org>,
"<linux-fsdevel@...r.kernel.org>" <linux-fsdevel@...r.kernel.org>
Subject: Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.
On Jul 5, 2016, at 1:42 PM, Al Viro wrote:
> On Tue, Jul 05, 2016 at 11:21:32AM -0400, Oleg Drokin wrote:
>>> ...
>>> - if (d_unhashed(*de)) {
>>> + if (d_in_lookup(*de)) {
>>> struct dentry *alias;
>>>
>>> alias = ll_splice_alias(inode, *de);
>>
>> This breaks Lustre because we now might progress further in this function
>> without calling into ll_splice_alias and that's the only place that we do
>> ll_d_init() that later code depends on so we violently crash next time
>> we call e.g. d_lustre_revalidate() further down that code.
>
> Huh? How the hell do those conditions differ there?
As I explained in my other email: because this is the normal lookup path,
we can get here with a new dentry that was allocated in __hash_lookup via
plain d_alloc() (not d_alloc_parallel()), so it is not marked with the PAR bit.
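To make the difference between the two conditions concrete, here is a minimal
userspace sketch. It is not kernel code: struct mock_dentry and every mock_*
or lookup_* name below are hypothetical stand-ins, and only the state
discussed above is modelled (hashed or not, PAR bit or not, and whether the
per-dentry fs data that ll_d_init() sets up exists).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry; models only what this thread discusses. */
struct mock_dentry {
    bool hashed;     /* models !d_unhashed() */
    bool in_lookup;  /* models the PAR bit that d_in_lookup() tests */
    bool fs_inited;  /* models the per-dentry data set up via ll_d_init() */
};

/* Plain allocation: unhashed, and NOT marked in-lookup. */
static inline struct mock_dentry mock_d_alloc(void)
{
    return (struct mock_dentry){ .hashed = false, .in_lookup = false, .fs_inited = false };
}

/* Parallel allocation: unhashed, and marked in-lookup. */
static inline struct mock_dentry mock_d_alloc_parallel(void)
{
    return (struct mock_dentry){ .hashed = false, .in_lookup = true, .fs_inited = false };
}

/* Models the splice step, assumed to be the only place fs data gets initialised. */
static inline void mock_splice(struct mock_dentry *d)
{
    d->fs_inited = true;
    d->in_lookup = false;
    d->hashed = true;
}

/* Old condition: splice whenever the dentry is unhashed. */
static inline bool lookup_old(struct mock_dentry *d)
{
    if (!d->hashed)
        mock_splice(d);
    return d->fs_inited;  /* false means later code would touch missing fs data */
}

/* New condition: splice only when the dentry is in-lookup. */
static inline bool lookup_new(struct mock_dentry *d)
{
    if (d->in_lookup)
        mock_splice(d);
    return d->fs_inited;
}
```

In this model a dentry from mock_d_alloc() passes the old check but skips the
splice under the new one, so it reaches the later code without its fs data,
which corresponds to the crash described above. A dentry from
mock_d_alloc_parallel() is handled the same way by both conditions.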
>> Also I still wonder what's to stop d_alloc_parallel() from returning
>> a hashed dentry with d_in_lookup() still true?
>
> The fact that such dentries do not exist at any point?
>
>> Certainly there's a big gap between hashing the dentry and dropping the PAR
>> bit in there that I imagine might allow __d_lookup_rcu() to pick it up
>> in between?
>
> WTF? Where do you see that gap? in-lookup dentries get hashed only in one
> place - __d_add(). And there (besides holding ->d_lock around both) we
> drop that bit in flags *before* _d_rehash(). AFAICS, the situation with
> barriers is OK there, due to lockref_get_not_dead() serving as ACQUIRE
> operation; I could be missing something subtle, but a wide gap... Where?
Oh! I see, I missed that __d_add() drops the PAR bit as well; it is not only
the code at the end of the call that does d_alloc_parallel().
Then indeed there is no gap, sorry for the false alarm.
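For the record, the ordering argument can be sketched in userspace as below.
This is only an illustration under loud assumptions: it is single-threaded,
it ignores ->d_lock, RCU and memory barriers entirely, and struct mock_dentry
and the mock_* functions are hypothetical names, not the kernel's. It just
counts the intermediate states in which a dentry would be both hashed and
still in-lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dentry; two bits of state only. */
struct mock_dentry {
    bool hashed;     /* visible to hash lookups */
    bool in_lookup;  /* models the PAR bit */
};

/* The state I was worried about: hashed while still marked in-lookup. */
static inline bool gap_visible(const struct mock_dentry *d)
{
    return d->hashed && d->in_lookup;
}

/*
 * Order described in the quoted reply: drop the bit first, then hash.
 * Returns how many intermediate states exposed the gap.
 */
static inline int mock_d_add_clear_then_hash(struct mock_dentry *d)
{
    int gaps = 0;

    d->in_lookup = false;   /* step 1: drop the PAR bit */
    gaps += gap_visible(d);
    d->hashed = true;       /* step 2: make it findable */
    gaps += gap_visible(d);
    return gaps;
}

/* The order I mistakenly assumed: hash first, drop the bit later. */
static inline int mock_d_add_hash_then_clear(struct mock_dentry *d)
{
    int gaps = 0;

    d->hashed = true;       /* step 1: findable while still in-lookup */
    gaps += gap_visible(d);
    d->in_lookup = false;   /* step 2: bit dropped too late */
    gaps += gap_visible(d);
    return gaps;
}
```

With the clear-then-hash order no intermediate state is both hashed and
in-lookup; only the order I had wrongly assumed exposes one.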