Message-ID: <40147FA4-9FD8-4349-B309-751DFCE875A2@primarydata.com>
Date: Mon, 31 Oct 2016 16:11:02 +0000
From: Trond Myklebust <trondmy@...marydata.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: Anna Schumaker <anna.schumaker@...app.com>,
	Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>
Subject: Re: [PATCH v3] NFSv4: replace seqcount_t with a rw_semaphore
> On Oct 31, 2016, at 11:56, Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
>
> On 2016-10-31 15:30:02 [+0000], Trond Myklebust wrote:
>>> On Oct 31, 2016, at 09:19, Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
>>> The list_for_each_entry() in nfs4_reclaim_open_state():
>>> It seems that the ->so_lock protects the ->so_states list among other
>>> atomic_t & flags members. At the beginning of each loop iteration we
>>> increment ->count, ensuring that the entry is not freed while we use
>>> it, and we drop the ->so_lock for the duration of the loop body. After
>>> the nfs4_reclaim_locks() invocation we nfs4_put_open_state() and grab
>>> the ->so_lock again. So if we were the last user of this struct and it
>>> gets removed, then the following list_next_entry() invocation is a
>>> use-after-free. Even if we used list_for_each_entry_safe() there would
>>> be no guarantee that the next entry is still valid, because it might
>>> have been removed by someone invoking nfs4_put_open_state() on it,
>>> right?
>>> So there is this.
>>>
>>> However, to address my initial problem I have here a patch :) It uses
>>> a rw_semaphore, which ensures that there is either a single writer at
>>> a time or multiple readers. So it should be basically what is
>>> happening now, plus a tiny lock, plus lockdep coverage. I tried to
>>> test this myself but I didn't manage to get into this code path at
>>> all, so I might be doing something wrong.
>>>
>>> Could you please check if this patch is working for you and whether my
>>> list_for_each_entry() observation is correct or not?
>>>
>>> v2…v3: replace the seqlock with an RW semaphore.
>>>
>>
>> NACK. That will deadlock. We use a seqlock there precisely because we cannot allow ordinary RPC calls to lock out the recovery thread.
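To spell out the deadlock (a hand-written sketch of the proposed scheme, using a hypothetical so_reclaim_rwsem field rather than whatever the patch actually names it):

	/* Ordinary RPC path, e.g. an OPEN call: */
	down_read(&sp->so_reclaim_rwsem);
	status = _nfs4_proc_open(opendata);	/* after a server reboot this
						 * blocks until state has been
						 * re-established */
	up_read(&sp->so_reclaim_rwsem);

	/* State recovery thread: */
	down_write(&sp->so_reclaim_rwsem);	/* waits for all readers to
						 * drop out, i.e. for the
						 * blocked RPCs above */
	nfs4_reclaim_open_state(sp, ops);
	up_write(&sp->so_reclaim_rwsem);

The recovery thread cannot take the write lock until every reader is gone, and the readers cannot finish until recovery has run.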
> Hmmm. So this code path is invoked if I reboot the server? And a
> restart of nfs-kernel-server is the same thing?
> Is the list_for_each_entry() observation I made correct?
Yes, and yes. We can’t rely on the list pointers remaining correct, so we restart the list scan and we use the ops->state_flag_bit to signal whether or not state has been recovered for the entry being scanned.
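Roughly, the loop in nfs4_reclaim_open_state() looks like this (a sketch from memory, with the error handling trimmed):

restart:
	spin_lock(&sp->so_lock);
	list_for_each_entry(state, &sp->so_states, open_states) {
		if (!test_and_clear_bit(ops->state_flag_bit, &state->flags))
			continue;		/* already recovered */
		atomic_inc(&state->count);	/* pin the entry */
		spin_unlock(&sp->so_lock);
		status = ops->recover_open(sp, state);	/* may sleep */
		nfs4_put_open_state(state);	/* may free 'state' */
		goto restart;	/* the list pointers may be stale now,
				 * so rescan from the head */
	}
	spin_unlock(&sp->so_lock);

Entries whose state_flag_bit has already been cleared are skipped on the rescan, so restarting from the head still makes forward progress.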
>
>> If the server reboots, then ordinary RPC calls will fail until the recovery thread has had a chance to re-establish the state.
>
> This means that the ordinary RPC call won't return and fail, but will
> wait, with the lock held, until the recovery thread has done its thing?
It uses the seqcount_t to figure out if a recovery occurred while an OPEN or CLOSE was being processed. If so, it schedules a new recovery of the stateids in question.
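On the OPEN side the check looks roughly like this (simplified from nfs4_do_open(), with the error handling trimmed):

	seq = raw_seqcount_begin(&sp->so_reclaim_seqcount);
	status = _nfs4_open_and_get_state(opendata, fmode, flags, ctx);
	if (status != 0)
		goto out;
	state = ctx->state;
	if (read_seqcount_retry(&sp->so_reclaim_seqcount, seq))
		/* recovery ran while the OPEN was in flight, so the
		 * stateid we just obtained may already be stale */
		nfs4_schedule_stateid_recovery(server, state);

The read side never blocks, so an ordinary RPC call can never lock out the recovery thread; at worst it has to schedule recovery of its own stateid after the fact. That is the property a rw_semaphore cannot give you.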