Message-ID: <40147FA4-9FD8-4349-B309-751DFCE875A2@primarydata.com>
Date: Mon, 31 Oct 2016 16:11:02 +0000
From: Trond Myklebust <trondmy@...marydata.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: Anna Schumaker <anna.schumaker@...app.com>,
	Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>
Subject: Re: [PATCH v3] NFSv4: replace seqcount_t with a rw_semaphore
> On Oct 31, 2016, at 11:56, Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
>
> On 2016-10-31 15:30:02 [+0000], Trond Myklebust wrote:
>>> On Oct 31, 2016, at 09:19, Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
>>> The list_for_each_entry() in nfs4_reclaim_open_state():
>>> It seems that the ->so_lock protects the ->so_states list among other
>>> atomic_t & flags members. At the beginning of each loop iteration we
>>> increment ->count, ensuring that the entry is not freed while we use
>>> it, and we drop the ->so_lock for the duration of the loop body. After
>>> the nfs4_reclaim_locks() invocation we nfs4_put_open_state() and grab
>>> the ->so_lock again. So if we were the last user of this struct and it
>>> gets removed, then the following list_next_entry() invocation is a
>>> use-after-free. Even if we used list_for_each_entry_safe() there would
>>> be no guarantee that the next entry is still valid, because it might
>>> have been removed by someone invoking nfs4_put_open_state() on it,
>>> right?
>>> So there is this.
>>>
>>> However, to address my initial problem I have here a patch :) It uses
>>> a rw_semaphore, which ensures that there is either a single writer at
>>> a time or multiple readers. So it should be basically what is
>>> happening now, plus a tiny lock, plus lockdep coverage. I tried to
>>> test this myself but I didn't manage to get into this code path at
>>> all, so I might be doing something wrong.
>>>
>>> Could you please check if this patch is working for you and whether my
>>> list_for_each_entry() observation is correct or not?
>>>
>>> v2…v3: replace the seqlock with an RW semaphore.
>>>
>>
>> NACK. That will deadlock. We use a seqlock there precisely because we cannot allow ordinary RPC calls to lock out the recovery thread.
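To spell out the deadlock (a hand-written sketch of the proposed scheme, using a hypothetical so_reclaim_rwsem field rather than whatever the patch actually names it):

	/* Ordinary RPC path, e.g. an OPEN call: */
	down_read(&sp->so_reclaim_rwsem);
	status = _nfs4_proc_open(opendata);	/* after a server reboot this
						 * blocks until state has been
						 * re-established */
	up_read(&sp->so_reclaim_rwsem);

	/* State recovery thread: */
	down_write(&sp->so_reclaim_rwsem);	/* waits for all readers to
						 * drop out, i.e. for the
						 * blocked RPCs above */
	nfs4_reclaim_open_state(sp, ops);
	up_write(&sp->so_reclaim_rwsem);

The recovery thread cannot take the write lock until every reader is gone, and the readers cannot finish until recovery has run.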
> Hmmm. So this code path is invoked if I reboot the server? And a
> restart of nfs-kernel-server is the same thing?
> Is the list_for_each_entry() observation I made correct?
Yes, and yes. We can’t rely on the list pointers remaining correct, so we restart the list scan and we use the ops->state_flag_bit to signal whether or not state has been recovered for the entry being scanned.
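Roughly, the loop in nfs4_reclaim_open_state() looks like this (a sketch from memory, with the error handling trimmed):

restart:
	spin_lock(&sp->so_lock);
	list_for_each_entry(state, &sp->so_states, open_states) {
		if (!test_and_clear_bit(ops->state_flag_bit, &state->flags))
			continue;		/* already recovered */
		atomic_inc(&state->count);	/* pin the entry */
		spin_unlock(&sp->so_lock);
		status = ops->recover_open(sp, state);	/* may sleep */
		nfs4_put_open_state(state);	/* may free 'state' */
		goto restart;	/* the list pointers may be stale now,
				 * so rescan from the head */
	}
	spin_unlock(&sp->so_lock);

Entries whose state_flag_bit has already been cleared are skipped on the rescan, so restarting from the head still makes forward progress.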
>
>> If the server reboots, then ordinary RPC calls will fail until the recovery thread has had a chance to re-establish the state.
>
> This means that the ordinary RPC call won't return and fail, but will
> wait, with the lock held, until the recovery thread has done its thing?
It uses the seqcount_t to figure out if a recovery occurred while an OPEN or CLOSE was being processed. If so, it schedules a new recovery of the stateids in question.
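On the OPEN side the check looks roughly like this (simplified from nfs4_do_open(), with the error handling trimmed):

	seq = raw_seqcount_begin(&sp->so_reclaim_seqcount);
	status = _nfs4_open_and_get_state(opendata, fmode, flags, ctx);
	if (status != 0)
		goto out;
	state = ctx->state;
	if (read_seqcount_retry(&sp->so_reclaim_seqcount, seq))
		/* recovery ran while the OPEN was in flight, so the
		 * stateid we just obtained may already be stale */
		nfs4_schedule_stateid_recovery(server, state);

The read side never blocks, so an ordinary RPC call can never lock out the recovery thread; at worst it has to schedule recovery of its own stateid after the fact. That is the property a rw_semaphore cannot give you.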