Message-ID: <45a64f2e-d095-c931-d0c7-23f50e791901@oracle.com>
Date: Thu, 3 Aug 2023 10:01:24 -0700
From: dai.ngo@...cle.com
To: Jeff Layton <jlayton@...nel.org>,
Chuck Lever III <chuck.lever@...cle.com>
Cc: Neil Brown <neilb@...e.de>, Olga Kornievskaia <kolga@...app.com>,
Tom Talpey <tom@...pey.com>,
Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] nfsd: don't hand out write delegations on O_WRONLY opens

On 8/3/23 4:27 AM, Jeff Layton wrote:
> On Wed, 2023-08-02 at 16:38 -0700, dai.ngo@...cle.com wrote:
>>
>> On 8/2/23 2:52 PM, Jeff Layton wrote:
>>
>>> On Wed, 2023-08-02 at 14:32 -0700, dai.ngo@...cle.com wrote:
>>>
>>>> On 8/2/23 2:22 PM, dai.ngo@...cle.com wrote:
>>>>
>>>>
>>>>>
>>>>>
>>>>> On 8/2/23 1:57 PM, Chuck Lever III wrote:
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@...nel.org> wrote:
>>>>>>>
>>>>>>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@...cle.com wrote:
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/2/23 11:15 AM, Jeff Layton wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@...cle.com wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I noticed that xfstests generic/001 was failing against
>>>>>>>>>>> linux-next nfsd.
>>>>>>>>>>>
>>>>>>>>>>> The client would request an OPEN4_SHARE_ACCESS_WRITE open, and
>>>>>>>>>>> the server
>>>>>>>>>>> would hand out a write delegation. The client would then try to
>>>>>>>>>>> use that
>>>>>>>>>>> write delegation as the source stateid in a COPY
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> not sure why the client opens the source file of a COPY operation
>>>>>>>>>> with
>>>>>>>>>> OPEN4_SHARE_ACCESS_WRITE?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It doesn't. The original open is to write the data for the file being
>>>>>>>>> copied. It then opens the file again for READ, but since it has a
>>>>>>>>> write
>>>>>>>>> delegation, it doesn't need to talk to the server at all -- it can
>>>>>>>>> just
>>>>>>>>> use that stateid for later operations.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> or CLONE operation, and
>>>>>>>>>>> the server would respond with NFS4ERR_STALE.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If the server does not allow client to use write delegation for the
>>>>>>>>>> READ, should the correct error return be NFS4ERR_OPENMODE?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The server must allow the client to use a write delegation for read
>>>>>>>>> operations. It's required by the spec, AFAIU.
>>>>>>>>>
>>>>>>>>> The error in this case was just bogus. The vfs copy routine would
>>>>>>>>> return
>>>>>>>>> -EBADF since the file didn't have FMODE_READ, and the nfs server
>>>>>>>>> would
>>>>>>>>> translate that into NFS4ERR_STALE.
>>>>>>>>>
>>>>>>>>> Probably there is a better v4 error code that we could translate
>>>>>>>>> EBADF
>>>>>>>>> to, but with this patch it shouldn't be a problem any longer.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The problem is that the struct file associated with the
>>>>>>>>>>> delegation does
>>>>>>>>>>> not necessarily have read permissions. It's handing out a write
>>>>>>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881
>>>>>>>>>>> states:
>>>>>>>>>>>
>>>>>>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to
>>>>>>>>>>> handle, on its
>>>>>>>>>>> own, all opens."
>>>>>>>>>>>
>>>>>>>>>>> Given that the client didn't request any read permissions, and
>>>>>>>>>>> that nfsd
>>>>>>>>>>> didn't check for any, it seems wrong to give out a write
>>>>>>>>>>> delegation.
>>>>>>>>>>>
>>>>>>>>>>> Only hand out a write delegation if we have an O_RDWR descriptor
>>>>>>>>>>> available. If it fails to find an appropriate write descriptor, go
>>>>>>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was
>>>>>>>>>>> requested.
>>>>>>>>>>>
>>>>>>>>>>> This fixes xfstest generic/001.
>>>>>>>>>>>
>>>>>>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412
>>>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>>>>>>>>>> ---
>>>>>>>>>>> Changes in v2:
>>>>>>>>>>> - Rework the logic when finding struct file for the delegation. The
>>>>>>>>>>> earlier patch might still have attached an O_WRONLY file to
>>>>>>>>>>> the deleg
>>>>>>>>>>> in some cases, and could still have handed out a write
>>>>>>>>>>> delegation on
>>>>>>>>>>> an O_WRONLY OPEN request in some cases.
>>>>>>>>>>> ---
>>>>>>>>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++-----------
>>>>>>>>>>> 1 file changed, 18 insertions(+), 11 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>>>>>>>>> index ef7118ebee00..e79d82fd05e7 100644
>>>>>>>>>>> --- a/fs/nfsd/nfs4state.c
>>>>>>>>>>> +++ b/fs/nfsd/nfs4state.c
>>>>>>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open
>>>>>>>>>>> *open, struct nfs4_ol_stateid *stp,
>>>>>>>>>>> struct nfs4_file *fp = stp->st_stid.sc_file;
>>>>>>>>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate;
>>>>>>>>>>> struct nfs4_delegation *dp;
>>>>>>>>>>> - struct nfsd_file *nf;
>>>>>>>>>>> + struct nfsd_file *nf = NULL;
>>>>>>>>>>> struct file_lock *fl;
>>>>>>>>>>> u32 dl_type;
>>>>>>>>>>>
>>>>>>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open
>>>>>>>>>>> *open, struct nfs4_ol_stateid *stp,
>>>>>>>>>>> if (fp->fi_had_conflict)
>>>>>>>>>>> return ERR_PTR(-EAGAIN);
>>>>>>>>>>>
>>>>>>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
>>>>>>>>>>> - nf = find_writeable_file(fp);
>>>>>>>>>>> + /*
>>>>>>>>>>> + * Try for a write delegation first. We need an O_RDWR file
>>>>>>>>>>> + * since a write delegation allows the client to perform any open
>>>>>>>>>>> + * from its cache.
>>>>>>>>>>> + */
>>>>>>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) ==
>>>>>>>>>>> NFS4_SHARE_ACCESS_BOTH) {
>>>>>>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]);
>>>>>>>>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE;
>>>>>>>>>>> - } else {
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Does this mean OPEN4_SHARE_ACCESS_WRITE does not get a write
>>>>>>>>>> delegation?
>>>>>>>>>> It does not seem right.
>>>>>>>>>>
>>>>>>>>>> -Dai
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Why? Per RFC 8881:
>>>>>>>>>
>>>>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on
>>>>>>>>> its
>>>>>>>>> own, all opens."
>>>>>>>>>
>>>>>>>>> All opens. That includes read opens.
>>>>>>>>>
>>>>>>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the
>>>>>>>>> user has no read permissions. Therefore, we can't grant a write
>>>>>>>>> delegation since we can't guarantee that the user is allowed to do that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> If the server grants the write delegation on an OPEN with
>>>>>>>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then
>>>>>>>> why can't the server check and deny the subsequent READ?
>>>>>>>>
>>>>>>>> Per RFC 8881, section 9.1.2:
>>>>>>>>
>>>>>>>> For delegation stateids, the access mode is based on the type of
>>>>>>>> delegation.
>>>>>>>>
>>>>>>>> When a READ, WRITE, or SETATTR (that specifies the size
>>>>>>>> attribute)
>>>>>>>> operation is done, the operation is subject to checking
>>>>>>>> against the
>>>>>>>> access mode to verify that the operation is appropriate given the
>>>>>>>> stateid with which the operation is associated.
>>>>>>>>
>>>>>>>> In the case of WRITE-type operations (i.e., WRITEs and
>>>>>>>> SETATTRs that
>>>>>>>> set size), the server MUST verify that the access mode allows
>>>>>>>> writing
>>>>>>>> and MUST return an NFS4ERR_OPENMODE error if it does not. In
>>>>>>>> the case
>>>>>>>> of READ, the server may perform the corresponding check on the
>>>>>>>> access
>>>>>>>> mode, or it may choose to allow READ on OPENs for
>>>>>>>> OPEN4_SHARE_ACCESS_WRITE,
>>>>>>>> to accommodate clients whose WRITE implementation may
>>>>>>>> unavoidably do
>>>>>>>> reads (e.g., due to buffer cache constraints). However, even
>>>>>>>> if READs
>>>>>>>> are allowed in these circumstances, the server MUST still
>>>>>>>> check for
>>>>>>>> locks that conflict with the READ (e.g., another OPEN specified
>>>>>>>> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a
>>>>>>>> server
>>>>>>>> that does enforce the access mode check on READs need not
>>>>>>>> explicitly
>>>>>>>> check for conflicting share reservations since the existence
>>>>>>>> of OPEN
>>>>>>>> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share
>>>>>>>> reservation can exist.
>>>>>>>>
>>>>>>>> FWIW, the Solaris server grants write delegation on OPEN with
>>>>>>>> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or
>>>>>>>> WR-only. Maybe this is a bug? or the spec is not clear?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I don't think that's necessarily a bug.
>>>>>>>
>>>>>>> It's not that the spec demands that we only hand out delegations on
>>>>>>> BOTH
>>>>>>> opens. This is more of a quirk of the Linux implementation. Linux'
>>>>>>> write delegations require an open O_RDWR file descriptor because we may
>>>>>>> be called upon to do a read on its behalf.
>>>>>>>
>>>>>>> Technically, we could probably just have it check for
>>>>>>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set,
>>>>>>> then you're unlikely to get a delegation. Either the O_RDWR descriptor
>>>>>>> will be NULL, or there are other, conflicting opens already present.
>>>>>>>
>>>>>>> Solaris may have a completely different design that doesn't require
>>>>>>> this. I haven't looked at its code to know.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm comfortable for now with not handing out write delegations for
>>>>>> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on
>>>>>> every READ operation.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> I'm fine with just handing out write delegation for SHARE_ACCESS_BOTH
>>>>> only.
>>>>>
>>>>> Just a concern about not checking for access at the time of READ
>>>>> operation.
>>>>>
>>>>>
>>>>
>>>>
>>>> or not checking file permission at the time of WRITE.
>>>>
>>>>
>>>>>
>>>>>
>>>>> If the file was opened with SHARE_ACCESS_WRITE (no write delegation
>>>>> granted)
>>>>> and the file access mode was changed to read-only, is it a correct
>>>>> behavior
>>>>> for the server to allow the READ to go through?
>>>>>
>>>>>
>>>>
>>>>
>>>> I meant for the WRITE to go through.
>>>>
>>>>
>>>
>>>
>>> Yes:
>>>
>>> POSIX permissions enforcement is done at open time, not when doing
>>> actual reads and writes. If you open a file on (e.g.) xfs and start
>>> streaming writes to it, then you don't expect that you will lose the
>>> ability to write to that fd if the permissions change.
>>>
>>> In the old v2/3 days of stateless NFS, we had to check permissions on
>>> every READ or WRITE operation, but we generally did an open on every RPC
>>> too, so it just worked out that we checked permissions on each
>>> operation.
>>>
>>> With v4 we can better approximate POSIX semantics by just associating a
>>> stateid with an open file to allow the client to keep writing in this
>>> case.
>>>
>>>
>>
>>
>> Thanks Jeff,
> Don't thank me yet. I went back and looked at the code, and it looks
> like we still do check permissions on every READ/WRITE (see
> nfs4_check_file).
>
> I'm unclear on whether that's required, but it's probably safest to
> always check permissions like we are. That does mean that if the mode of
> the file changes after we open it we could end up being unable to read
> or write to it (much like with v2/3), but at this point most people are
> used to that sort of behavior on NFS, so I don't worry about it too
> much.
It might not conform to POSIX permissions enforcement, but I like what
the server is doing right now: permissions are enforced correctly on
every operation, and the behavior is consistent across v2/3/4.
-Dai