lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c3b73d4a-12fc-469e-8b54-ba714dbef54e@oracle.com>
Date: Wed, 29 Jan 2025 09:42:39 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
        Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
        Tom Talpey <tom@...pey.com>, "J. Bruce Fields" <bfields@...ldses.org>,
        Kinglong Mee <kinglongmee@...il.com>,
        Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and
 cleanups

On 1/29/25 9:39 AM, Jeff Layton wrote:
> On Wed, 2025-01-29 at 09:22 -0500, Chuck Lever wrote:
>> On 1/29/25 8:39 AM, Jeff Layton wrote:
>>> While looking over the CB_SEQUENCE error handling, I discovered that
>>> callbacks don't hold a reference to a session, and the
>>> clp->cl_cb_session could easily change between request and response.
>>> If that happens at an inopportune time, there could be UAFs or weird
>>> slot/sequence handling problems.
>>>
>>> This series changes the nfsd4_session to be RCU-freed, and then adds a
>>> new method of session refcounting that is compatible with the old.
>>> nfsd4_callback RPCs will now hold a lightweight reference to the session
>>> in addition to the slot. Then, all of the callback handling is switched
>>> to use that session instead of dereferencing clp->cb_cb_session.
>>> I've also reworked the error handling in nfsd4_cb_sequence_done()
>>> based on review comments, and lifted the v4.0 handing out of that
>>> function.
>>>
>>> This passes pynfs, nfstests, and fstests for me, but I'm not sure how
>>> much any of that stresses the backchannel's error handling.
>>>
>>> These should probably go in via Chuck's tree, but the last patch touches
>>> some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
>>> from Trond and/or Anna on that one.
>>
>> A few initial reactions as I get to know this new revision.
>>
>> - I have no objection to 7/7, but it does seem a bit out of place in
>>     this series. Maybe hold it back and send it separately after this
>>     series goes in?
>>
>> - The fact that the session can be replaced while a callback operation
>>     is pending suggests that, IIUC, decode_cb_sequence() sanity checking
>>     will fail in such cases, and it's not because of a bug in the client's
>>     callback server. Or maybe I'm overthinking it - that is exactly what
>>     you are trying to prevent?
>>
> 
> That's the best-case scenario, but callbacks can run at any time. If
> this happens at the wrong time this could crash or cause more subtle
> problems than just a spurious ESERVERFAULT. IOW, we could pass
> decode_cb_sequence(), then the pointer changes and then
> nfsd4_cb_sequence_done() ends up working with a different session.
> 
>> - In 1/7, the kdoc comment for "get" should enumerate the return values
>>     and their meanings.
>>
> 
> Ack
> 
>> - cb_session_changed => nfsd4_cb_session_changed.
>>
> 
> Ack
> 
>> - I'm still not convinced it's wise to bump the slot number in the
>>     ESERVERFAULT case.
>>
> 
> It's debatable for sure. The client _did_ respond with NFS4_OK in this
> case, but it is a bit sketchy since something else didn't match. I'm
> fine with removing that seq bump if you prefer.

I'd remove it: if the session/slot/seq number don't match, the NFS4_OK
is pretty meaningless.

Flag a session fault, and requeue the RPC.


>> - IMO the cb_sequence_done rework should rename "need_restart" to
>>     "need_requeue" or just "requeue" -- there is a call to
>>     rpc_restart_call_prepare() here that is a little confusing and
>>     could do with some disambiguation.
>>
> 
> Good point. I'll change that too.
> 
> Thanks for the review!
> 
>>
>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>> ---
>>> Changes in v2:
>>> - make nfsd4_session be RCU-freed
>>> - change code to keep reference to session over callback RPCs
>>> - rework error handling in nfsd4_cb_sequence_done()
>>> - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>> - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@kernel.org
>>>
>>> ---
>>> Jeff Layton (7):
>>>         nfsd: add routines to get/put session references for callbacks
>>>         nfsd: make clp->cl_cb_session be an RCU managed pointer
>>>         nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
>>>         nfsd: overhaul CB_SEQUENCE error handling
>>>         nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
>>>         nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
>>>         sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
>>>
>>>    fs/nfs/nfs4proc.c           |  12 ++-
>>>    fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
>>>    fs/nfsd/nfs4state.c         |  43 ++++++++-
>>>    fs/nfsd/state.h             |   6 +-
>>>    fs/nfsd/trace.h             |   6 +-
>>>    include/linux/sunrpc/clnt.h |   4 +-
>>>    net/sunrpc/clnt.c           |   7 +-
>>>    7 files changed, 210 insertions(+), 80 deletions(-)
>>> ---
>>> base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
>>> change-id: 20250123-nfsd-6-14-b0797e385dc0
>>>
>>> Best regards,
>>
>>
> 


-- 
Chuck Lever

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ