Message-ID: <c7136bad-5a00-4224-a25c-0cf7e8252f4a@oracle.com>
Date: Thu, 13 Nov 2025 12:47:23 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: "Tyler W. Ross" <TWR@...erwross.com>,
"1120598@...s.debian.org" <1120598@...s.debian.org>,
Jeff Layton <jlayton@...nel.org>, NeilBrown <neil@...wn.name>,
Scott Mayhew <smayhew@...hat.com>, Steve Dickson <steved@...hat.com>,
Salvatore Bonaccorso <carnil@...ian.org>
Cc: Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, Trond Myklebust <trondmy@...nel.org>,
Anna Schumaker <anna@...nel.org>, linux-nfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: ls input/output error ("NFS: readdir(/) returns -5") on krb5
NFSv4 client using SHA2
On 11/13/25 12:16 PM, Tyler W. Ross wrote:
> Thanks, Chuck.
>
> Suggested trace-cmd report from the client follows. Last 3 lines appear salient, but I've included the full report just in case.
>
> <idle>-0 [001] ..s2. 270.327040: xs_data_ready: peer=[10.108.2.102]:2049
> kworker/u16:0-12 [001] ...1. 270.327048: xprt_lookup_rqst: peer=[10.108.2.102]:2049 xid=0x7b569c7a status=0
> kworker/u16:0-12 [001] ...2. 270.327050: rpc_task_wakeup: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=0x6 status=0 timeout=15000 queue=xprt_pending
> kworker/u16:0-12 [001] ..... 270.327054: xs_stream_read_request: peer=[10.108.2.102]:2049 xid=0x7b569c7a copied=988 reclen=988 offset=988
> kworker/u16:0-12 [001] ..... 270.327055: xs_stream_read_data: peer=[10.108.2.102]:2049 err=-11 total=992
> ls-969 [003] ..... 270.327062: rpc_task_sync_wake: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
> ls-969 [003] ..... 270.327062: rpc_task_run_action: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=xprt_timer
> ls-969 [003] ..... 270.327063: rpc_task_run_action: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
> ls-969 [003] ..... 270.327063: rpc_task_run_action: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_decode
> ls-969 [003] ..... 270.327063: rpc_xdr_recvfrom: task:00000008@...00005 head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] len=988
> ls-969 [003] ..... 270.327067: rpc_xdr_overflow: task:00000008@...00005 nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
Here's the problem. This is a sign of an XDR decoding issue. If you
capture the traffic with Wireshark, does Wireshark indicate where the
XDR is malformed?
If it doesn't, then there is some problem with the client code. Since
Fedora 43 is working as expected, I would guess there's a misapplied
patch on Debian 13's kernel...?
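For reference, the arithmetic in the rpc_xdr_overflow line above can be checked with a short sketch. It is only an illustration, not a diagnosis. The layout of the xdr= field (head base and length, page length, tail base and length, total length) is an assumption, read by analogy with the head=/page=/tail=/len= labels on the preceding rpc_xdr_recvfrom line, and the helper name summarize_overflow is hypothetical.

```python
import re

# The rpc_xdr_overflow line, copied from the client trace quoted above.
TRACE = (
    "rpc_xdr_overflow: task:00000008@...00005 nfsv4 READDIR requested=8 "
    "p=0xffff8895c29fefec end=0xffff8895c29feff0 "
    "xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988"
)

# Assumed layout of xdr=: [head base,head len]/page len/[tail base,tail len]/total len
PATTERN = re.compile(
    r"requested=(?P<requested>\d+) "
    r"p=(?P<p>0x[0-9a-f]+) end=(?P<end>0x[0-9a-f]+) "
    r"xdr=\[(?P<head_base>0x[0-9a-f]+),(?P<head_len>\d+)\]/(?P<page_len>\d+)/"
    r"\[(?P<tail_base>0x[0-9a-f]+),(?P<tail_len>\d+)\]/(?P<total_len>\d+)"
)


def to_int(text):
    """Convert a trace field to int; addresses are hex, everything else decimal."""
    return int(text, 16) if text.startswith("0x") else int(text)


def summarize_overflow(line):
    """Hypothetical helper: compare the decoder's position with the buffer end."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no rpc_xdr_overflow fields found")
    f = {name: to_int(value) for name, value in match.groupdict().items()}
    remaining = f["end"] - f["p"]  # bytes left before the end pointer
    return {
        "requested": f["requested"],
        "remaining": remaining,
        "shortfall": f["requested"] - remaining,
        "offset_in_head": f["p"] - f["head_base"],
        # True when the end pointer coincides with the end of the head buffer
        "end_is_head_end": f["head_base"] + f["head_len"] == f["end"],
    }


if __name__ == "__main__":
    for key, value in summarize_overflow(TRACE).items():
        print(f"{key}: {value}")
```

With the values from this trace, the decoder asked for 8 bytes at offset 136 of the 140-byte head, where only 4 bytes remained before the end pointer. Whether the server encoded the reply incorrectly or the client mis-sized its receive buffer cannot be determined from this arithmetic alone, which is why the Wireshark comparison matters.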
> ls-969 [003] ..... 270.327068: rpc_task_run_action: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=rpc_exit_task
> ls-969 [003] ..... 270.327068: rpc_task_end: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=rpc_exit_task
> ls-969 [003] ..... 270.327068: rpc_stats_latency: task:00000008@...00005 xid=0x7b569c7a nfsv4 READDIR backlog=7 rtt=110 execute=137 xprt_id=1
> ls-969 [003] ..... 270.327068: rpc_task_call_done: task:00000008@...00005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=nfs41_call_sync_done
> ls-969 [003] ..... 270.327068: nfs4_sequence_done: error=0 (OK) session=0x5988ad3c slot_nr=0 seq_nr=26 highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> ls-969 [003] ...1. 270.327069: xprt_release_xprt: task:00000008@...00005 snd_task:ffffffff
> ls-969 [003] ...1. 270.327070: nfs_set_cache_invalid: error=0 (OK) fileid=00:2d:262146 fhandle=0xad8c294c type=4 (DIR) version=31 size=4096 cache_validity=0x4 (INVALID_ATIME) nfs_flags=0x4 (ACL_LRU_SET)
> ls-969 [003] ..... 270.327070: nfs4_readdir: error=-5 (EIO) fileid=00:2d:262146 fhandle=0xad8c294c
> ls-969 [003] ..... 270.327071: nfs_readdir_cache_fill_done: error=-5 (IO) fileid=00:2d:262146 fhandle=0xad8c294c type=4 (DIR) version=31 size=4096 cache_validity=0x4 (INVALID_ATIME) nfs_flags=0x4 (ACL_LRU_SET)
--
Chuck Lever