Message-ID: <4b77bf39-bc1a-47a1-9a16-14c44c31614f@oracle.com>
Date: Thu, 13 Nov 2025 13:12:30 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: "Tyler W. Ross" <TWR@...erwross.com>
Cc: "1120598@...s.debian.org" <1120598@...s.debian.org>,
Jeff Layton <jlayton@...nel.org>, NeilBrown <neil@...wn.name>,
Scott Mayhew <smayhew@...hat.com>, Steve Dickson <steved@...hat.com>,
Salvatore Bonaccorso <carnil@...ian.org>,
Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, Trond Myklebust <trondmy@...nel.org>,
Anna Schumaker <anna@...nel.org>, linux-nfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: ls input/output error ("NFS: readdir(/) returns -5") on krb5
NFSv4 client using SHA2
On 11/13/25 1:05 PM, Tyler W. Ross wrote:
> On Thursday, November 13th, 2025 at 10:47 AM, Chuck Lever <chuck.lever@...cle.com> wrote:
>
>>> ls-969 [003] ..... 270.327063: rpc_xdr_recvfrom: task:00000008@...00005 head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] len=988
>>> ls-969 [003] ..... 270.327067: rpc_xdr_overflow: task:00000008@...00005 nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
>>
>>
>> Here's the problem. This is a sign of an XDR decoding issue. If you
>> capture the traffic with Wireshark, does Wireshark indicate where the
>> XDR is malformed?
>
> Wireshark appears to decode the READDIR reply without issue. Nothing is obviously marked as malformed, and values all appear sane when spot-checking fields in the decoded packet.
Then I would start looking for differences between the Debian 13 and
Fedora 43 kernel code bases under net/sunrpc/.

Alternatively, "git bisect first, ask questions later" ... :-)

I didn't find an indication of whether this mount used sec=krb5,
sec=krb5i, or sec=krb5p. Knowing that might narrow down where the code
changed.

Also, the xdr_buf might have a page boundary positioned in the middle of
an XDR data item. Knowing which data item is being decoded when the
"overflow" occurs would be helpful (I think adding pr_info() or
trace_printk() call sites will be adequate to gain better
observability).
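
As a rough illustration, the pointer arithmetic in the rpc_xdr_overflow
event quoted above can be checked with a few lines of Python. This is
only a sketch: the helper name overflow_summary() is hypothetical, and
it assumes the trace fields mean what their names suggest (p is the
current decode position, end is the limit of the current buffer
segment, and the first bracketed pair after xdr= is the head kvec base
and length).

```python
import re

# Trace line copied from the quoted rpc_xdr_overflow event.
TRACE = ("rpc_xdr_overflow: task:00000008@...00005 nfsv4 READDIR requested=8 "
         "p=0xffff8895c29fefec end=0xffff8895c29feff0 "
         "xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988")


def overflow_summary(line):
    """Return (requested, remaining, end_is_head_end) for one overflow event.

    Field meanings are assumed from the field names in the trace output.
    """
    requested = int(re.search(r"requested=(\d+)", line).group(1))
    p = int(re.search(r"\bp=(0x[0-9a-f]+)", line).group(1), 16)
    end = int(re.search(r"\bend=(0x[0-9a-f]+)", line).group(1), 16)
    head_base, head_len = re.search(r"xdr=\[(0x[0-9a-f]+),(\d+)\]", line).groups()
    # Address just past the last byte of the head kvec.
    head_end = int(head_base, 16) + int(head_len)
    return requested, end - p, head_end == end


if __name__ == "__main__":
    print(overflow_summary(TRACE))  # (8, 4, True)
```

Under those assumptions, the decoder asked for 8 bytes when only 4
remained before end, and end coincides with the end of the 140-byte
head. That would be consistent with a data item straddling the boundary
between the head and the page data, but it does not by itself identify
which item was being decoded.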
--
Chuck Lever