Message-ID: <04b06966-ad5b-46ce-a629-b6de7b428360@oracle.com>
Date: Mon, 27 Jan 2025 08:28:56 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Li Lingfeng <lilingfeng3@...wei.com>, jlayton@...nel.org, neilb@...e.de,
        okorniev@...hat.com, kolga@...app.com, Dai.Ngo@...cle.com,
        tom@...pey.com, trondmy@...merspace.com, linux-nfs@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc: yukuai1@...weicloud.com, houtao1@...wei.com, yi.zhang@...wei.com,
        yangerkun@...wei.com, lilingfeng@...weicloud.com
Subject: Re: [PATCH 1/2] nfsd: map the ELOOP to nfserr_symlink to avoid
 warning

On 1/26/25 9:33 PM, Li Lingfeng wrote:
> 
> On 2025/1/27 1:27, Chuck Lever wrote:
>> On 1/26/25 4:50 AM, Li Lingfeng wrote:
>>> We got -ELOOP from ext4, resulting in the following WARNING:
>>>
>>> VFS: Lookup of 'dc' in ext4 sdd would have caused loop
>>> ------------[ cut here ]------------
>>> nfsd: non-standard errno: -40
>>> WARNING: CPU: 1 PID: 297024 at fs/nfsd/vfs.c:113 nfserrno+0xc8/0x128
>>> Modules linked in:
>>> CPU: 1 PID: 297024 Comm: nfsd Not tainted 6.6.0-gfa4c2159cd0d-dirty #21
>>> Hardware name: linux,dummy-virt (DT)
>>> pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> pc : nfserrno+0xc8/0x128
>>> lr : nfserrno+0xc8/0x128
>>> sp : ffff8000846475a0
>>> x29: ffff8000846475a0 x28: 0000000000000130 x27: ffff0000d65a24e8
>>> x26: ffff0000c7319134 x25: ffff0000d6de4240 x24: 0000000000000002
>>> x23: ffffcda9eaac3080 x22: 00000000ffffffd8 x21: 0000000000000026
>>> x20: ffffcda9ee055000 x19: 0000000000000000 x18: 0000000000000000
>>> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>> x14: 0000000000000000 x13: 0000000000000001 x12: ffff60001b5ca39b
>>> x11: 1fffe0001b5ca39a x10: ffff60001b5ca39a x9 : dfff800000000000
>>> x8 : 00009fffe4a35c66 x7 : ffff0000dae51cd3 x6 : 0000000000000001
>>> x5 : ffff0000dae51cd0 x4 : ffff60001b5ca39b x3 : dfff800000000000
>>> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000ca5d8040
>>> Call trace:
>>>   nfserrno+0xc8/0x128
>>>   nfsd4_encode_dirent_fattr+0x358/0x380
>>>   nfsd4_encode_dirent+0x164/0x3a8
>>>   nfsd_buffered_readdir+0x1a8/0x3a0
>>>   nfsd_readdir+0x14c/0x188
>>>   nfsd4_encode_readdir+0x1d4/0x370
>>>   nfsd4_encode_operation+0x130/0x518
>>>   nfsd4_proc_compound+0x394/0xec0
>>>   nfsd_dispatch+0x264/0x418
>>>   svc_process_common+0x584/0xc78
>>>   svc_process+0x1e8/0x2c0
>>>   svc_recv+0x194/0x2d0
>>>   nfsd+0x198/0x378
>>>   kthread+0x1d8/0x1f0
>>>   ret_from_fork+0x10/0x20
>>> Kernel panic - not syncing: kernel: panic_on_warn set ...
>>>
>>> The ELOOP error in Linux indicates that too many symbolic links were
>>> encountered in resolving a path name. Mapping it to nfserr_symlink
>>> may be fine.
>>>
>>> Signed-off-by: Li Lingfeng <lilingfeng3@...wei.com>
>>> ---
>>>   fs/nfsd/vfs.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>> index 29cb7b812d71..0f727010b8cb 100644
>>> --- a/fs/nfsd/vfs.c
>>> +++ b/fs/nfsd/vfs.c
>>> @@ -100,6 +100,7 @@ nfserrno (int errno)
>>>           { nfserr_perm, -ENOKEY },
>>>           { nfserr_no_grace, -ENOGRACE},
>>>           { nfserr_io, -EBADMSG },
>>> +        { nfserr_symlink, -ELOOP },
>>>       };
>>>       int    i;
>>
>> Adding ELOOP -> SYMLINK as a generic mapping could be a problem.
>>
>> RFC 8881 Section 15.2 does not list NFS4ERR_SYMLINK as a permissible
>> status code for NFSv4 READDIR. Further, Section 15.4 lists only the
>> following operations for NFS4ERR_SYMLINK:
>>
>> COMMIT, LAYOUTCOMMIT, LINK, LOCK, LOCKT, LOOKUP, LOOKUPP, OPEN, READ, WRITE
>>
>>
>> Which of lookup_positive_unlocked() or nfsd_cross_mnt() returned
>> ELOOP, and why? What were the export options? What was in the file
>> system that caused this? Can this scenario be reproduced on v6.13?
>>
> Hi,
> I got a more detailed log with line numbers from our test team.
> 
> VFS: Lookup of 'dc' in ext4 sdd would have caused loop
> ------------[ cut here ]------------
> nfsd: non-standard errno: -40
> WARNING: CPU: 1 PID: 297024 at fs/nfsd/vfs.c:113 nfserrno fs/nfsd/vfs.c:113 [inline]
> WARNING: CPU: 1 PID: 297024 at fs/nfsd/vfs.c:113 nfserrno+0xc8/0x128 fs/nfsd/vfs.c:61
> Modules linked in:
> CPU: 1 PID: 297024 Comm: nfsd Not tainted 6.6.0-gfa4c2159cd0d-dirty #21
> Hardware name: linux,dummy-virt (DT)
> pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : nfserrno fs/nfsd/vfs.c:113 [inline]
> pc : nfserrno+0xc8/0x128 fs/nfsd/vfs.c:61
> lr : nfserrno fs/nfsd/vfs.c:113 [inline]
> lr : nfserrno+0xc8/0x128 fs/nfsd/vfs.c:61
> sp : ffff8000846475a0
> x29: ffff8000846475a0 x28: 0000000000000130 x27: ffff0000d65a24e8
> x26: ffff0000c7319134 x25: ffff0000d6de4240 x24: 0000000000000002
> x23: ffffcda9eaac3080 x22: 00000000ffffffd8 x21: 0000000000000026
> x20: ffffcda9ee055000 x19: 0000000000000000 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> x14: 0000000000000000 x13: 0000000000000001 x12: ffff60001b5ca39b
> x11: 1fffe0001b5ca39a x10: ffff60001b5ca39a x9 : dfff800000000000
> x8 : 00009fffe4a35c66 x7 : ffff0000dae51cd3 x6 : 0000000000000001
> x5 : ffff0000dae51cd0 x4 : ffff60001b5ca39b x3 : dfff800000000000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000ca5d8040
> Call trace:
>   nfserrno fs/nfsd/vfs.c:113 [inline]
>   nfserrno+0xc8/0x128 fs/nfsd/vfs.c:61
>   nfsd4_encode_dirent_fattr+0x358/0x380 fs/nfsd/nfs4xdr.c:3536
>   nfsd4_encode_dirent+0x164/0x3a8 fs/nfsd/nfs4xdr.c:3633
>   nfsd_buffered_readdir+0x1a8/0x3a0 fs/nfsd/vfs.c:2067
>   nfsd_readdir+0x14c/0x188 fs/nfsd/vfs.c:2123
>   nfsd4_encode_readdir+0x1d4/0x370 fs/nfsd/nfs4xdr.c:4273
>   nfsd4_encode_operation+0x130/0x518 fs/nfsd/nfs4xdr.c:5399
>   nfsd4_proc_compound+0x394/0xec0 fs/nfsd/nfs4proc.c:2753
>   nfsd_dispatch+0x264/0x418 fs/nfsd/nfssvc.c:1011
>   svc_process_common+0x584/0xc78 net/sunrpc/svc.c:1396
>   svc_process+0x1e8/0x2c0 net/sunrpc/svc.c:1542
>   svc_recv+0x194/0x2d0 net/sunrpc/svc_xprt.c:877
>   nfsd+0x198/0x378 fs/nfsd/nfssvc.c:955
>   kthread+0x1d8/0x1f0 kernel/kthread.c:388
>   ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:861
> 
> Although I don't have a reproducer for this problem, I think
> ELOOP is returned via the following path:
> 
> v6.6
> nfsd4_encode_readdir
>   nfsd_readdir
>    nfsd_buffered_readdir
>     nfsd4_encode_dirent // func
>      nfsd4_encode_dirent_fattr
>       nfsd4_encode_dirent_fattr
>        lookup_positive_unlocked
>         lookup_one_positive_unlocked
>          lookup_one_unlocked // ELOOP
>           lookup_slow
>            __lookup_slow
>             ext4_lookup // inode->i_op->lookup
>              d_splice_alias
>               // VFS: Lookup of 'dc' in ext4 sdd would have caused loop
> 
> This scenario may be reproduced on v6.13 like this:
> nfsd4_encode_readdir
>   nfsd4_encode_dirlist4
>    nfsd_readdir
>     nfsd_buffered_readdir
>      nfsd4_encode_entry4 // func
>       nfsd4_encode_entry4_fattr
>        lookup_positive_unlocked
>         lookup_one_positive_unlocked
>          lookup_one_unlocked
>           lookup_slow
>            __lookup_slow
>             ext4_lookup // inode->i_op->lookup
>              d_splice_alias

So: lookup_positive_unlocked() is the VFS API returning it. Got it.


> According to the information provided by the test team, the export option
> is "rw,no_root_squash", and I'll try to reproduce the problem.
> 
> By the way, could you suggest which NFS error code would be most
> appropriate to map ELOOP to?

NFS4ERR_SYMLINK is closest. But the spec does not permit that status
for every operation; in particular, READDIR does not allow it.
So I'm quite hesitant to fix the crash you found by adding this
mapping to nfserrno.
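
The concern about a generic mapping can be illustrated with a small
user-space sketch. It is not the nfsd code: the names nfs_op,
symlink_allowed and map_errno_for_op are hypothetical, only a few
operations are modeled, and falling back to NFS4ERR_IO when the
operation cannot carry NFS4ERR_SYMLINK is an assumption made purely
for illustration. The two status values use the numbering from
RFC 8881.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Status values as numbered in RFC 8881 (nfsstat4). */
enum nfs_status {
	NFS4ERR_IO      = 5,
	NFS4ERR_SYMLINK = 10029,
};

/* Hypothetical operation tags; only a few operations are modeled. */
enum nfs_op { OP_LOOKUP, OP_OPEN, OP_READ, OP_READDIR, OP_COUNT };

/*
 * A subset of the operations listed above as permitting
 * NFS4ERR_SYMLINK. READDIR is deliberately absent, so its slot
 * stays false.
 */
static const bool symlink_allowed[OP_COUNT] = {
	[OP_LOOKUP] = true,
	[OP_OPEN]   = true,
	[OP_READ]   = true,
};

/*
 * Map a negative errno to a status for one specific operation.
 * Assumption: NFS4ERR_IO is used as a placeholder whenever the
 * errno is not -ELOOP or the operation may not return
 * NFS4ERR_SYMLINK.
 */
static enum nfs_status map_errno_for_op(enum nfs_op op, int err)
{
	assert((unsigned int)op < OP_COUNT);

	if (err == -ELOOP && symlink_allowed[op])
		return NFS4ERR_SYMLINK;
	return NFS4ERR_IO;
}
```

With this shape, map_errno_for_op(OP_LOOKUP, -ELOOP) yields
NFS4ERR_SYMLINK while map_errno_for_op(OP_READDIR, -ELOOP) does not,
which is the distinction a single global table entry cannot express.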

In this case, I wonder if READDIR can simply not return attributes
when it hits an error.
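
A minimal user-space sketch of that idea follows. It only models the
behavior and is not a patch: struct dir_entry, struct encoded_entry
and encode_entries are hypothetical names, and lookup_err stands in
for whatever the per-entry lookup returned. A failed lookup drops the
attributes for that one entry while the listing as a whole still
succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical directory entry: lookup_err models the result of the
 * per-entry lookup (0 on success, negative errno on failure).
 */
struct dir_entry {
	const char *name;
	int lookup_err;
};

/* Hypothetical encoded form of one READDIR entry. */
struct encoded_entry {
	const char *name;
	bool has_attrs;
};

/*
 * Encode up to 'cap' entries and return how many were encoded.
 * An entry whose lookup failed is still listed by name; only its
 * attributes are omitted, and no error is propagated.
 */
static size_t encode_entries(const struct dir_entry *in, size_t n,
			     struct encoded_entry *out, size_t cap)
{
	size_t count = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n && count < cap; i++) {
		out[count].name = in[i].name;
		out[count].has_attrs = (in[i].lookup_err == 0);
		count++;
	}
	return count;
}
```

In this model an entry such as 'dc' whose lookup returned -ELOOP is
emitted with has_attrs set to false, so no status code ever has to be
chosen for it.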


-- 
Chuck Lever
