lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d0ada96d-1805-44d2-b02c-eff64ec0c7d6@proxmox.com>
Date: Thu, 8 Aug 2024 10:07:24 +0200
From: Christian Ebner <c.ebner@...xmox.com>
To: David Howells <dhowells@...hat.com>, Jeff Layton <jlayton@...nel.org>,
 Steve French <smfrench@...il.com>
Cc: Matthew Wilcox <willy@...radead.org>,
 Marc Dionne <marc.dionne@...istor.com>, Paulo Alcantara <pc@...guebit.com>,
 Shyam Prasad N <sprasad@...rosoft.com>, Tom Talpey <tom@...pey.com>,
 Dominique Martinet <asmadeus@...ewreck.org>,
 Eric Van Hensbergen <ericvh@...nel.org>, Ilya Dryomov <idryomov@...il.com>,
 Christian Brauner <christian@...uner.io>, linux-cachefs@...hat.com,
 linux-afs@...ts.infradead.org, linux-cifs@...r.kernel.org,
 linux-nfs@...r.kernel.org, ceph-devel@...r.kernel.org, v9fs@...ts.linux.dev,
 linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 14/40] netfs: Add iov_iters to (sub)requests to
 describe various buffers

Hi,

recently some of our (Proxmox VE) users report issues with file 
corruptions when accessing contents located on CephFS via the in-kernel 
ceph client [0,1]. According to these reports, our kernels based on 
(Ubuntu) v6.8 do show these corruptions, using the FUSE client or the 
in-kernel ceph client with kernels based on v6.5 does not show these.
Unfortunately the corruption is hard to reproduce.

After a further report of file corruption [2] with a completely 
unrelated code path, we managed to reproduce the corruption for one file 
by sheer chance on one of our ceph test clusters. We were able to narrow 
it down to be possibly an issue with reading of the contents via the 
in-kernel ceph client. Note that we can exclude the file contents itself 
being corrupt, as any not affected kernel version or the FUSE client 
gives the correct contents.

The issue is present also in the current mainline kernel 6.11-rc2.

Bisection with the reproducer points to this commit:

"92b6cc5d: netfs: Add iov_iters to (sub)requests to describe various 
buffers"

Description of the issue:

* file copied from local filesystem to cephfs via:
`cp /tmp/proxmox-backup-server_3.2-1.iso 
/mnt/pve/cephfs/proxmox-backup-server_3.2-1.iso`
* sha256sum on local filesystem:
`1d19698e8f7e769cf0a0dcc7ba0018ef5416c5ec495d5e61313f9c84a4237607 
/tmp/proxmox-backup-server_3.2-1.iso`
* sha256sum on cephfs with kernel up to above commit:
`1d19698e8f7e769cf0a0dcc7ba0018ef5416c5ec495d5e61313f9c84a4237607 
/mnt/pve/cephfs/proxmox-backup-server_3.2-1.iso`
* sha256sum on cephfs with kernel after above commit:
`89ad3620bf7b1e0913b534516cfbe48580efbaec944b79951e2c14e5e551f736 
/mnt/pve/cephfs/proxmox-backup-server_3.2-1.iso`
* removing and/or recopying the file does not change the issue, the 
corrupt checksum remains the same.
* Only this one particular file has been observed to show the issue, for 
others the checksums match.
* Accessing the same file from different clients results in the same 
output: The one with above patch applied do show the incorrect checksum, 
ones without the patch show the correct checksum.
* The issue persists even across reboot of the ceph cluster and/or clients.
* The file is indeed corrupt after reading, as verified by a `cmp -b`.

Does anyone have an idea what could be the cause of this issue, or how 
to further debug this? Happy to provide more information or a dynamic 
debug output if needed.

Best regards,

Chris

[0] https://forum.proxmox.com/threads/78340/post-676129
[1] https://forum.proxmox.com/threads/149249/
[2] https://forum.proxmox.com/threads/151291/


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ