Message-ID: <20250603002904.GE179983@mit.edu>
Date: Tue, 3 Jun 2025 00:29:04 +0000
From: "Theodore Ts'o" <tytso@....edu>
To: Mitta Sai Chaithanya <mittas@...rosoft.com>
Cc: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
    Nilesh Awate <Nilesh.Awate@...rosoft.com>,
    Ganesan Kalyanasundaram <ganesanka@...rosoft.com>,
    Pawan Sharma <sharmapawan@...rosoft.com>
Subject: Re: [EXTERNAL] Re: EXT4/JBD2 Not Fully Released device after unmount of NVMe-oF Block Device

On Mon, Jun 02, 2025 at 09:32:18PM +0000, Mitta Sai Chaithanya wrote:
> However, after the connection is re-established and the device is
> unmounted from all namespaces, I still observe errors from both ext4
> and jb2 when the device is especially disconnected.

How do you *know* that you've unmounted the device in all namespaces?
I seem to recall that some process (I think one of the systemd
daemons, but I could be wrong) was creating a namespace that users
were not expecting, resulting in the device staying mounted when the
users were not expecting it.

The fact that /proc/fs/ext4/<device_name> still exists means that the
kernel (specifically, the VFS layer) doesn't think that the file
system can be shut down.  As a result, the VFS layer has not called
ext4's put_super() and kill_sb() methods.  And so yes, I/O activity
can still happen, because the file system has not been shut down.

If you still see /proc/fs/ext4/<device_name>, my suggestion would be
to grep /proc/*/mounts to see which processes have a namespace which
still has the device mounted.  I suspect that you will see that there
is some namespace that you weren't aware of that is keeping the ext4
struct super object pinned and alive.

> Another point I would like to mention, I am observing JBD2 errors especially after NVMe-oF device has been disconnected and below are the logs.

Sure, but that's the effect, not the cause, of the NVMe-oF device
getting ripped down while the file system is still active.  Which I am
99.997% sure is because it is still mounted in some namespace.  The
other 0.003% chance is that there is some refcount problem in the VFS
subsystem, and I would suggest that you ask Microsoft's VFS experts
(such as Christian Brauner, who is one of the VFS maintainers) to take
a look.  I very much doubt it is a kernel bug, though.

					- Ted
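
The /proc/*/mounts search suggested above could be scripted roughly as follows. This is only a minimal sketch, not something from the thread: the function name `find_mount_holders` and the example device path are hypothetical, and it assumes the device appears in the first field of each mounts line under the same path you pass in (a device-mapper or by-uuid alias would need to be matched separately). It also reads the /proc/<pid>/ns/mnt symlink so that processes sharing a mount namespace can be grouped.

```python
import glob
import os
import sys


def find_mount_holders(device, proc_root="/proc"):
    """Return (pid, mount_namespace, mount_point) tuples for every process
    whose mounts file still lists `device` as a mounted source."""
    holders = []
    for mounts_path in glob.glob(os.path.join(proc_root, "[0-9]*", "mounts")):
        pid_dir = os.path.dirname(mounts_path)
        pid = os.path.basename(pid_dir)
        try:
            with open(mounts_path, "r", errors="replace") as f:
                lines = f.read().splitlines()
        except OSError:
            continue  # process exited, or we lack permission to read it
        try:
            # Typically looks like "mnt:[4026531840]"; needs sufficient privileges.
            ns = os.readlink(os.path.join(pid_dir, "ns", "mnt"))
        except OSError:
            ns = "unknown"
        for line in lines:
            fields = line.split()
            # Fields: source, mount point, fs type, options, dump, pass
            if len(fields) >= 2 and fields[0] == device:
                holders.append((pid, ns, fields[1]))
    return holders


if __name__ == "__main__":
    # Hypothetical default device path; pass the real one as the first argument.
    dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/nvme1n1"
    for pid, ns, mount_point in find_mount_holders(dev):
        print(f"pid {pid} ({ns}) still has {dev} mounted at {mount_point}")
```

Run as root so that every process's mounts file and namespace link are readable. Any namespace ID in the output that differs from the one for PID 1 is a candidate for the unexpected namespace keeping the superblock pinned.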