Message-ID: <2025072817-CVE-2025-38492-d59e@gregkh>
Date: Mon, 28 Jul 2025 13:22:34 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...nel.org>
Subject: CVE-2025-38492: netfs: Fix race between cache write completion and ALL_QUEUED being set

From: Greg Kroah-Hartman <gregkh@...nel.org>

Description
===========

In the Linux kernel, the following vulnerability has been resolved:

netfs: Fix race between cache write completion and ALL_QUEUED being set

When netfslib is issuing subrequests, the subrequests start processing
immediately and may complete before we reach the end of the issuing
function.  At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED
to indicate to the collector that we aren't going to issue any more subreqs
and that it can do the final notifications and cleanup.

This isn't a problem if the request is synchronous
(NETFS_RREQ_OFFLOAD_COLLECTION is unset), as the result collection will be
done in-thread and we're guaranteed an opportunity to run the collector.

However, if the request is asynchronous, collection is primarily triggered
by subrequests queuing the collector on a workqueue as they terminate.  A
race can occur here if the app thread sets ALL_QUEUED after the last
subrequest terminates.

This can happen most easily with the copy2cache code (as used by Ceph)
where, in the collection routine of a read request, an asynchronous write
request is spawned to copy data to the cache.  Folios are added to the
write request as they're unlocked, but there may be a delay before
ALL_QUEUED is set, and the write subrequests may complete before we get
there.

If all the write subreqs have finished by the ALL_QUEUED point, no further
events occur and collection never runs, leaving the request hanging.
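
The ordering problem described above can be illustrated with a small
single-threaded model.  This is only a sketch in Python, not kernel code:
the WriteRequest class and its method names are hypothetical, and the model
assumes that subrequest termination is the only event that runs the
collector, as in the asynchronous case before the fix.

```python
class WriteRequest:
    """Toy stand-in for an asynchronous netfs request (hypothetical names)."""

    def __init__(self):
        self.all_queued = False   # models NETFS_RREQ_ALL_QUEUED
        self.outstanding = 0      # subrequests issued but not yet terminated
        self.done = False         # final notifications and cleanup performed

    def issue_subreq(self):
        self.outstanding += 1

    def terminate_subreq(self):
        # In this model, termination is the only event that runs the collector.
        self.outstanding -= 1
        self.collect()

    def set_all_queued(self):
        # Pre-fix behaviour: set the flag without kicking the collector.
        self.all_queued = True

    def collect(self):
        # The collector may only finish once no more subreqs will be issued
        # and none are still outstanding.
        if self.all_queued and self.outstanding == 0:
            self.done = True


def run(order):
    """Issue one subrequest, then apply the remaining steps in the given order."""
    req = WriteRequest()
    req.issue_subreq()
    for step in order:
        getattr(req, step)()
    return req.done


benign = run(["set_all_queued", "terminate_subreq"])  # flag set first
racy = run(["terminate_subreq", "set_all_queued"])    # subreq finishes first
print("benign ordering completes:", benign)
print("racy ordering completes:", racy)
```

In the racy ordering the collector runs once, sees that ALL_QUEUED is not
yet set, and is never run again, so the modelled request stays incomplete.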

Fix this by queuing the collector after setting ALL_QUEUED.  This is a bit
heavy-handed and it may be sufficient to do it only if there are no extant
subreqs.
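
A minimal sketch of the idea behind the fix, again as a single-threaded
Python model rather than the actual kernel change: the class, the
kick_after_all_queued parameter and the method names are assumptions made
for illustration.  The only point shown is that running the collector once
more after ALL_QUEUED is set closes the window in which the last subrequest
has already terminated.

```python
class WriteRequest:
    """Toy model of an asynchronous request (hypothetical names)."""

    def __init__(self, kick_after_all_queued):
        self.kick_after_all_queued = kick_after_all_queued
        self.all_queued = False   # models NETFS_RREQ_ALL_QUEUED
        self.outstanding = 0      # subrequests still in flight
        self.done = False         # final notifications and cleanup performed

    def issue_subreq(self):
        self.outstanding += 1

    def terminate_subreq(self):
        self.outstanding -= 1
        self.collect()            # termination triggers the collector

    def set_all_queued(self):
        self.all_queued = True
        if self.kick_after_all_queued:
            # Modelled fix: trigger the collector after setting the flag,
            # regardless of whether any subrequests are still outstanding.
            self.collect()

    def collect(self):
        if self.all_queued and self.outstanding == 0:
            self.done = True


def racy_ordering(kick_after_all_queued):
    """The last subrequest terminates before ALL_QUEUED is set."""
    req = WriteRequest(kick_after_all_queued)
    req.issue_subreq()
    req.terminate_subreq()
    req.set_all_queued()
    return req.done


without_fix = racy_ordering(kick_after_all_queued=False)
with_fix = racy_ordering(kick_after_all_queued=True)
print("completes without the extra kick:", without_fix)
print("completes with the extra kick:", with_fix)
```

The extra collector run is harmless in this model when subrequests are
still outstanding, since the collector then simply finds nothing to
finalise; that is the sense in which an unconditional kick is safe but
possibly more than is strictly needed.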

Also add a tracepoint to cross-reference both requests in a copy-to-request
operation and add a trace to the netfs_rreq tracepoint to indicate the
setting of ALL_QUEUED.

The Linux kernel CVE team has assigned CVE-2025-38492 to this issue.

Affected and fixed versions
===========================

	Issue introduced in 6.14 with commit e2d46f2ec332533816417b60933954173f602121 and fixed in 6.15.8 with commit 110188a13c4853bd4c342e600ced4dfd26c3feb5
	Issue introduced in 6.14 with commit e2d46f2ec332533816417b60933954173f602121 and fixed in 6.16 with commit 89635eae076cd8eaa5cb752f66538c9dc6c9fdc3
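
As a rough aid for reading the two lines above, the following Python sketch
checks whether a plain upstream version string falls between the listed
introducing release (6.14) and the listed fixed releases (6.15.8 and 6.16).
The function names are hypothetical.  The sketch assumes purely numeric
versions such as "6.15.7", does not handle -rc or distribution suffixes,
and cannot know about backports made after this announcement, so the
official CVE entry remains the authority.

```python
INTRODUCED = (6, 14, 0)     # "Issue introduced in 6.14"
FIXED_STABLE = (6, 15, 8)   # "fixed in 6.15.8"; 6.16 and later sort above this


def parse_version(text):
    """Turn "6.15.7" into (6, 15, 7), padding short versions with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def in_listed_affected_range(text):
    """True if the version lies in [6.14, 6.15.8) per the lines above."""
    return INTRODUCED <= parse_version(text) < FIXED_STABLE


for version in ["6.13.5", "6.14", "6.15.7", "6.15.8", "6.16"]:
    print(version, in_listed_affected_range(version))
```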

Please see https://www.kernel.org for a full list of kernel versions
currently supported by the kernel community.

Unaffected versions might change over time as fixes are backported to
older supported kernel versions.  The official CVE entry at
	https://cve.org/CVERecord/?id=CVE-2025-38492
will be updated if fixes are backported.  Please check it for the most
up-to-date information about this issue.

Affected files
==============

The file(s) affected by this issue are:
	fs/netfs/read_pgpriv2.c
	include/trace/events/netfs.h

Mitigation
==========

The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this and many other bugfixes.  Individual
changes are never tested alone, but rather are part of a larger kernel
release.  Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all.  If, however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
	https://git.kernel.org/stable/c/110188a13c4853bd4c342e600ced4dfd26c3feb5
	https://git.kernel.org/stable/c/89635eae076cd8eaa5cb752f66538c9dc6c9fdc3