Message-ID: <20260126023055.405401-1-CFSworks@gmail.com>
Date: Sun, 25 Jan 2026 18:30:51 -0800
From: Sam Edwards <cfsworks@...il.com>
To: Xiubo Li <xiubli@...hat.com>,
Ilya Dryomov <idryomov@...il.com>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@....com>,
Christian Brauner <brauner@...nel.org>,
Milind Changire <mchangir@...hat.com>,
Jeff Layton <jlayton@...nel.org>,
ceph-devel@...r.kernel.org,
linux-kernel@...r.kernel.org,
Sam Edwards <CFSworks@...il.com>
Subject: [PATCH v3 0/4] ceph: CephFS writeback correctness and performance fixes
Hello list,
This is v3 of my series that addresses interrelated issues in CephFS writeback,
fixing crashes, improving robustness, and correcting performance behavior,
particularly for fscrypted files. [1]
Changes v2->v3:
- Split out two patches ("ceph: free page array when ceph_submit_write() fails"
and "ceph: split out page-array discarding to a function") to a new series
[2] since they are independent and had no outstanding review comments.
- Lowercase the subject lines of commit messages, per subsystem-local style.
- Update the commit message of ("ceph: fix write storm on fscrypted files") to
mention the explicit dependency on ("ceph: do not propagate page array
emplacement errors as batch errors") for correctness, to prevent the former
from being accidentally backported without the latter.
- Reorder the series to make the aforementioned patches consecutive. The series
order is now: bugfix, bugfix, cleanup, cleanup.
- Add a clarification to ("ceph: remove error return from
ceph_process_folio_batch()") that "abort" logic is still possible, just that
it is responsible for cleaning up after itself.
Changes v1->v2:
- Clarify patch #1's commit message to establish that failures on the first
folio are not possible.
- Add another patch to move the "clean up page array on abort" logic to a new
ceph_discard_page_array() function. (Thanks Slava!)
- Change the wording "grossly degraded performance" to instead read
"correspondingly degraded performance." This makes the causal relationship
clearer (that write throughput is limited much more significantly by write
op/s due to the bug) without making any claims (qualitative or otherwise)
about significance. (Thanks Slava!)
- Reset locked_pages = 0 immediately when the page array is discarded,
simplifying patch #5 ("ceph: Assert writeback loop invariants").
- Reword "as evidenced by the previous two patches which fix oopses" to
"as evidenced by two recent patches which fix oopses" and refer to the
patches by subject (being in the same series, I cannot refer to them by hash).
Warm regards,
Sam
[1] https://lore.kernel.org/all/20260107210139.40554-1-CFSworks@gmail.com/
[2] https://lore.kernel.org/all/20260126022715.404984-1-CFSworks@gmail.com/
Sam Edwards (4):
ceph: do not propagate page array emplacement errors as batch errors
ceph: fix write storm on fscrypted files
ceph: remove error return from ceph_process_folio_batch()
ceph: assert writeback loop invariants
fs/ceph/addr.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
--
2.52.0