Message-ID: <yxyuijjfd6yknryji2q64j3keq2ygw6ca6fs5jwyolklzvo45s@4u63qqqyosy2>
Date: Sun, 26 Jan 2025 18:01:55 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Theodore Ts'o <tytso@....edu>
Cc: Ext4 Developers List <linux-ext4@...r.kernel.org>,
Linux Kernel Developers List <linux-kernel@...r.kernel.org>, dave.hansen@...el.com, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] ext4: use private version of page_zero_new_buffers() for
data=journal mode
On Fri, Oct 09, 2015 at 12:01:09AM -0400, Theodore Ts'o wrote:
> If there is a error while copying data from userspace into the page
> cache during a write(2) system call, in data=journal mode, in
> ext4_journalled_write_end() were using page_zero_new_buffers() from
> fs/buffer.c. Unfortunately, this sets the buffer dirty flag, which is
> no good if journalling is enabled. This is a long-standing bug that
> goes back for years and years in ext3, but a combination of (a)
> data=journal not being very common, (b) in many case it only results
> in a warning message. and (c) only very rarely causes the kernel hang,
> means that we only really noticed this as a problem when commit
> 998ef75ddb caused this failure to happen frequently enough to cause
> generic/208 to fail when run in data=journal mode.
>
> The fix is to have our own version of this function that doesn't call
> mark_dirty_buffer(), since we will end up calling
> ext4_handle_dirty_metadata() on the buffer head(s) in questions very
> shortly afterwards in ext4_journalled_write_end().
>
> Thanks to Dave Hansen and Linus Torvalds for helping to identify the
> root cause of the problem.
>
Hello there, a blast from the past.
I see this has landed in b90197b655185a11640cce3a0a0bc5d8291b8ad2.
I came here while looking at pwrite under will-it-scale and noticing that
pre-faulting eats CPU (over 5% on my Sapphire Rapids) due to SMAP trips.
Pre-faulting used to be avoided specifically for that reason, but that
change got temporarily reverted due to bugs in ext4. To quote Linus
(see 00a3d660cbac05af34cca149cb80fb611e916935):
> The commit itself does not appear to be buggy per se, but it is exposing
> a bug in ext4 (and Ted thinks ext3 too, but we solved that by getting
> rid of it). It's too late in the release cycle to really worry about
> this, even if Dave Hansen has a patch that may actually fix the
> underlying ext4 problem. We can (and should) revisit this for the next
> release.
Given that your patch has landed, I take it this is expected to be fixed now?
Sounds like nobody bothered to revert the revert. Not the end of the
world, but it is a few % left on the table for (hopefully) no reason. Of
course testing will be needed, but that's what -next is for.
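For anyone who wants to poke at this without pulling in will-it-scale, the
workload boils down to a tight pwrite loop on a single file. Below is a
minimal sketch, not the benchmark's actual code: the function name
pwrite_loop, the 4096-byte buffer and the /tmp path are assumptions made
for illustration. As far as I can tell, each call goes through the buffered
write path, which is where the source buffer gets pre-faulted before the copy.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp() and pwrite() */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFLEN 4096         /* assumed write size, one page on x86-64 */

/*
 * Hypothetical helper: overwrite the first BUFLEN bytes of a temporary
 * file `iterations` times. Returns the number of writes that completed
 * in full, or -1 if the temporary file could not be created.
 */
long pwrite_loop(long iterations)
{
    char path[] = "/tmp/pwrite-sketch-XXXXXX";  /* assumed location */
    char buf[BUFLEN];
    long done = 0;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);           /* file goes away once fd is closed */
    memset(buf, 0, sizeof(buf));

    for (long i = 0; i < iterations; i++) {
        /* Always write at offset 0, so the same page is hit every time. */
        if (pwrite(fd, buf, BUFLEN, 0) != BUFLEN)
            break;
        done++;
    }

    close(fd);
    return done;
}
```

Calling pwrite_loop() with a large iteration count under perf should show
where the time goes. The numbers will depend on which filesystem backs /tmp,
so point the path at an ext4 mount if that is what you want to measure.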
thanks,