linux-kernel - Re: [PATCH] xfs: report a writeback error on a read() call

Message-ID: <aFzFR6zD7X1_9bWj@dread.disaster.area>
Date: Thu, 26 Jun 2025 13:57:59 +1000
From: Dave Chinner <david@...morbit.com>
To: Yafang Shao <laoar.shao@...il.com>
Cc: Jeff Layton <jlayton@...nel.org>, Christoph Hellwig <hch@...radead.org>,
	djwong@...nel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-xfs@...r.kernel.org,
	yc1082463@...il.com
Subject: Re: [PATCH] xfs: report a writeback error on a read() call

On Thu, Jun 26, 2025 at 10:41:47AM +0800, Yafang Shao wrote:
> On Wed, Jun 25, 2025 at 10:06 PM Jeff Layton <jlayton@...nel.org> wrote:
> >
> > On Wed, 2025-06-25 at 04:56 -0700, Christoph Hellwig wrote:
> > > On Wed, Jun 25, 2025 at 07:49:31AM -0400, Jeff Layton wrote:
> > > > Another idea: add a new generic ioctl() that checks for writeback
> > > > errors without syncing anything. That would be fairly simple to do and
> > > > sounds like it would be useful, but I'd want to hear a better
> > > > description of the use-case before we did anything like that.
> 
> As you mentioned earlier, calling fsync()/fdatasync() after every
> write() blocks the thread, degrading performance—especially on HDDs.
> However, this isn’t the main issue in practice.
> The real problem is that users typically don’t understand "writeback
> errors". If you warn them, "You should call fsync() because writeback
> errors might occur," their response will likely be: "What the hell is
> a writeback error?"
> 
> For example, our users (a big data platform) demanded that we
> immediately shut down the filesystem upon writeback errors. These
> users are algorithm analysts who write Python/Java UDFs for custom
> logic—often involving temporary disk writes followed by reads to pass
> data downstream. Yet, most have no idea how these underlying processes
> work.

And that's exactly why XFS originally never threw away dirty data on
writeback errors: scientists and data analysts who wrote programs to
chew through large amounts of data didn't care about persistence of
their data mid-processing. They just wanted what they wrote to be
there the next time the processing pipeline read it.

> > > That's what I mean with my above proposal, except that I thought of an
> > > fcntl or syscall and not an ioctl.
> >
> > Yeah, a fcntl() would be reasonable, I think.
> >
> > For a syscall, I guess we could add an fsync2() which just adds a flags
> > field. Then add a FSYNC_JUSTCHECK flag that makes it just check for
> > errors and return.
> >
> > Personally, I like the fcntl() idea better for this, but maybe we have
> > other uses for a fsync2().
> 
> What do you expect users to do with this new fcntl() or fsync2()? Call
> fsync2() after every write()? That would still require massive
> application refactoring.

<sigh>

We already have a user interface that provides exactly the desired
functionality.

$ man sync_file_range
....
   Some details
       SYNC_FILE_RANGE_WAIT_BEFORE  and  SYNC_FILE_RANGE_WAIT_AFTER
       will  detect  any I/O errors or ENOSPC conditions and will
       return these to the caller.
....

IOWs, checking for a past writeback IO error is as simple as:

	if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE) < 0) {
		/* An unreported writeback error was pending on the file */
		wb_err = -errno;
		......
	}

This does not cause new IO to be issued, it only blocks on writeback
that is currently in progress, and it has no data integrity
requirements at all. If the writeback has already been done, all it
will do is sweep residual errors out to userspace.....
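
For anyone who wants to drop this into a runtime or library, here is a
minimal sketch of the check wrapped in a helper. It assumes Linux with
glibc (sync_file_range() needs _GNU_SOURCE), and the helper name
check_writeback_error() is made up for illustration; it is not an
existing API.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>

/*
 * check_writeback_error() is a hypothetical helper name.
 *
 * Returns 0 if the call reported no error, or a negative errno value
 * if sync_file_range() failed (a pending writeback error such as -EIO
 * or -ENOSPC, or a bad argument such as -EBADF).
 */
int check_writeback_error(int fd)
{
	/* offset 0 with nbytes 0 covers the file from start to end */
	if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE) < 0)
		return -errno;
	return 0;
}
```

A caller could run this before reading back temporary data it wrote
earlier, and treat a negative return as "the data may not have reached
the disk". Where exactly to call it is an application decision, not
something the interface dictates.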

-Dave.
-- 
Dave Chinner
david@...morbit.com