Message-ID: <20251029161447.GG3356773@frogsfrogsfrogs>
Date: Wed, 29 Oct 2025 09:14:47 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: Bart Van Assche <bart.vanassche@...il.com>
Cc: Christoph Hellwig <hch@....de>, Carlos Maiolino <cem@...nel.org>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
linux-kernel@...r.kernel.org, linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-raid@...r.kernel.org,
linux-block@...r.kernel.org
Subject: Re: fall back from direct to buffered I/O when stable writes are required

On Wed, Oct 29, 2025 at 08:58:52AM -0700, Bart Van Assche wrote:
> On 10/29/25 12:15 AM, Christoph Hellwig wrote:
> > we've had a long-standing issue that direct I/O to and from devices that
> > require stable writes can corrupt data because the user memory can be
> > modified while in flight. This series tries to address this by falling
> > back to uncached buffered I/O. Given that this requires an extra copy it
> > is usually going to be a slowdown, especially for very high bandwidth
> > use cases, so I'm not exactly happy about it.
> >
> > I suspect we need a way to opt out of this for applications that know
> > what they are doing, and I can think of a few ways to do that:
> >
> > 1a) Allow a mount option to override the behavior
> >
> > This allows the sysadmin to get back to the previous state.
> > This is fairly easy to implement, but the scope might be too wide.

/me dislikes mount options because getting rid of them is hard.

> > 1b) Sysfs attribute
> >
> > Same as above. Slightly easier to modify, but a more unusual
> > interface.
> >
> > 2) Have a per-inode attribute
> >
> > Allows setting it on a specific file. Would require an on-disk
> > format change for the usual attr options.
> >
> > 3) Have a fcntl or similar to allow an application to override it
> >
> > Fine granularity. Requires application change. We might not
> > allow any application to force this as it could be used to inject
> > corruption.
> >
> > In other words, they are all kinda horrible.

Yeah, I don't like the choices either. Bart's prctl sounds the least
annoying but even then I still don't like "I KNOW WHAT I'M DOING!!"
flags.

> Hi Christoph,
>
> Has the opposite been considered: only fall back to buffered I/O for buggy
> software that modifies direct I/O buffers before I/O has
> completed?

How would xfs detect that? For all we know the dio buffer is actually a
piece of device memory or something, and some hardware changed the
memory without the kernel knowing that. Later on the raid scrub fails a
parity check and it's far too late to do anything about it.

--D

> Regarding selecting the direct I/O behavior for a process, how about
> introducing a new prctl() flag and introducing a new command-line
> utility that follows the style of ionice and sets the new flag before
> any code runs in the started process?
>
> Thanks,
>
> Bart.
>
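
For readers who want to see the failure mode concretely, here is a toy
userspace sketch of the race discussed above. It is not kernel code and
not how md or xfs implement anything: the names write_stripe, scrub_ok
and scribble are made up for illustration, and XOR parity over two
blocks stands in for whatever checksum or parity the stable-writes
device computes. It only shows why a buffer that changes between the
parity calculation and the data transfer leaves the stripe inconsistent,
and why taking one private copy first (the extra copy the buffered
fallback pays for) avoids that.

```python
def xor_parity(blocks):
    """XOR equal-length blocks together, as RAID-5 style parity does."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def write_stripe(user_buf, other_block, modify_in_flight, stable_copy):
    """Simulate one stripe write; return (data, other_block, parity) as stored.

    user_buf is a mutable bytearray standing in for the direct I/O buffer.
    modify_in_flight is a hypothetical callback that mutates user_buf
    between the parity computation and the data transfer.
    stable_copy=True models the buffered fallback: the data is snapshotted
    once and only the snapshot is used afterwards.
    """
    source = bytes(user_buf) if stable_copy else user_buf
    parity = xor_parity([source, other_block])  # step 1: parity is computed
    modify_in_flight(user_buf)                  # step 2: buffer changes in flight
    data = bytes(source)                        # step 3: data reaches the disk
    return data, other_block, parity


def scrub_ok(data, other_block, parity):
    """Recompute parity from what is on disk and compare, like a scrub would."""
    return xor_parity([data, other_block]) == parity


def scribble(buf):
    """Flip every bit of the first byte, standing in for a racing writer."""
    buf[0] ^= 0xFF


other = bytes([0x0F] * 8)

direct = write_stripe(bytearray(b"ABCDEFGH"), other, scribble, stable_copy=False)
copied = write_stripe(bytearray(b"ABCDEFGH"), other, scribble, stable_copy=True)

print("direct I/O, buffer modified in flight, scrub passes:", scrub_ok(*direct))
print("private copy taken first, scrub passes:", scrub_ok(*copied))
```

In the direct case the parity describes the old contents while the disk
holds the new ones, so the later scrub fails even though no layer saw an
error at write time. In the copied case the late modification is simply
not part of this write.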