[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ua7ib34kk5s6yfthqkgy3m2pnbk33a34g7prezmwl7hfwv6lwq@fljhjaogd6gq>
Date: Wed, 27 Aug 2025 17:20:53 +0200
From: Jan Kara <jack@...e.cz>
To: Ritesh Harjani <ritesh.list@...il.com>
Cc: Keith Busch <kbusch@...nel.org>, Jan Kara <jack@...e.cz>,
Keith Busch <kbusch@...a.com>, linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org, snitzer@...nel.org, axboe@...nel.dk,
dw@...idwei.uk, brauner@...nel.org, hch@....de, martin.petersen@...cle.com,
djwong@...nel.org, linux-xfs@...r.kernel.org, viro@...iv.linux.org.uk,
Jan Kara <jack@...e.com>, Brian Foster <bfoster@...hat.com>
Subject: Re: [PATCHv3 0/8] direct-io: even more flexible io vectors
On Tue 26-08-25 10:29:58, Ritesh Harjani wrote:
> Keith Busch <kbusch@...nel.org> writes:
>
> > On Mon, Aug 25, 2025 at 02:07:15PM +0200, Jan Kara wrote:
> >> On Fri 22-08-25 18:57:08, Ritesh Harjani wrote:
> >> > Keith Busch <kbusch@...a.com> writes:
> >> > >
> >> > > - EXT4 falls back to buffered io for writes but not for reads.
> >> >
> >> > ++linux-ext4 to get any historical context behind why the difference of
> >> > behaviour in reads v/s writes for EXT4 DIO.
> >>
> >> Hum, how did you test? Because in the basic testing I did (with vanilla
> >> kernel) I get EINVAL when doing unaligned DIO write in ext4... We should be
> >> falling back to buffered IO only if the underlying file itself does not
> >> support any kind of direct IO.
> >
> > Simple test case (dio-offset-test.c) below.
> >
> > I also ran this on vanilla kernel and got these results:
> >
> > # mkfs.ext4 /dev/vda
> > # mount /dev/vda /mnt/ext4/
> > # make dio-offset-test
> > # ./dio-offset-test /mnt/ext4/foobar
> > write: Success
> > read: Invalid argument
> >
> > I tracked the "write: Success" down to ext4's handling for the "special"
> > -ENOTBLK error after ext4_want_directio_fallback() returns "true".
> >
>
> Right. Ext4 has fallback only for dio writes but not for DIO reads...
>
> buffered
> static inline bool ext4_want_directio_fallback(unsigned flags, ssize_t written)
> {
> /* must be a directio to fall back to buffered */
> if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
> (IOMAP_WRITE | IOMAP_DIRECT))
> return false;
>
> ...
> }
>
> So basically the path is ext4_file_[read|write]_iter() -> iomap_dio_rw
> -> iomap_dio_bio_iter() -> return -EINVAL. i.e. from...
>
>
> if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
> !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
> return -EINVAL;
>
> EXT4 then fallsback to buffered-io only for writes, but not for reads.
Right. And the fallback for writes was actually inadvertedly "added" by
commit bc264fea0f6f "iomap: support incremental iomap_iter advances". That
changed the error handling logic. Previously if iomap_dio_bio_iter()
returned EINVAL, it got propagated to userspace regardless of what
->iomap_end() returned. After this commit if ->iomap_end() returns error
(which is ENOTBLK in ext4 case), it gets propagated to userspace instead of
the error returned by iomap_dio_bio_iter().
Now both the old and new behavior make some sense so I won't argue that the
new iomap_iter() behavior is wrong. But I think we should change ext4 back
to the old behavior of failing unaligned dio writes instead of them falling
back to buffered IO. I think something like the attached patch should do
the trick - it makes unaligned dio writes fail again while writes to holes
of indirect-block mapped files still correctly fall back to buffered IO.
Once fstests run completes, I'll do a proper submission...
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
View attachment "0001-ext4-Fail-unaligned-direct-IO-write-with-EINVAL.patch" of type "text/x-patch" (3256 bytes)
Powered by blists - more mailing lists