[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240403175838.GB2534900@fedora>
Date: Wed, 3 Apr 2024 13:58:38 -0400
From: Stefan Hajnoczi <stefanha@...hat.com>
To: Eric Blake <eblake@...hat.com>
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
Alasdair Kergon <agk@...hat.com>,
Mikulas Patocka <mpatocka@...hat.com>, dm-devel@...ts.linux.dev,
David Teigland <teigland@...hat.com>,
Mike Snitzer <snitzer@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christoph Hellwig <hch@....de>, Joe Thornber <ejt@...hat.com>
Subject: Re: [RFC 4/9] dm: add llseek(SEEK_HOLE/SEEK_DATA) support
On Wed, Apr 03, 2024 at 12:02:19PM -0500, Eric Blake wrote:
> On Wed, Apr 03, 2024 at 10:11:47AM -0400, Stefan Hajnoczi wrote:
> ...
> > > > +static loff_t dm_blk_do_seek_hole_data(struct dm_table *table, loff_t offset,
> > > > + int whence)
> > > > +{
> > > > + struct dm_target *ti;
> > > > + loff_t end;
> > > > +
> > > > + /* Loop when the end of a target is reached */
> > > > + do {
> > > > + ti = dm_table_find_target(table, offset >> SECTOR_SHIFT);
> > > > + if (!ti)
> > > > + return whence == SEEK_DATA ? -ENXIO : offset;
> > >
> > > ...but this blindly returns offset for SEEK_HOLE, even when offset is
> > > beyond the end of the dm. I think you want 'return -ENXIO;'
> > > unconditionally here.
> >
> > If the initial offset is beyond the end of the table, then SEEK_HOLE
> > should return -ENXIO. I agree that the code doesn't handle this case.
> >
> > However, returning offset here is correct when there is data at the end
> > with SEEK_HOLE.
> >
> > I'll update the code to address the out-of-bounds offset case, perhaps
> > by checking the initial offset before entering the loop.
>
> You are correct that if we are on the second loop iteration of
> SEEK_HOLE (because the first iteration saw all data), then we have
> found the offset of the start of a hole and should return that offset,
> not -ENXIO. This may be a case where we just have to be careful on
> whether the initial pass might have any corner cases different from
> later times through the loop, and that we end the loop with correct
> results for both SEEK_HOLE and SEEK_DATA.
>
> >
> > >
> > > > +
> > > > + end = (ti->begin + ti->len) << SECTOR_SHIFT;
> > > > +
> > > > + if (ti->type->seek_hole_data)
> > > > + offset = ti->type->seek_hole_data(ti, offset, whence);
> > >
> > > Are we guaranteed that ti->type->seek_hole_data will not return a
> > > value exceeding end? Or can dm be used to truncate the view of an
> > > underlying device, and the underlying seek_hold_data can now return an
> > > answer beyond where dm_table_find_target should look for the next part
> > > of the dm's view?
> >
> > ti->type->seek_hole_data() must not return a value larger than
> > (ti->begin + ti->len) << SECTOR_SHIFT.
>
> Worth adding as documentation then.
>
> >
> > >
> > > In which case, should the blkdev_seek_hole_data callback be passed a
> > > max size parameter everywhere, similar to how fixed_size_llseek does
> > > things?
> > >
> > > > + else
> > > > + offset = dm_blk_seek_hole_data_default(offset, whence, end);
> > > > +
> > > > + if (whence == SEEK_DATA && offset == -ENXIO)
> > > > + offset = end;
> > >
> > > You have a bug here. If I have a dm contructed of two underlying targets:
> > >
> > > |A |B |
> > >
> > > and A is all data, then whence == SEEK_HOLE will have offset = -ENXIO
> > > at this point, and you fail to check whether B is also data. That is,
> > > you have silently treated the rest of the block device as data, which
> > > is semantically not wrong (as that is always a safe fallback), but not
> > > optimal.
> > >
> > > I think the correct logic is s/whence == SEEK_DATA &&//.
> >
> > No, with whence == SEEK_HOLE and an initial offset in A, the new offset
> > will be (A->begin + A->end) << SECTOR_SHIFT. The loop will iterate and
> > continue seeking into B.
> >
> > The if statement you commented on ensures that we also continue looping
> > with whence == SEEK_DATA, because that would otherwise prematurely end
> > with the new offset = -ENXIO.
> >
> > >
> > > > + } while (offset == end);
> > >
> > > I'm trying to make sure that we can never return the equivalent of
> > > lseek(dm, 0, SEEK_END). If you make my above suggested changes, we
> > > will iterate through the do loop once more at EOF, and
> > > dm_table_find_target() will then fail to match at which point we do
> > > get the desired -ENXIO for both SEEK_HOLE and SEEK_DATA.
> >
> > Wait, lseek() is supposed to return the equivalent of lseek(dm, 0,
> > SEEK_END) when whence == SEEK_HOLE and there is data at the end.
>
> It was confusing enough for me to write my initial review, I apologize
> if I'm making it harder for you.
No worries, if my code is hard to understand I can learn from your
feedback.
> Yes, we want to ensure that:
>
> off1 = lseek(fd, -1, SEEK_END);
> off2 = off1 + 1; // == lseek(fd, 0, SEEK_END)
>
> if off1 belongs to a data extent:
> - lseek(fd, off1, SEEK_DATA) == off1
> - lseek(fd, off1, SEEK_HOLE) == off2
> - lseek(fd, off2, SEEK_DATA) == -ENXIO
> - lseek(fd, off2, SEEK_HOLE) == -ENXIO
Agreed.
> if off1 belongs to a hole:
> - lseek(fd, off1, SEEK_DATA) == -ENXIO
> - lseek(fd, off1, SEEK_HOLE) == off1
> - lseek(fd, off2, SEEK_DATA) == -ENXIO
> - lseek(fd, off2, SEEK_HOLE) == -ENXIO
Agreed.
>
> Anything in my wall of text from the earlier message inconsistent with
> this table can be ignored; but at the same time, I was not able to
> quickly convince myself that your code properly had those properties,
> even after writing up the table.
>
> Reiterating what I said elsewhere, it may be smarter to document that
> for callbacks, it is wiser to require intermediate behavior that the
> input value 'offset' is always between the half-open range
> [ti->begin<<SECTOR_SHIFT, (ti->begin+ti->len)<<SECTOR_SHIFT), and on
> success, the output must be in the fully-closed range [offset,
> (ti->begin+ti->len)<<SECTOR_SHIFT], errors like -EIO are permitted but
> -ENXIO should not be returned; and let the caller worry about
> synthesizing -ENXIO from that (since the caller knows whether or not
> there is a successor ti where adjacency concerns come into play).
>
> That is, we can never pass in off2 (beyond the bounds of the table),
> and when passing in off1, I think this interface may be easier to work
> with in the intermediate layers, even though it differs from the
> lseek() interface above. For off1 in data:
> - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off1
> - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off2
> and for a hole:
> - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off2
> - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off1
I'll take a look again starting from block/fops.c, through dm.c, and
into dm-linear.c to see how to make things clearest. Although I would
like to have the same semantics for every seek function, maybe in the
end your suggestion will make the code clearer. Let's see.
Stefan
Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)
Powered by blists - more mailing lists