linux-kernel - Re: [RFC 4/9] dm: add llseek(SEEK_HOLE/SEEK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <usjvvltfh6lk4ummqh3qyvd4gp3vzbmclvbcp2zqjhdwrswv7f@mmjwmfoyavxg>
Date: Wed, 3 Apr 2024 14:28:07 -0500
From: Eric Blake <eblake@...hat.com>
To: Stefan Hajnoczi <stefanha@...hat.com>
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Alasdair Kergon <agk@...hat.com>, Mikulas Patocka <mpatocka@...hat.com>, dm-devel@...ts.linux.dev, 
	David Teigland <teigland@...hat.com>, Mike Snitzer <snitzer@...nel.org>, Jens Axboe <axboe@...nel.dk>, 
	Christoph Hellwig <hch@....de>, Joe Thornber <ejt@...hat.com>
Subject: Re: [RFC 4/9] dm: add llseek(SEEK_HOLE/SEEK_DATA) support

On Wed, Apr 03, 2024 at 01:58:38PM -0400, Stefan Hajnoczi wrote:
> > Yes, we want to ensure that:
> > 
> > off1 = lseek(fd, -1, SEEK_END);
> > off2 = off1 + 1; // == lseek(fd, 0, SEEK_END)
> > 
> > if off1 belongs to a data extent:
> >   - lseek(fd, off1, SEEK_DATA) == off1
> >   - lseek(fd, off1, SEEK_HOLE) == off2
> >   - lseek(fd, off2, SEEK_DATA) == -ENXIO
> >   - lseek(fd, off2, SEEK_HOLE) == -ENXIO
> 
> Agreed.
> 
> > if off1 belongs to a hole:
> >   - lseek(fd, off1, SEEK_DATA) == -ENXIO
> >   - lseek(fd, off1, SEEK_HOLE) == off1
> >   - lseek(fd, off2, SEEK_DATA) == -ENXIO
> >   - lseek(fd, off2, SEEK_HOLE) == -ENXIO
> 
> Agreed.
> 
..
> > 
> > That is, we can never pass in off2 (beyond the bounds of the table),
> > and when passing in off1, I think this interface may be easier to work
> > with in the intermediate layers, even though it differs from the
> > lseek() interface above.  For off1 in data:
> >   - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off1
> >   - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off2
> > and for a hole:
> >   - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off2
> >   - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off1
>

In the caller, we already need to know both off1 and off2 (that is,
the offset we are asking about, AND the offset at which the underlying
component ends its map at, since that is where we are then in charge
of whether to concatenate that with the next component or give the
overall result).

If we already guarantee that we never call into a lower layer with
off2 (because it was beyond bounds), then the only difference between
the two semantics is whether SEEK_DATA in a final hole returns -ENXIO
or off2; and it looks like we can do a conversion from either style
into the other.

So designing the caller logic, it looks like I would want:

if off1 >= EOF return -ENXIO (out of bounds)

while off1 < EOF:

  determine off2 of current lower region
  at this point, we are guaranteed off1 < off2
  we also know that off2 < EOF unless we are on last lower region
  call result=lower_layer(off1, SEEK_X)
  it is a bug if result is non-negative but < off1, or if result > off2

  if result == off1, return result (we are already in the right HOLE
  or DATA)

  if result > off1 and < off2, return result (we had to advance to get
  to the right region, but the right region was within the lower
  layer, and we don't need to inspect further layers)

  Note - the above two paragraphs can be collapsed into one: if result
  < off2, return result (the current subregion gave us an adequate
  answer)

  if result == off2, continue to the next region with off1=result (in
  first semantics, this is only possible for SEEK_HOLE for a lower
  region ending in data; in the second semantics, it is possible for
  either SEEK_HOLE or SEEK_DATA for a lower region ending in the type
  opposite of the request)

  if result == -ENXIO, continue to the next region by using off1=off2
  (only possible in the first semantics for SEEK_DATA on a lower
  region ending in a hole)

  any other result is necessarily a negative errno like -EIO; return it

if the loop completes without an early return, we are out of lower
regions, and we should be left with off1 == EOF.  Because we advanced,
we know now to:
return whence == SEEK_HOLE ? off1 : -ENXIO

> I'll take a look again starting from block/fops.c, through dm.c, and
> into dm-linear.c to see how to make things clearest. Although I would
> like to have the same semantics for every seek function, maybe in the
> end your suggestion will make the code clearer. Let's see.

Keeping lseek semantics may require a couple more lines of code
duplicated at each layer, but less maintainer headaches in the long
run of converting between slightly different semantics depending on
which layer you are at.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org