Message-ID: <alpine.DEB.2.02.1108302049480.26762@asgard.lang.hm>
Date: Tue, 30 Aug 2011 20:53:29 -0700 (PDT)
From: david@...g.hm
To: Dave Chinner <david@...morbit.com>
cc: Sunil Mushran <sunil.mushran@...cle.com>,
Andreas Dilger <adilger@...ger.ca>,
Christoph Hellwig <hch@...radead.org>,
Josef Bacik <josef@...hat.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-btrfs@...r.kernel.org,
xfs@....sgi.com, viro@...IV.linux.org.uk, dchinner@...hat.com
Subject: Re: [PATCH] xfstests 255: add a seek_data/seek_hole tester

On Wed, 31 Aug 2011, Dave Chinner wrote:
> On Tue, Aug 30, 2011 at 06:17:02PM -0700, Sunil Mushran wrote:
>> On 08/25/2011 06:35 PM, Dave Chinner wrote:
>>> Agreed, that's the way I'd interpret it, too. So perhaps we need to
>>> ensure that this interpretation is actually tested by this test?
>>>
>>> How about some definitions to work by:
>>>
>>> Data: a range of the file that contains valid data, regardless of
>>> whether it exists in memory or on disk. The valid data can be
>>> preceded and/or followed by an arbitrary number of zero bytes
>>> dependent on the underlying implementation of hole detection.
>>>
>>> Hole: a range of the file that contains no data or is made up
>>> entirely of NULL (zero) data. Holes include preallocated ranges of
>>> files that have not had actual data written to them.
>>>
>>> Does that make sense? It has sufficient flexibility in it for the
>>> existing generic "non-implementation", allows for filesystems to
>>> define their own hole detection boundaries (e.g. filesystem block
>>> size), and effectively defines how preallocated ranges from
>>> fallocate() should be treated (i.e. as holes). If we can agree on
>>> those definitions, I think that we should document them in both the
>>> kernel and the man page that defines SEEK_HOLE/SEEK_DATA so everyone
>>> is on the same page...
>>
>> We should not tie in the definition to existing fs technologies.
>
> Such as? If we don't use well known, well defined terminology, we
> end up with ambiguous, vague functionality and inconsistent
> implementations.
>
>> Instead
>> we should let the fs weigh the cost of providing accurate information
>> with the possible gain in performance.
>>
>> Data:
>> A range in a file that could contain something other than nulls.
>> If in doubt, it is data.
>>
>> Hole:
>> A range in a file that only contains nulls.
>
> And that's -exactly- the ambiguous, vague definition that has raised
> all these questions in the first place. I was in doubt about whether
> unwritten extents can be considered a hole, and by your definition
> that means it should be data. But Andreas seems to be in no doubt it
> should be considered a hole.
>
> Hence if I implement XFS support and Andreas implements ext4 support
> by your definition, we end up with vastly different behaviour even
> though the two filesystems use the same underlying technology for
> preallocated ranges. That's exactly the inconsistency in
> implementation that I'd like us to avoid.
>
> IOWs, the definition needs to be clear enough to prevent these
> inconsistencies from occurring. Indeed, the phrase "preallocated
> ranges that have not had data written to them" is as independent of
> filesystem implementation or technologies as possible. However,
> because Linux supports preallocation (unlike our reference
> platform), and we encourage developers to use it where appropriate,
> it is best that we define how we expect such ranges to behave
> clearly. That makes life easier for everyone.

Since a sparse file has its holes filled by nulls by definition, it seems
fairly clear that they should count as holes. In fact, I would not be
surprised to see some filesystems _only_ report the unwritten pieces of
sparse files as holes (not any other ranges of nulls).

The question I have is: how large does the range of nulls need to be before
it's reported as a hole? Disk sectors, filesystem blocks, something else?

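One way to see what granularity a given filesystem actually uses would be
to probe it from userspace. The following is only a rough sketch, assuming
a kernel and filesystem that accept SEEK_DATA/SEEK_HOLE in lseek() and a
Python build that exposes os.SEEK_DATA and os.SEEK_HOLE; the function name
map_segments is made up for illustration and is not part of the patch
under discussion.

```python
import errno
import os


def map_segments(path):
    """Return (kind, start, end) tuples covering the file from 0 to EOF.

    kind is "data" or "hole". The boundaries are whatever the filesystem
    chooses to report, which is exactly the open question here.
    """
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                # Next offset >= pos that the filesystem considers data.
                data = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as e:
                if e.errno != errno.ENXIO:
                    raise
                # ENXIO: no data at or after pos, the rest is one hole.
                data = size
            if data > pos:
                segments.append(("hole", pos, data))
            if data >= size:
                break
            # Next hole at or after the data; EOF counts as a hole.
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            segments.append(("data", data, hole))
            pos = hole
    finally:
        os.close(fd)
    return segments
```

Writing a single byte at a large offset in an otherwise empty file and
printing map_segments() for it should show where the filesystem rounds the
data boundary. A filesystem with no real hole detection would be expected
to report the whole file as one data segment, which fits the "if in doubt,
it is data" reading.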
David Lang