Message-ID: <20070511110301.GB28425@in.ibm.com>
Date:	Fri, 11 May 2007 16:33:01 +0530
From:	Suparna Bhattacharya <suparna@...ibm.com>
To:	David Chinner <dgc@....com>
Cc:	"Amit K. Arora" <aarora@...ux.vnet.ibm.com>, torvalds@...l.org,
	akpm@...ux-foundation.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
	xfs@....sgi.com, cmm@...ibm.com
Subject: Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

On Fri, May 11, 2007 at 08:39:50AM +1000, David Chinner wrote:
> On Thu, May 10, 2007 at 05:26:20PM +0530, Amit K. Arora wrote:
> > On Thu, May 10, 2007 at 10:59:26AM +1000, David Chinner wrote:
> > > On Wed, May 09, 2007 at 09:31:02PM +0530, Amit K. Arora wrote:
> > > > I have the updated patches ready which take care of Andrew's comments.
> > > > Will run some tests and post them soon.
> > > > 
> > > > But, before submitting these patches, I think it will be better to
> > > > finalize on certain things which might be worth some discussion here:
> > > > 
> > > > 1) Should the file size change when preallocation is done beyond EOF ?
> > > > - Andreas and Chris Wedgwood are in favor of not changing the file size
> > > > in this case. I also tend to agree with them. Does anyone have an
> > > > argument in favor of changing the filesize ?  If not, I will remove the
> > > > code which changes the filesize, before I resubmit the concerned ext4
> > > > patch.
> > > 
> > > I think there needs to be both. If we don't have a mechanism to atomically
> > > change the file size with the preallocation, then applications that use
> > > stat() to work out if they need to preallocate more space will end up
> > > racing.
> > 
> > By "both" above, do you mean we should let the user choose whether the
> > filesize changes or not? It can be done by having *two* modes for
> > preallocation in the system call - say FA_PREALLOCATE and FA_ALLOCATE. If we
> > use FA_PREALLOCATE mode, fallocate() will allocate blocks, but will not
> > change the filesize and [cm]time. If FA_ALLOCATE mode is used, fallocate()
> > will change the filesize if required (i.e.  when allocation is beyond EOF)
> > and also update [cm]time.  This way, the application can decide what it
> > wants.
> 
> Yes, that's right.
> 
> > This will be helpful for the partial allocation scenario also. Think of the
> > case when we do not change the filesize in fallocate() and expect
> > applications/posix_fallocate() to do ftruncate() after fallocate() for this.
> > Now if fallocate() results in a partial allocation with -ENOSPC error
> > returned, applications/posix_fallocate() will not know for what length
> > ftruncate() has to be called.  :(
> 
> Well, posix_fallocate() either gets all the space or it fails. If
> you truncate to extend the file size after an ENOSPC, then that is
> a buggy implementation.
> 
> The same could be said for any application, or even the fallocate()
> call itself if it changes the filesize without having completely
> preallocated the space asked....
> 
> > Hence it may be a good idea to let the user choose whether or not to
> > atomically change the file size with preallocation. But, with more
> > flexibility there comes inconsistency in behavior, which is worth
> > considering.
> 
> We've got different modes to specify different behaviour. That's
> what the mode field was put there for in the first place - the
> interface is *designed* to support different preallocation
> behaviours....
> 
> > > > 2) For FA_UNALLOCATE mode, should the file system allow unallocation of
> > > > normal (non-preallocated) blocks (blocks allocated via regular
> > > > write/truncate operations) also (i.e. work as punch()) ?
> > > 
> > > Yes. That is the current XFS implementation for XFS_IOC_UNRESVSP, and what
> > > I did for FA_UNALLOCATE as well.
> > 
> > Ok. But, some people may not expect/like this. I think, we can keep it on
> > the backburner for a while, till other issues are sorted out.
> 
> How can it be a "backburner" issue when it defines the
> implementation?  I've already implemented something in XFS that
> sort of does what I think that the interface is supposed to do, but
> I need that interface to be nailed down before proceeding any
> further.
> 
> All I'm really interested in right now is that the fallocate
> _interface_ can be used as a *complete replacement* for the
> pre-existing XFS-specific ioctls that are already used by
> applications.  What ext4 can or can't do right now is irrelevant to
> this discussion - the interface definition needs to take priority
> over implementation....

Would you like to write up an interface definition (most likely a man
page) and post it for review, possibly with a mention of apps using
it today?

One reason for introducing the mode parameter was to allow the interface to
evolve incrementally as more options / semantic questions are proposed, so
that we don't have to make all the decisions right now. 
So it would be good to start with a *minimal* definition, even just one mode.
The rest could follow as subsequent patches, each being reviewed and debated
separately. Otherwise this discussion can drag on for a long time.
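
As a rough illustration of the two size behaviours discussed above, here is a
small user-space model in C. It is only a sketch under stated assumptions, not
kernel code: the struct, the function name model_fallocate(), and the numeric
values of FA_ALLOCATE and FA_PREALLOCATE are all hypothetical (the thread names
the modes but assigns no values). The model also treats the reserved space as
one contiguous run from offset 0 and leaves out [cm]time, and its
all-or-nothing handling of ENOSPC follows Dave's point that the size should
not change after a partial allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical values: the thread names these modes but not their numbers. */
#define FA_ALLOCATE    0x1  /* allocate; extend file size if range passes EOF */
#define FA_PREALLOCATE 0x2  /* allocate only; file size is left untouched */

/* Toy in-memory model of a file. This is not a kernel structure. */
struct model_file {
    long long size;      /* what stat() would report as st_size */
    long long reserved;  /* bytes [0, reserved) are backed by blocks */
    long long fs_free;   /* bytes still free in the model filesystem */
};

/*
 * Returns 0 on success, or -EINVAL / -ENOSPC on failure.
 * On failure nothing in *f is modified (all-or-nothing).
 */
static int model_fallocate(struct model_file *f, int mode,
                           long long offset, long long len)
{
    long long end, needed;

    assert(f != NULL);
    if (offset < 0 || len <= 0)
        return -EINVAL;
    if (mode != FA_ALLOCATE && mode != FA_PREALLOCATE)
        return -EINVAL;

    end = offset + len;
    /* Only the part of the range past the current reservation costs space. */
    needed = end > f->reserved ? end - f->reserved : 0;

    /* Refuse up front rather than allocate partially and then fail. */
    if (needed > f->fs_free)
        return -ENOSPC;

    f->fs_free -= needed;
    if (end > f->reserved)
        f->reserved = end;

    /* Only FA_ALLOCATE moves EOF, and only when the range extends past it. */
    if (mode == FA_ALLOCATE && end > f->size)
        f->size = end;

    return 0;
}
```

In this model a caller that uses FA_PREALLOCATE cannot learn from the size how
much space is reserved, while a caller that uses FA_ALLOCATE sees the size move
in the same step as the reservation, so no separate ftruncate() is needed.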

Regards
Suparna

> 
> Cheers,
> 
> Dave,
> -- 
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Suparna Bhattacharya (suparna@...ibm.com)
Linux Technology Center
IBM Software Lab, India
