linux-ext4 - Re: Data loss/corruption when using fallocate/ftruncate.

Message-Id: <1249938639.4136.6.camel@mingming-laptop>
Date:	Mon, 10 Aug 2009 14:10:39 -0700
From:	Mingming <cmm@...ibm.com>
To:	Frank Mayhar <fmayhar@...gle.com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: Data loss/corruption when using fallocate/ftruncate.

On Mon, 2009-08-10 at 11:59 -0700, Frank Mayhar wrote:
> Hello again, folks.  We've got an app that needs to use O_DIRECT for
> performance and is using fallocate() to make sure the files are all in
> one extent.  Unfortunately the end size isn't always the fallocated size
> so it has to do a truncate when it's done; the sequence is generally:
> 
> 	create(file)
> 	fallocate(file, KEEP_SIZE, 0, maxlen)
> 	write/write/write/write...
> 	fallocate(file, 0, 0, maxlen-minus a bit)
> 	ftruncate(file, actual-len)
> 
> We've been seeing some of these files end up all or partly zero after
> (but not before) the truncate.  After further analysis, it's clear that
> the last extent (possibly the only extent) is being marked uninit for
> some reason.  The actual blocks on disk are nonzero but due to the
> extent being marked uninit they are being read as zero.
> 
> Note that this isn't easy to reproduce; lots of other stuff is going on
> when this happens.  Our feeling is that there's a race somewhere, quite
> possibly between fallocate and ftruncate, but it's not clear.  Certainly
> a single-threaded application doesn't see this, nor does an application
> that uses mutexes to serialize access to the file.
> 
> This is a heads-up to point out a real problem.  We're still analyzing
> and trying to track down the bug but it may take a little while.


Which kernel are you running? Two months ago there was a similar data
"loss" issue, caused by mismarking a previously-preallocated-but-later-filled
chunk as uninitialized after truncate. The fix for it has been in since
2.6.31-rc1:
http://lists.openwall.net/linux-ext4/2009/06/10/30
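
For anyone who wants to check a given kernel, the sequence quoted above
can be sketched in C roughly as follows. This is only a single-threaded
sketch under several assumptions, not the original application: the
function names, the 4096-byte "minus a bit" slack and the 0xAB fill
pattern are all made up for illustration, O_DIRECT is left out to avoid
its alignment requirements, and a filesystem that cannot preallocate
(EOPNOTSUPP) is tolerated. Since the report says a single-threaded
program does not see the problem, this shows the call sequence and a
way to check the result rather than a reliable reproducer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SLACK 4096 /* assumption: stands in for "minus a bit" */

/* Hypothetical helper: a filesystem without preallocation is not an error. */
static int try_fallocate(int fd, int mode, off_t off, off_t len)
{
    if (fallocate(fd, mode, off, len) == 0)
        return 0;
    return (errno == EOPNOTSUPP) ? 0 : -1;
}

/* create / fallocate(KEEP_SIZE) / write... / fallocate(0) / ftruncate */
int prealloc_write_truncate(const char *path, off_t maxlen, off_t actual_len)
{
    char buf[4096];
    off_t done = 0;
    int fd;

    assert(maxlen > SLACK && actual_len <= maxlen - SLACK);

    fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    /* Preallocate without changing the visible file size. */
    if (try_fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, maxlen) < 0)
        goto fail;

    /* Fill the first actual_len bytes with a nonzero pattern. */
    memset(buf, 0xAB, sizeof buf);
    while (done < actual_len) {
        size_t want = sizeof buf;
        ssize_t n;

        if ((off_t)want > actual_len - done)
            want = (size_t)(actual_len - done);
        n = write(fd, buf, want);
        if (n <= 0)
            goto fail;
        done += n;
    }

    /* Second fallocate without KEEP_SIZE, then cut back to the real length. */
    if (try_fallocate(fd, 0, 0, maxlen - SLACK) < 0)
        goto fail;
    if (ftruncate(fd, actual_len) < 0)
        goto fail;

    return close(fd);
fail:
    close(fd);
    return -1;
}

/* Count zero bytes in the first len bytes of path; -1 on error or short file. */
long count_zero_bytes(const char *path, off_t len)
{
    char buf[4096];
    long zeros = 0;
    off_t seen = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while (seen < len) {
        ssize_t i, n = read(fd, buf, sizeof buf);

        if (n <= 0) {
            close(fd);
            return -1;
        }
        for (i = 0; i < n && seen < len; i++, seen++)
            if (buf[i] == 0)
                zeros++;
    }
    close(fd);
    return zeros;
}
```

Calling prealloc_write_truncate() and then count_zero_bytes() on the
same path with the same actual length should give 0 on an unaffected
kernel; any other count means written data is being read back as zeros.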

Mingming

