linux-kernel - Re: ftruncate-mmap: pages are lost after writing to mmaped file.

Message-Id: <20090318151157.85109100.akpm@linux-foundation.org>
Date:	Wed, 18 Mar 2009 15:11:57 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Ying Han <yinghan@...gle.com>
Cc:	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>, guichaz@...il.com,
	Alex Khesin <alexk@...gle.com>,
	Mike Waychison <mikew@...gle.com>,
	Rohit Seth <rohitseth@...gle.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.

On Wed, 18 Mar 2009 12:44:08 -0700 Ying Han <yinghan@...gle.com> wrote:

> We hit this failure during an internal experiment with an
> ftruncate/mmap/write/read sequence: some pages are "lost" after
> writing to the mmaped file. In the following test case this shows up
> as count > 0.
> 
> First we deployed the test case to a group of machines and saw a
> failure rate of more than 20% on average. Then I ran a couple of
> experiments to try to reproduce it on a single machine. What I found:
> 1. With an fsync added after writing the file, I cannot reproduce the
> issue.
> 2. With memory pressure (mmap/mlock) added while the test runs in an
> infinite loop, the failure reproduces quickly. (background flushing?)
> 
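
For reference, item 1 above could look roughly like the sketch below.
The message does not show where the fsync was placed, so this assumes
it goes right after the memset; fill_fsync_and_check() and its
parameters are hypothetical names, not part of the posted test case.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the original test case: fill a
 * file-backed shared mapping with 1s, fsync the file, then count the
 * pages whose first byte reads back as 0.
 * Returns the bad-page count, or -1 on a setup or fsync error.
 */
long fill_fsync_and_check(const char *path, size_t size, size_t page_size)
{
	long bad = 0;
	size_t off;
	char *mem;
	int fd;

	unlink(path);
	fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)size) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED) {
		close(fd);
		unlink(path);
		return -1;
	}

	/* Fill the memory with 1s, as the posted test does. */
	memset(mem, 1, size);

	/* Assumed placement of the fsync from item 1: right after the write. */
	if (fsync(fd) < 0)
		bad = -1;

	/* Same check as the posted loop: a page is bad if its first byte is 0. */
	for (off = 0; bad >= 0 && off < size; off += page_size)
		if (mem[off] == 0)
			bad++;

	munmap(mem, size);
	close(fd);
	unlink(path);
	return bad;
}
```

Calling it with the values from the test case would be
fill_fsync_and_check("/root/test.mmap", 128 << 20, 4096). A return of 0
only says no bad page was seen in that run, not that the bug is absent.
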
> The "bad pages" count differs each time, from one digit to 4 or 5
> digits for a 128M ftruncated file. I also found that the bad page
> numbers are contiguous within each segment, and the total set of bad
> pages contains several segments, e.g. "1-4, 9-20, 48-50".
> (batch flushing?)
> 
> (The failure was reproduced on 2.6.29-rc8 and also happens on a
> 2.6.18 kernel. Here is the simple test case that reproduces it under
> memory pressure.)
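
The program used to generate the memory pressure is not included in
the message. A minimal sketch of one way to do it with mmap/mlock
follows; hog_memory() and release_memory() are hypothetical names, and
the fallback for a failed mlock() is an assumption, not something the
report describes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `bytes` of anonymous memory and try to pin
 * it with mlock(). *locked is set to 1 if mlock() succeeded, else 0.
 * Returns the mapping, or NULL if mmap() itself failed.
 */
void *hog_memory(size_t bytes, int *locked)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t off;
	char *p;

	p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* mlock() can fail if RLIMIT_MEMLOCK is too low for this size. */
	*locked = (mlock(p, bytes) == 0);
	if (!*locked && page > 0) {
		/* Assumed fallback: touch each page so it is at least resident. */
		for (off = 0; off < bytes; off += (size_t)page)
			p[off] = 1;
	}
	return p;
}

/* Drop the mapping; munmap() also releases any lock on the range. */
void release_memory(void *p, size_t bytes)
{
	munmap(p, bytes);
}
```

The idea would be to hold a large hog_memory() region in one process
while the test case runs in a loop in another. How much to lock is
machine dependent, and the report does not give a figure.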

Thanks.  This will be a regression - the testing I did back in the days
when I actually wrote stuff would have picked this up.

Perhaps it is a 2.6.17 thing.  Which, IIRC, is when we made the changes to
redirty pages on each write fault.  Or maybe it was something else.

Nick, Peter: I'm in .au at present, not able to build and run kernels - is
this something you'd have time to look into please?

Given the amount of time for which this bug has existed, I guess it isn't a
2.6.29 blocker, but once we've found out the cause we should have a little
post-mortem to work out how a bug of this nature has gone undetected for so
long.


> #include <sys/mman.h>
> #include <sys/types.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> 
> long kMemSize  = 128 << 20;
> int kPageSize = 4096;
> 
> int main(int argc, char **argv) {
> 	int status;
> 	int count = 0;
> 	int i;
> 	char *fname = "/root/test.mmap";
> 	char *mem;
> 
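> 	unlink(fname);
> 	int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
> 	if (fd < 0) {
> 		perror("open");
> 		return 2;
> 	}
> 	status = ftruncate(fd, kMemSize);
> 	if (status < 0) {
> 		perror("ftruncate");
> 		return 2;
> 	}
> 
> 	mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> 	if (mem == MAP_FAILED) {
> 		perror("mmap");
> 		return 2;
> 	}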
> 	// Fill the memory with 1s.
> 	memset(mem, 1, kMemSize);
> 
> 	for (i = 0; i < kMemSize; i++) {
> 		int byte_good = mem[i] != 0;
> 
> 		if (!byte_good && ((i % kPageSize) == 0)) {
> 			//printf("%d ", i / kPageSize);
> 			count++;
> 		}
> 	}
> 
> 	munmap(mem, kMemSize);
> 	close(fd);
> 	unlink(fname);
> 
> 	if (count > 0) {
> 		printf("Running %d bad page\n", count);
> 		return 1;
> 	}
> 	return 0;
> }
> 
> --Ying
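
The test case above only prints a total, with the per-page printf
commented out. To see the contiguous segments described in the report
("1-4, 9-20, 48-50"), something along these lines could be used. It is
a sketch only; format_bad_ranges() and the bad[] flag array are
hypothetical and not part of the posted code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: bad[i] != 0 marks page i as bad. Writes a
 * summary such as "1-4, 9-20, 48-50" into out (at most outsz bytes,
 * always NUL-terminated when outsz > 0).
 * Returns the number of segments written; segments that do not fit in
 * the buffer are dropped.
 */
int format_bad_ranges(const unsigned char *bad, size_t npages,
		      char *out, size_t outsz)
{
	size_t i = 0, used = 0;
	int segments = 0;

	if (outsz == 0)
		return 0;
	out[0] = '\0';

	while (i < npages) {
		size_t start, end;
		int n;

		if (!bad[i]) {
			i++;
			continue;
		}
		/* Extend the segment over every consecutive bad page. */
		start = i;
		while (i < npages && bad[i])
			i++;
		end = i - 1;

		if (start == end)
			n = snprintf(out + used, outsz - used, "%s%zu",
				     segments ? ", " : "", start);
		else
			n = snprintf(out + used, outsz - used, "%s%zu-%zu",
				     segments ? ", " : "", start, end);
		if (n < 0 || (size_t)n >= outsz - used) {
			out[used] = '\0';	/* drop the partial entry */
			break;
		}
		used += (size_t)n;
		segments++;
	}
	return segments;
}
```

In the test loop one would set bad[i / kPageSize] = 1 where count++
happens today, then print the formatted string after the loop.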