lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090608190013.GB4363@fsbox>
Date:	Mon, 8 Jun 2009 12:00:13 -0700
From:	Valerie Aurora <vaurora@...hat.com>
To:	Nick Dokos <nicholas.dokos@...com>
Cc:	"Theodore Ts'o" <tytso@....edu>, linux-ext4@...r.kernel.org,
	Eric Sandeen <sandeen@...hat.com>
Subject: Re: Some 64-bit tests

On Mon, Jun 08, 2009 at 09:57:21AM -0400, Nick Dokos wrote:
> I built and ran e2fsprogs bits from the pu branch from last week
> (not including the changes that you made yesterday.)
> 
> The basic cycle of mkfs/fill up the fs/fsck seemed to work without
> fatal errors but there are several problematic points.

That's great news!  Thanks.

> The mkfs looked like this:
> 
> ,----
> | $ sudo time mke2fs -q -t ext4 -O ^resize_inode -E stride=32,stripe-width=512 /dev/mapper/bigvg-bigvol
> | 64.02user 722.30system 13:14.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
> | 1240inputs+1026586096outputs (6major+317202minor)pagefaults 0swaps
> `----
> 
> I then ran the Lustre test that Andreas posted:
> 
> ,----
> | $ sudo time ~/src/tools/lustre/liverfs -l -r -w /mnt
> | Timestamp: 1243984976
> | -- 0:bash -- time-stamp -- Jun/02/09 19:24:49 --
> | -- 0:bash -- time-stamp -- Jun/03/09  9:42:50 --
> | write File name: /mnt/dir00240/file020          
> | write complete
> | 
> | liverfs: writing /mnt/liverfs.filecount failed :No space left on device
> | -- 0:bash -- time-stamp -- Jun/03/09  9:44:41 --
> | -- 0:bash -- time-stamp -- Jun/03/09 12:11:14 --
> | 
> | -- 0:bash -- time-stamp -- Jun/03/09 12:13:10 --
> | -- 0:bash -- time-stamp -- Jun/04/09  2:39:01 --
> | 374.48user 87720.31system 31:16:05elapsed 78%CPU (0avgtext+0avgdata 0maxresident)k
> | 64604538992inputs+64670728952outputs (3major+460minor)pagefaults 0swaps
> `----
> 
> roughly 14 hours to write and 17 hours to read everything back (the
> ENOSPC error message is an artifact of the program and does not affect
> the rest of the run). liverfs performs some consistency checking on the
> contents of the files, so the fact that it did not find anything wrong
> is encouraging.
> 
> It created 241 directories, each with 32 4GiB files in it (except the last
> one, which had 20 files). That comes out to about 30TiB which is OK.
> 
> The fsck looks like this:
> 
> ,----
> | root@...fter:~/src/tests/2009/06-03# e2fsck -t -t -n -f /dev/mapper/bigvg-bigvol
> | e2fsck 1.41.6 (30-May-2009)
> | Pass 1: Checking inodes, blocks, and sizes
> | Pass 1: Memory used: 31180k/18014398507629424k (31004k/177k), time: 384.17/294.25/ 2.24
> | Pass 1: I/O read: 63MB, write: 0MB, rate: 0.16MB/s
> | Pass 2: Checking directory structure
> | Pass 2: Memory used: 31180k/18014398508200200k (30993k/188k), time:  1.00/ 0.40/ 0.49
> | Pass 2: I/O read: 1MB, write: 0MB, rate: 1.00MB/s
> | Pass 3: Checking directory connectivity
> | Peak memory: Memory used: 31180k/18014398508450540k (30993k/188k), time: 389.75/298.39/ 3.52
> | Pass 3: Memory used: 31180k/18014398508200200k (30993k/188k), time:  0.28/ 0.12/ 0.16
> | Pass 3: I/O read: 1MB, write: 0MB, rate: 3.53MB/s
> | Pass 4: Checking reference counts
> | Pass 4: Memory used: 31180k/1520628k (30993k/188k), time: 70.32/70.17/ 0.13
> | Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
> | Pass 5: Checking group summary information
> | Pass 5: Memory used: 31212k/1270288k (30993k/220k), time: 409.82/270.69/ 5.29
> | Pass 5: I/O read: 979MB, write: 0MB, rate: 2.39MB/s
> | /dev/mapper/bigvg-bigvol: 7954/2050768896 files (0.0% non-contiguous), 8203066502/8203075584 blocks
> | Memory used: 31212k/1270288k (30993k/220k), time: 869.92/639.26/ 8.96
> | I/O read: 1058MB, write: 0MB, rate: 1.22MB/s
> | 
> | real	14m31.299s
> | user	10m39.257s
> | sys	0m10.336s
> `----
> 
> The "-t -t" part of the reporting may be truncating large quantities,
> and the "peaK" and "pass 3" memory seem bogus:
> 
>   Peak memory: Memory used: 31180k/18014398508450540k (30993k/188k), time: 389.75/298.39/ 3.52
>   Pass 3: Memory used: 31180k/18014398508200200k (30993k/188k), time:  0.28/ 0.12/ 0.16
> 
> The box has "only" 256GiB of memory and about 36GB of swap.

Part of this can be explained by overflow/wraparound/formatting bugs.
The bogus enormously large values look more like addresses than counters:

[val@...ox ~]$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
obase=16
18014398507629424
3FFFFFFFE3BB70
18014398508200200
3FFFFFFFEC7108
18014398508450540
3FFFFFFFF042EC
18014398508200200
3FFFFFFFEC7108

> In addition, filefrag seems to have some problems. It reports
> that every file has about 512 extents (most of them exactly 512, but a
> few with less than that -- as little as 205 -- and a few more with more
> than that -- as much as 1155. Since the program is single threaded, and
> nothing else is happening on the file system, I (naively?) expected
> maximal extents allocated (iiuc, that's 128MiB - so I'd expect 32
> extents for most of the files).

Eric Sandeen (cc'd) is who I usually send ext4 file fragmentation
problems to.  In my experience, ext4 never allocates just one extent
for a file, but always exactly 512 sounds interesting.  Eric?

> filefrag -v has problems:
> 
> # filefrag -v file010
> Filesystem type is: ef53
> File size of file010 is 4294967296 (1048576 blocks, blocksize 4096)
>  ext logical  physical  expected length flags
>    0       0  40931328             2048 
>    1    2048  40951808  40933375   2048 
>    2    4096  40970240  40953855   2048 
>    3    6144  40988672  40972287   2048 
>    4    8192  41007104  40990719   2048 
>    5   10240  41027584  41009151   2048 
>  ...   .....  ........  ........   ....
> 
>  217 1034240  49362944  49348607   2048 
>  218 1036288  49379328  49364991   2048 
>  219 1038336  49397760  49381375   2048 
>  220 1040384  49414144  49399807   2048 
>  221 1042432  49430528  49416191   2048 
>  222 1044480  49446912  49432575   2048 
>  223 1046528  49463296  49448959   2048 eof
> file010: 224 extents found
> 
> # filefrag file010
> file010: 512 extents found

That ought to help a lot narrowing down the bug.

Thanks,

-VAL
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ