Date: Wed, 07 Jul 2010 09:26:57 -0500
From: Eric Sandeen <sandeen@...hat.com>
To: Jens Axboe <jaxboe@...ionio.com>
CC: tytso@....edu, adilger@....com, linux-ext4@...r.kernel.org
Subject: Re: fio test triggering bad data on ext4

Jens Axboe wrote:
> Hi,
>
> I was writing a small fio job file to do writes and read verifies on a
> device. It forks 32 processes, each writing randomly to 4 files with a
> block size between 4k and 16k. When it has written 1024 of those blocks,
> it'll verify the oldest 512 of them. Each block carries a checksum for
> every 512 bytes. It uses libaio and O_DIRECT.
>
> It works on ext2 and btrfs. I haven't run it to completion yet, but they
> survive 15-20 minutes just fine. ext4 doesn't even go a full minute
> before this triggers:
>
> Bad verify header 0 at 10137600
> fio: pid=9943, err=84/file:io_u.c:1212, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character
>
> writers: (groupid=0, jobs=32): err=84 (file:io_u.c:1212, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=9943

FYI:

I asked Jens to test hch's and Jiaying's aio completion patches with this,
and apparently those fixed this problem for him.

-Eric

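
A side note on the error string in the fio output quoted above: err=84
looks like a plain errno value, and on Linux 84 is EILSEQ. The "Invalid
or incomplete multibyte or wide character" text is then presumably just
the libc message for that code, not a sign of a character-set problem.
A minimal sketch for decoding such a value (describe_errno is a
hypothetical helper name, not part of fio):

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for an errno value on the current platform."""
    # errno.errorcode maps numeric codes to symbolic names, e.g. 2 -> 'ENOENT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the C library's message for the code.
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 84 is the value fio printed; the numbering is platform-specific,
    # so the result is only meaningful on the same kind of system.
    print(describe_errno(84))
```

The exact message text depends on the C library in use, so the wording
may differ from what fio printed on Jens's machine.
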
> which tells us that where we expected to find the correct verify magic
> in the header, it was all zeroes. The job file used is below, and to
> reproduce you want to use the latest fio (1.40) since some earlier
> versions don't do verify_interval properly for non-pattern verifies. You
> can get fio here:
>
> http://brick.kernel.dk/snaps/fio-1.40.tar.gz
>
> or from git at:
>
> git://git.kernel.dk/fio.git
>
> The kernel used is 2.6.35-rc3 and I ran this on a raid0 that had 8 SSD
> drives.
>
> --- snip job file ---
>
> [global]
> direct=1
> group_reporting=1
> exitall
> runtime=4h
> time_based=1
>
> # writers, will repeatedly randomly write and verify data
> [writers]
> rw=randwrite
> bsrange=4k-16k
> ioengine=libaio
> iodepth=4
> directory=/data
> verify=crc32c
> verify_backlog=1024
> verify_backlog_batch=512
> verify_interval=512
> size=512m
> nrfiles=4
> filesize=64m-256m
> numjobs=32
> create_serialize=0
>
> --- snip job file ---
>
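
To illustrate what "Bad verify header 0" means in the report above, here
is a rough sketch of per-interval verification: every 512 bytes of a
block gets a small header holding a magic value and a checksum of the
rest of the interval. This is not fio's actual on-disk layout. The MAGIC
value, the header format, and the function names are made up for
illustration, and zlib's CRC32 stands in for crc32c, which the Python
standard library does not provide.

```python
import struct
import zlib

INTERVAL = 512                 # matches verify_interval=512 in the job file
MAGIC = 0xACCA                 # hypothetical magic value, not fio's
HEADER = struct.Struct("<HI")  # assumed layout: 16-bit magic, 32-bit checksum


def make_block(size, seed):
    """Build a block of `size` bytes with a header on every interval."""
    out = bytearray()
    for i in range(size // INTERVAL):
        body = bytes((seed + i + j) & 0xFF
                     for j in range(INTERVAL - HEADER.size))
        out += HEADER.pack(MAGIC, zlib.crc32(body)) + body
    return bytes(out)


def verify_block(block):
    """Return None if every interval checks out, else (offset, reason)."""
    for off in range(0, len(block), INTERVAL):
        chunk = block[off:off + INTERVAL]
        magic, crc = HEADER.unpack_from(chunk)
        if magic != MAGIC:
            # Zeroed data lands here: the magic reads back as 0.
            return off, f"bad verify header {magic:x}"
        if crc != zlib.crc32(chunk[HEADER.size:]):
            return off, "bad checksum"
    return None


if __name__ == "__main__":
    good = make_block(4096, seed=1)
    print(verify_block(good))
    # Simulate reading back an interval that was never written (all zeroes).
    stale = good[:512] + bytes(512) + good[1024:]
    print(verify_block(stale))
```

With this scheme a zero-filled interval fails on the magic check before
the checksum is ever compared, which is consistent with a read returning
zeroes where written data was expected.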