[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Wed, 6 Jun 2012 05:24:16 +0000
From: Ming Lei <Ming.Lei@...erbed.com>
To: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: ext4 corruption during unexpected power cycle in the middle of
writing
I ran the power cycle test during the middle of file writing and after bootup, I ran force fsck and found two errors (If I run fsck -p -v I don't see the errors). From what I saw I think it is file system meta data corruption. Fsck can repair it but each time I ran the same test and I hit the same issue.
I don't think it is relevant but want to point out that sda6 shares the same drive as another partition on sda(sda3) is used for the raid6 array for /var.
The same issue was found whenever barrier is on or off, and the disk drive write cache is enabled or disabled. The test result shown below is when barrier is on and disk write cache is disabled.
I use kernel version 2.6.32SL6 version. I also see the same issue on 2.6.9 based kernel on the same hardware with ext3 file system.
My question is:
1) Is the issue caused from something unique in my box? Configuration error?
2) Is it possible my version of fsck reported false errors?
3) Is this a known issue? ? Is it a kernel bug?
4) How do I find what's wrong?
Thanks.
[admin@...ef-csa81 ~]# uname -a
Linux chief-csa81 2.6.32 #1 SMP Tue May 29 00:36:37 PDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[admin@...ef-csa81 ~]# mount
/dev/md1 on /var type ext3 (rw,relatime,user_xattr,acl,barrier=1,data=ordered)
/dev/sda6 on /data type ext4 (rw,nodiratime,relatime,barrier=1,data=ordered)
[admin@...ef-csa81 ~]# sdparm /dev/sda
/dev/sda: ATA WDC WD5003ABYX-0 1S01
Read write error recovery mode page:
AWRE 1 [cha: n]
ARRE 0 [cha: n]
PER 0 [cha: n]
Caching (SBC) mode page:
WCE 0 [cha: y]
RCD 0 [cha: n]
Control mode page:
SWP 0 [cha: n]
Informational exceptions control mode page:
EWASC 0 [cha: n]
DEXCPT 0 [cha: n]
MRIE 6 [cha: n]
[admin@...ef-csa81 ~]# cbt -p /data -x 1337 -s 400GB
---------------------------------------------------
Build info
Version: v2.2.5
Time: May 29 2012 01:51:41
Runtime info
Sysname: Linux
Nodename: chief-csa81
Release: 2.6.32
Version: #1 SMP Tue May 29 00:36:37 PDT 2012
Machine: x86_64
Time: Tue Jun 5 15:02:55 2012
Invocation
cbt -p /data -x 1337 -s 400GB
Configuration
General Options:
-r,--report-after-bytes = 500MB
-R,--report-after-seconds = 0
Output Size/Dimensions:
-p,--output-path = /data
-s,--output-size = 400GB
-n,--max-streams = 1
-S,--output-file-size = 18GB
-w,--write-size =
--oow-max-distance =
Data Options:
--data-mode = s
-x,--random-seed = 1337
-z,--lz-factor = 3.5425
-D,--ddup-factor = 1.029
-c,--churn-pct = 0
-X,--churn-pct-rand-seed = 1338933774
-A,--ddup-fanout-algorithm = CLUSTERED_RUNS
--ddup-mean-run-size-in-segments = 1.3
--ddup-mean-unique-run-size-in-segments = 1200
--ddup-run-size-multiple = 64
--ddup-min-distance-from-last-ddup = 120KB
--ddup-min-distance-from-cursor = 160MB
--ddup-ave-dist-for-nearby-segs = 120KB
--ddup-random-jump-every-n-ddup-runs = 12.6344
Segmentation Options:
--type = rk
--rk-seg-size-min = 512
--rk-seg-size-max = 32KB
--rk-seg-size-ave = 8KB
--rk-seg-size-dist = exponential
Number of cut points (-C) being used: 3045800
---------------------------------------------------
Number of files to write: 23
New_file: 0, stream 0, /data/cbt_output.0, size: 18GB
Total: 2.284 sec (0:00:02), 500.001 MB, 218.926 MBps - Curr: 2.284 sec, 500.001 MB, 218.926 MBps - Tue Jun 5 15:02:57 2012
Total: 4.822 sec (0:00:04), 1000.002 MB, 207.397 MBps - Curr: 2.538 sec, 500.001 MB, 197.022 MBps - Tue Jun 5 15:03:00 2012
Total: 8.275 sec (0:00:08), 1500.004 MB, 181.263 MBps - Curr: 3.454 sec, 500.002 MB, 144.776 MBps - Tue Jun 5 15:03:03 2012
====================== Power cycle during the writing ============================
========== on next boot, first fsck found 2 errors and then next fsck passed ===============================
[admin@...ef-csa81 ~]# fsck.ext4 -v -f -y /dev/sda6
e2fsck 1.42 (29-Nov-2011)
/dev/sda6: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (118366120, counted=76269471).
Fix? yes
Free inodes count wrong (30081013, counted=30081004).
Fix? yes
/dev/sda6: ***** FILE SYSTEM WAS MODIFIED *****
20 inodes used (0.00%)
0 non-contiguous files (0.0%)
0 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 3/9
44035281 blocks used (36.60%)
0 bad blocks
10 large files
9 regular files
2 directories
0 character device files
0 block device files
0 fifos
0 links
0 symbolic links (0 fast symbolic links)
0 sockets
--------
11 files
[admin@...ef-csa81 ~]# fsck.ext4 -v -f -y /dev/sda6
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
20 inodes used (0.00%)
0 non-contiguous files (0.0%)
0 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 3/9
44035281 blocks used (36.60%)
0 bad blocks
10 large files
9 regular files
2 directories
0 character device files
0 block device files
0 fifos
0 links
0 symbolic links (0 fast symbolic links)
0 sockets
--------
11 files
====== Hardware Info =====================================
The box has 2 Quad-Core AMD Opteron(tm) Processor 2378 stepping 02 and LSI sata controller.
Fusion MPT base driver 4.28.00.00
Copyright (c) 1999-2010 LSI Corporation
Fusion MPT SPI Host driver 4.28.00.00
Fusion MPT SAS Host driver 4.28.00.00
ACPI: PCI Interrupt Link [LNEC] enabled at IRQ 17
mptsas 0000:02:00.0: PCI INT A -> Link[LNEC] -> GSI 17 (level, low) -> IRQ 17
mptbase: ioc0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total memory = 8114680 kB
mptbase: ioc0: Initiating bringup
ioc0: LSISAS1068E B2: Capabilities={Initiator}
scsi0 : ioc0: LSISAS1068E B2, FwRev=011f0000h, Ports=1, MaxQ=277, IRQ=17
mptsas: ioc0: add expander: num_phys 12, sas_addr (0x3000000403f)
mptsas: ioc0: add expander: num_phys 12, sas_addr (0x3000000427f)
mptsas: ioc0: attaching sata device: fw_channel 0, fw_id 37, phy 0, sas_addr 0x30000004030
scsi target0:0:0: mptsas: ioc0: add device: fw_channel 0, fw_id 37, phy 0, sas_addr 0x30000004030scsi 0:0:0:0: Direct-Access ATA WDC WD5003ABYX-0 1S01 PQ: 0 ANSI: 5
scsi 0:0:0:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
sdev dma_alignment 511
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: [sda] Attached SCSI disk
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists