Message-ID: <ea373b5.1200b.16241b1deaf.Coremail.0121167@fudan.edu.cn>
Date: Tue, 20 Mar 2018 12:36:38 +0800 (GMT+08:00)
From: "Jidong Xiao" <0121167@...an.edu.cn>
To: "Theodore Y. Ts'o" <tytso@....edu>
Cc: "Andreas Dilger" <adilger@...ger.ca>, linux-ext4@...r.kernel.org,
"Jidong Xiao" <jidong.xiao@...il.com>
Subject: Re: Re: No data blocks at all in my ext4 journal
> -----Original Messages-----
> From: "Theodore Y. Ts'o" <tytso@....edu>
> Sent Time: 2018-03-20 12:12:31 (Tuesday)
> To: "Jidong Xiao" <0121167@...an.edu.cn>
> Cc: "Andreas Dilger" <adilger@...ger.ca>, linux-ext4@...r.kernel.org, "Jidong Xiao" <jidong.xiao@...il.com>
> Subject: Re: No data blocks at all in my ext4 journal
>
> First of all, can you try upgrading to the very latest version of
> e2fsprogs? You are using a very ancient version of e2fsprogs
> (1.42.13.wc5) which has also been patched for Lustre. If you use
> e2fsprogs 1.44.0, then at least we'll be testing on roughly the same
> version of e2fsprogs, just in case the issue is caused by how debugfs
> logdump works.
>
> Secondly, the file system is a really ancient one, with a very tiny
> journal (32M). These days we use a default of a much larger journal,
> which is shown to provide much better performance. (See section 4.1
> of [1].)
>
> [1] https://www.usenix.org/system/files/conference/fast17/fast17-aghayev.pdf
>
> It looks like you are looking at a live file system, and it's possible
> that a combination of a small journal, journal wrapping, and an
> old version of debugfs/logdump is causing the confusion.
>
> So the other thing I would ask is that you experiment on something
> other than your live root file system, so you can run a more controlled
> experiment. To that end, please install kvm-xfstests or
> gce-xfstests[2]. Quick start instructions for kvm-xfstests are
> available at [3].
>
>
> [2] https://thunk.org/gce-xfstests
> [3] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
>
> This will allow you to run a controlled experiment, something like this:
>
> % kvm-xfstests --kernel /build/ext4-4.4 shell
> ....
> root@...-xfstests:~# mke2fs -Fq -t ext4 /dev/vdc
> /dev/vdc contains a ext4 file system
> last mounted on Mon Jan 1 10:52:47 2018
> root@...-xfstests:~# mount /dev/vdc
> root@...-xfstests:~# cp -r xfstests /vdc ; sync
> root@...-xfstests:~# C-a x <==== type control-A, followed by x to abort QEMU
> QEMU: Terminated
>
> % debugfs -R "logdump -ac" kvm-xfstests/disks/vdc > /tmp/logdump.out
> debugfs 1.44.0 (7-Mar-2018)
> % less /tmp/logdump.out
>
>
> This means you're using a standard test environment. You can use a
> kernel built from upstream sources (detailed instructions for doing
> this can be found at [4]), and the kvm-xfstests environment uses a
> standard Debian environment with a stock e2fsprogs (no random
> uncontrolled patches from Red Hat Enterprise Linux, and no e2fsprogs
> with random Lustre patches). You'll also be looking at an aborted file
> system, as opposed to a file system which is live and potentially
> being modified in real time while you look at it with your tools.
>
> [4] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md
>
> This will be much easier than my trying to figure out what's going on
> on your system. I am suspicious of the version of e2fsprogs, and I'm
> suspicious of the fact that you are trying to examine a file system
> while it is mounted and potentially being modified, etc.
>
> I can tell you that using a standard upstream 4.4 kernel, and a
> standard, unpatched, non-prehistoric version of e2fsprogs, probing a
> file system which is aborted and not being modified while I look at
> it, debugfs's logdump -ac shows me what I would expect.
>
> And if a RHEL kernel had a journal with the results that you had, if
> you pulled the power, and the journal was replayed, it would corrupt
> the whole file system. Since Red Hat Enterprise Linux users aren't
> complaining of completely destroyed file systems after a power failure,
> I *know* your results must be somehow suspect. How, I'm not sure.
> But instead of trying to debug your random environment, why don't you
> try using a standard development/test environment?
>
> Regards,
>
> - Ted
Hi, Ted,
Thanks very much. I will follow your instructions and post an update here.
But my journal size is not 32MB; I think it is 128MB. 0x00008000 = 32768, meaning 32768 blocks, and the block size is 4K, so 32768 * 4K = 128MB.
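The arithmetic above can be double-checked with a minimal shell sketch. The helper name journal_size_mib is hypothetical (it is not part of e2fsprogs); it only multiplies a block count by a block size, assuming the 0x00008000 value really is the journal's block count and the block size is 4096 bytes:

```shell
# Hypothetical helper (not an e2fsprogs tool): journal size in MiB,
# given the journal block count and the block size in bytes.
journal_size_mib() {
    blocks=$(( $1 ))       # block count; hex such as 0x00008000 is accepted
    block_size=$(( $2 ))   # bytes per block (assumed 4096 here)
    echo $(( blocks * block_size / 1024 / 1024 ))
}

# 0x00008000 = 32768 blocks of 4096 bytes each
journal_size_mib 0x00008000 4096   # prints 128
```

For comparison, a 32MB journal at the same block size would be only 8192 blocks.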
-Jidong