Message-Id: <20200320053451.B7AD0AE04D@d06av26.portsmouth.uk.ibm.com>
Date: Fri, 20 Mar 2020 11:04:50 +0530
From: Ritesh Harjani <riteshh@...ux.ibm.com>
To: linux-ext4@...r.kernel.org, "Theodore Y. Ts'o" <tytso@....edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Jan Kara <jack@...e.cz>
Subject: Re: Ext4 corruption with VM images as 3 > drop_caches
On 3/19/20 6:54 PM, Ritesh Harjani wrote:
>
>
> On 3/18/20 9:17 AM, Aneesh Kumar K.V wrote:
>> Hi,
>>
>> With a new VM install, I am finding corruption in the VM image if I
>> follow up the install with echo 3 > /proc/sys/vm/drop_caches
>>
>> The file system reports the error below.
>>
>> Begin: Running /scripts/local-bottom ... done.
>> Begin: Running /scripts/init-bottom ...
>> [ 4.916017] EXT4-fs error (device vda2): ext4_lookup:1700: inode
>> #787185: comm sh: iget: checksum invalid
>> done.
>> [ 5.244312] EXT4-fs error (device vda2): ext4_lookup:1700: inode
>> #917954: comm init: iget: checksum invalid
>> [ 5.257246] EXT4-fs error (device vda2): ext4_lookup:1700: inode
>> #917954: comm init: iget: checksum invalid
>> /sbin/init: error while loading shared libraries: libc.so.6: cannot
>> open shared object file: Error 74
>> [ 5.271207] Kernel panic - not syncing: Attempted to kill init!
>> exitcode=0x00007f00
>>
>> And debugfs reports
>>
>> debugfs: stat <917954>
>> Inode: 917954 Type: bad type Mode: 0000 Flags: 0x0
>> Generation: 0 Version: 0x00000000
>> User: 0 Group: 0 Size: 0
>> File ACL: 0
>> Links: 0 Blockcount: 0
>> Fragment: Address: 0 Number: 0 Size: 0
>> ctime: 0x00000000 -- Wed Dec 31 18:00:00 1969
>> atime: 0x00000000 -- Wed Dec 31 18:00:00 1969
>> mtime: 0x00000000 -- Wed Dec 31 18:00:00 1969
>> Size of extra inode fields: 0
>> Inode checksum: 0x00000000
>> BLOCKS:
>> debugfs:
>>
>> Bisecting this identifies
>> commit 244adf6426ee31a83f397b700d964cff12a247d3 ("ext4: make
>> dioread_nolock the default")
>> as bad. If I revert it on top of Linus'
>> upstream (fb33c6510d5595144d585aa194d377cf74d31911),
>> I don't hit the corruption anymore.
>
> I tried replicating this and could easily reproduce it on a Power box.
> I also tried on x86, but could not reproduce it there.
> One difference on Power could be that the page size is 64K while the
> fs block size is 4K.
>
> It looks like the guest qemu image file is not properly written
> back after the host does echo 3 > drop_caches (correct me if this is
> not the case).
Ok. So I retried this with the "cache=directsync" parameter passed to
the drive file. This parameter makes qemu bypass the host-side page
cache. With this parameter, I don't see the issue on the Power box.
-ritesh
>
> I tried replicating it via the test below, but could not reproduce it.
>
> Any idea what kind of unit test could be written for this?
> I am not sure how exactly qemu is writing to its image file.
>
>
> 1. Create 2 files, "mmap-file" and "mmap-data".
> 2. "mmap-file" is a 2GB sparse file, written through mmap. At some
> random offsets (tried with both 64KB-aligned and 4KB-aligned offsets),
> write a pagesize/blocksize amount of a known data pattern.
> 3. These offsets (which are pagesize/blocksize aligned) are recorded
> into the "mmap-data" file via normal read/write calls.
> 4. After writing to both files, munmap "mmap-file" and close both
> files.
> 5. Then do echo 3 > drop_caches.
> 6. Then, in the verify phase, use the offsets recorded in "mmap-data"
> to read "mmap-file" and verify whether its contents are correct.
>
> With that, I could not reproduce this issue.
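For reference, the steps quoted above can be sketched roughly as
below. This is only an illustration in Python, not the test that was
actually run: the helper names (pattern_for, write_phase, verify_phase,
drop_caches), the per-offset pattern and the on-disk layout of
"mmap-data" (packed little-endian 64-bit offsets) are all assumptions.
No msync or sync is issued before dropping caches, since the
description does not mention one.

```python
import mmap
import random
import struct

MAGIC = 0xA5A5A5A5A5A5A5A5  # assumed constant; keeps the pattern non-zero


def pattern_for(offset: int, length: int) -> bytes:
    """Known pattern derived from the offset, so verify can recompute it."""
    seed = struct.pack("<Q", offset ^ MAGIC)
    return (seed * (length // len(seed) + 1))[:length]


def write_phase(map_path, data_path, file_size, chunk, count, seed=0):
    """Steps 1-4: write patterns through mmap, record offsets, unmap."""
    rng = random.Random(seed)
    slots = file_size // chunk
    # Distinct, chunk-aligned offsets (chunk = page size or block size).
    offsets = [s * chunk for s in sorted(rng.sample(range(slots), count))]

    # Create the sparse file: truncate extends it without allocating blocks.
    with open(map_path, "wb") as f:
        f.truncate(file_size)

    with open(map_path, "r+b") as f:
        # Default access on Unix is a shared, read-write mapping.
        with mmap.mmap(f.fileno(), file_size) as m:
            for off in offsets:
                m[off:off + chunk] = pattern_for(off, chunk)
        # Leaving the inner "with" unmaps the file (munmap).

    # Record the offsets with ordinary write() calls.
    with open(data_path, "wb") as d:
        for off in offsets:
            d.write(struct.pack("<Q", off))
    return offsets


def drop_caches():
    """Step 5: equivalent of 'echo 3 > /proc/sys/vm/drop_caches' (root)."""
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def verify_phase(map_path, data_path, chunk):
    """Step 6: re-read every recorded offset; return mismatching offsets."""
    with open(data_path, "rb") as d:
        offsets = [o for (o,) in struct.iter_unpack("<Q", d.read())]
    bad = []
    with open(map_path, "rb") as f:
        for off in offsets:
            f.seek(off)
            if f.read(chunk) != pattern_for(off, chunk):
                bad.append(off)
    return bad
```

To mirror the description, one would call write_phase("mmap-file",
"mmap-data", 2 << 30, 65536, n) for some n, then drop_caches() as root,
then verify_phase("mmap-file", "mmap-data", 65536), and repeat with a
4096-byte chunk. An empty list from verify_phase means no mismatch was
found.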
>
>
> -ritesh
>
>