linux-ext4 - Re: ext4 error when testing virtio-scsi & vhost-scsi

Message-ID: <CAMj5BkicJhB1Bk6q_ZT_nMKSPWf1XEGchtnGnMPFy=O_0x91hA@mail.gmail.com>
Date:	Tue, 19 Jul 2016 16:21:43 +0800
From:	Zhangfei Gao <zhangfei.gao@...il.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	kvm@...r.kernel.org, qemu-devel@...gnu.org,
	target-devel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: ext4 error when testing virtio-scsi & vhost-scsi

On Tue, Jul 19, 2016 at 3:56 PM, Zhangfei Gao <zhangfei.gao@...il.com> wrote:
> Dear Ted
>
> On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <tytso@....edu> wrote:
>> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>>> Some update:
>>>
>>> Testing with ext2 shows no problem with iblock.
>>> Testing with ext4, ext4_mb_generate_buddy reports an error when
>>> removing files after reboot.
>>>
>>>
>>> root@(none)$ rm test
>>> [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 18, block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>>> [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
>>>
>>> Any special notes of using ext4 in qemu?
>>
>> Ext4 has more runtime consistency checking than ext2.  So just because
>> ext4 complains doesn't mean that there isn't a problem with the file
>> system; it just means that ext4 is more likely to notice before you
>> lose user data.
>>
>> So if you test with ext2, try running e2fsck afterwards, to make sure
>> the file system is consistent.
>>
>> Given that I'm regularly testing ext4 using kvm, and I haven't seen
>> anything like this in a very long time, I suspect the problem is with
>> your SCSI code, and not with ext4.
>>
>
> Do you know what could cause this error?
>
> I have tried 4.7-rc2; the same issue exists.
> It can be reproduced with both fileio and iblock as the backstore.
> It happens more easily in qemu with this sequence:
> qemu -> mount -> dd xx -> umount -> mount -> rm xx; then the error may
> happen, with no need to reboot.
>
> ramdisk does not trigger the error, since it only does malloc and memcpy
> and does not go through the blk layer.
>
> I also tried creating a file in /tmp and using it as the fileio backstore;
> the error reproduces there as well. So no real device is involved.
>
> For example:
> cd /tmp
> dd if=/dev/zero of=test bs=1M count=1024; sync;
> targetcli
> #targetcli
> (targetcli) /> cd backstores/fileio
> (targetcli) /> create name=file_backend file_or_dev=/tmp/test size=1G
> (targetcli) /> cd /vhost
> (targetcli) /> create wwn=naa.60014052cc816bf4
> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
> (targetcli) /> create /backstores/fileio/file_backend
> (targetcli) /> cd /
> (targetcli) /> saveconfig
> (targetcli) /> exit
>
> /work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
>     -enable-kvm -nographic -kernel Image \
>     -device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
>     -m 512 -M virt -cpu host \
>     -append "earlyprintk console=ttyAMA0 mem=512M"
>
> in qemu:
> mkfs.ext4 /dev/sda
> mount /dev/sda /mnt/
> sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;
>
> With this dd test, errors occur.
> The log looks like this:
> root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync; date;
> [ 1789.917963] sbc_parse_cdb cdb[0]=0x35
> [ 1789.922000] fd_execute_sync_cache immed=0
> Tue Jul 19 07:26:12 UTC 2016
> [  200.712879] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
> [ 1790.191770] sbc_parse_cdb cdb[0]=0x2a
> [ 1790.198382]  fd_execute_rw
> [  200.729001] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
> [ 1790.207843] sbc_parse_cdb cdb[0]=0x2a
> [ 1790.214495]  fd_execute_rw
>
> It looks like the error usually happens after SYNCHRONIZE CACHE, but I am
> not sure that it always happens after sync cache.
>
It does not always happen after SYNCHRONIZE CACHE.
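
For reading the sbc_parse_cdb debug lines in these logs: the cdb[0] values
seen so far are standard SBC opcodes, 0x28 = READ(10), 0x2a = WRITE(10) and
0x35 = SYNCHRONIZE CACHE(10). A small helper to decode them could look like
the sketch below. The function names scsi_opcode_name and decode_cdb_log are
made up for illustration, and only these three opcodes are covered.

```shell
#!/bin/sh
# Hypothetical helper: map a few SCSI opcodes (cdb[0]) to SBC command names.
# Only the opcodes that appear in the logs of this thread are listed.
scsi_opcode_name() {
    case "$1" in
        0x28)      echo "READ(10)" ;;
        0x2a|0x2A) echo "WRITE(10)" ;;
        0x35)      echo "SYNCHRONIZE CACHE(10)" ;;
        *)         echo "unknown ($1)" ;;
    esac
}

# Read a kernel log on stdin and print "<opcode> <name>" for every
# "cdb[0]=0x.." occurrence, in the order they appear.
decode_cdb_log() {
    grep -o 'cdb\[0\]=0x[0-9a-fA-F]*' | while IFS='=' read -r key op; do
        printf '%s %s\n' "$op" "$(scsi_opcode_name "$op")"
    done
}
```

On the host, something like "dmesg | decode_cdb_log" would then list the
commands in the order the target saw them, assuming the same debug prints
are enabled.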

Just tried in qemu: mount -> dd xx -> umount -> mount -> rm xx,
with a RAM-based backing file (/tmp/test) and no reboot.

root@(none)$ cd /mnt
root@(none)$ ls
[  301.444966]  sbc_parse_cdb cdb[0]=0x28
[  301.449003]  fd_execute_rw
lost+found  test
root@(none)$ rm test
[  304.281920]  sbc_parse_cdb cdb[0]=0x28
[  304.285955]  fd_execute_rw
[  118.002338] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 3, block bitmap and bg descriptor inconsistent: 21504 vs 21439 free clusters
[  304.290685] gzf sbc_parse_cdb cdb[0]=0x28
[  304.296737] gzf fd_execute_rw
[  304.304099]  sbc_parse_cdb cdb[0]=0x28
[  304.309322]  fd_execute_rw
[  118.015903] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
root@(none)$
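
The guest-side sequence above (mount -> dd -> umount -> mount -> rm) can be
scripted so that it is easy to repeat, with a read-only e2fsck at the end as
Ted suggested. This is only a rough sketch: the function names run and
repro_cycle, the DRY_RUN switch and the default size are assumptions for
illustration, and it expects that mkfs.ext4 has already been run on the
device. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch of the guest-side cycle: mount -> dd -> umount -> mount -> rm,
# followed by a read-only e2fsck. All names here are illustrative.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: repro_cycle <device> <mountpoint> [size in MiB]
repro_cycle() {
    dev="$1"; mnt="$2"; mb="${3:-100}"
    run mount "$dev" "$mnt" || return 1
    run dd if=/dev/zero of="$mnt/test" bs=1M count="$mb" || return 1
    run sync
    run umount "$mnt" || return 1
    run mount "$dev" "$mnt" || return 1
    run rm "$mnt/test"
    run umount "$mnt" || return 1
    # -f forces a check; -n opens the fs read-only and answers "no" to prompts
    run e2fsck -fn "$dev"
}
```

For example, "DRY_RUN=0 repro_cycle /dev/sda /mnt 100" in the guest would run
one cycle for real; without DRY_RUN=0 it just lists the steps.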

Thanks