linux-kernel - Re: Error testing ext3 on brd ramdisk


Message-ID: <49B0D514.2020804@nokia.com>
Date:	Fri, 06 Mar 2009 09:47:32 +0200
From:	Adrian Hunter <adrian.hunter@...ia.com>
To:	Nick Piggin <npiggin@...e.de>
CC:	"Jorge Boncompte [DTI2]" <jorge@...2.net>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: Error testing ext3 on brd ramdisk

Nick Piggin wrote:
> On Mon, Mar 02, 2009 at 06:42:18PM +0100, Jorge Boncompte [DTI2] wrote:
>> Nick Piggin wrote:
>>> On Fri, Feb 27, 2009 at 07:08:46PM +0100, Jorge Boncompte [DTI2] wrote:
>>>> 	Hi,
>>>>
>>>> 	I have added Nick Piggin to the CC: as maintainer of the brd driver.
>>>>
>>>> 	After switching an embedded distribution that keeps /etc on a
>>>> ramdisk-based minix filesystem from 2.6.23.17 to 2.6.29-rcX, I am
>>>> also getting errors and the filesystem is corrupted. It does not
>>>> always happen. The visible effect with text files after reboot is
>>>> getting the old version of the file with "\0"s at the end.
>>>>
>>>> 	Did you find a solution?
>>> What architectures are you using? It's possible that brd is missing
>>> a cacheflush. I test it pretty heavily on x86 and no problems, so
>>> this might point to an arch specific problem.
>>>
>>> ---
>>> drivers/block/brd.c |    4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-2.6/drivers/block/brd.c
>>> ===================================================================
>>> --- linux-2.6.orig/drivers/block/brd.c
>>> +++ linux-2.6/drivers/block/brd.c
>>> @@ -275,8 +275,10 @@ static int brd_do_bvec(struct brd_device
>>> 	if (rw == READ) {
>>> 		copy_from_brd(mem + off, brd, sector, len);
>>> 		flush_dcache_page(page);
>>> -	} else
>>> +	} else {
>>> +		flush_dcache_page(page);
>>> 		copy_to_brd(brd, mem + off, sector, len);
>>> +	}
>>> 	kunmap_atomic(mem, KM_USER0);
>>>
>>> out:
>> 	Hi, I am on 32-bit x86 with 2 x Xeon HT CPUs, but I have seen the
>> same corruption on a KVM/QEMU guest with a single emulated CPU.
>>
>> 	With your patch on top of vanilla 2.6.29-rc3 plus some networking
>> patches I still get corruption sometimes.
>>
>> 	The script that saves the configuration does...
>>
>> ------------
>> mount -no remount,ro /dev/ram0
>> dd if=/dev/ram0 of=config.bin bs=1k count=1000
>> mount -no remount,rw /dev/ram0
>> md5sum config.bin
>> dd if=config.bin of=/dev/hda1
>> echo $md5sum | dd of=/dev/hda1 bs=1k seek=1100 count=32
>> ------------
>>
>> on system boot
>>
>> ------------
>> CHECK MD5SUM
>> dd if=/dev/hda1 of=/dev/ram0 bs=1k count=1000
>> fsck.minix -a /dev/ram0
>> mount -nt minix /dev/ram0 /etc -o rw
>> ------------
>>
>> 	I have never seen an MD5 failure on boot, just sometimes the
>> filesystem is corrupted. Kernel config attached.
> 
> Hi Jorge,
> 
> Well I found and fixed something :) (see other mail) but I don't know
> whether that applies to you here if you're running with a single CPU
> and no preemption. But still, it might be worth trying that patch? I'm
> sorry I'm still unable to reproduce a problem with your script
> (although you don't describe how you create the filesystem before
> you remount it).
> 
> From your description, it suggests that the corrupted image is being
> read from /dev/ram0 (because the md5sum passes).
> 
> In your script, can you run fsck.minix on config.bin when you first
> create it? What if you unmount /dev/ram0 before copying the image?
> 
> Thanks,
> Nick
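
Nick's first suggestion (run fsck.minix on config.bin as soon as it is created) could be folded into the save step roughly as below. This is only a sketch, not the script anyone in this thread actually ran: the `run` and `save_config` helpers and the `DRY_RUN` switch are made up for illustration, while the device, image name and dd sizes are copied from Jorge's script. With `DRY_RUN=1` (the default here) it only prints the commands.

```shell
#!/bin/sh
# Hypothetical sketch: check the saved image right after creating it.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

save_config() {
    dev=$1
    img=$2
    # Remount read-only so the filesystem is not written while it is copied.
    run mount -no remount,ro "$dev" || return 1
    run dd if="$dev" of="$img" bs=1k count=1000 || return 1
    run mount -no remount,rw "$dev" || return 1
    # Check the copy immediately; without -a or -r nothing gets repaired,
    # so a corrupted image shows up here rather than at the next boot.
    run fsck.minix -f "$img"
}
```

Calling `save_config /dev/ram0 config.bin` prints the four commands in order. For the second suggestion, the two remount lines would be replaced by an unmount before the dd and a fresh mount after it.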

Thanks for looking at this.

I applied both patches and still got:

-------------------------------------------------------------
Cycle 616
Thu Mar  5 22:13:16 EET 2009
Mounting
kjournald starting.  Commit interval 5 seconds
EXT3 FS on ram0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Removing old fsstress data
Starting fsstress
Sleeping 30 seconds
seed = 1237038794
Stopping fsstress
18670 ttyS0    00:00:00 fsstress
18672 ttyS0    00:00:15 fsstress
18673 ttyS0    00:00:15 fsstress
18674 ttyS0    00:00:15 fsstress
./brd-test.sh: line 30: 18670 Terminated              ./fsstress/fsstress -d /mnt/test_file_system/work -p 3 -l 0 -n 100000000
Unmounting
Checking
/dev/ram0: HTREE directory inode 46 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram0: Entry 'f6c' in /work/p1/d0 (46) has deleted/unused inode 261.  CLEARED.
/dev/ram0: Entry 'f276' in /work/p1/d0 (46) has deleted/unused inode 454.  CLEARED.
/dev/ram0: Entry 'f152' in /work/p1/d0 (46) has deleted/unused inode 543.  CLEARED.
/dev/ram0: Entry 'cc1' in /work/p1/d0 (46) has an incorrect filetype (was 3, should be 2).


/dev/ram0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)



My test box is a dual-core Pentium D.
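
For anyone trying to reproduce this, one cycle of the test can be sketched from the log above. This is a reconstruction, not the real brd-test.sh: the step names, the mount point and the fsstress command line come from the log, but the `run` and `run_cycle` helpers, the `DRY_RUN` switch, the use of killall, and the exact e2fsck options are assumptions. With `DRY_RUN=1` (the default here) nothing is mounted or killed; the commands are only printed.

```shell
#!/bin/sh
# Hypothetical reconstruction of one test cycle, inferred from the log.
DRY_RUN=${DRY_RUN:-1}
DEV=/dev/ram0
MNT=/mnt/test_file_system

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run_cycle() {
    echo "Cycle $1"
    echo "Mounting"
    run mount -t ext3 "$DEV" "$MNT" || return 1
    echo "Removing old fsstress data"
    run rm -rf "$MNT/work"
    run mkdir -p "$MNT/work"
    echo "Starting fsstress"
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ ./fsstress/fsstress -d $MNT/work -p 3 -l 0 -n 100000000 &"
    else
        ./fsstress/fsstress -d "$MNT/work" -p 3 -l 0 -n 100000000 &
    fi
    echo "Sleeping 30 seconds"
    run sleep 30
    echo "Stopping fsstress"
    # Assumption: the log does not show how the worker processes are stopped.
    run killall fsstress
    [ "$DRY_RUN" = 1 ] || wait
    echo "Unmounting"
    run umount "$MNT" || return 1
    echo "Checking"
    # Assumption: preen mode, guessed from the "/dev/ram0:" prefixed output.
    run e2fsck -f -p "$DEV"
}
```

A driver loop would call `run_cycle` with an increasing counter and stop at the first non-zero e2fsck status; here the failure showed up at cycle 616.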
