linux-ext4 - Re: [Bugme-new] [Bug 11266] New: unable to handle kernel paging request in ext2_free

Message-ID: <20080820102533.GA5979@atrey.karlin.mff.cuni.cz>
Date:	Wed, 20 Aug 2008 12:25:33 +0200
From:	Jan Kara <jack@...e.cz>
To:	Sami Liedes <sliedes@...hut.fi>
Cc:	Andreas Dilger <adilger@....com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	bugme-daemon@...zilla.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [Bugme-new] [Bug 11266] New: unable to handle kernel paging request in ext2_free_blocks

> On Tue, Aug 19, 2008 at 11:13:39AM +0200, Jan Kara wrote:
> > > Isn't equivalent checking done in ext2_check_descriptors()?  It would make
> > > sense to abstract out the "check one group and return error" code and use
> > > it in both places.
> >   Actually yes, it is. Good point. Sami, is it the case that you have
> > mounted the filesystem, then intentionally corrupted it and after that
> > the kernel oopsed (as opposed to first corrupting the filesystem image and
> > mounting it after that)? That would explain how corrupted values could get
> > to read_block_bitmap() even though ext2_check_descriptors() checked them.
> 
> No, that's not what I do. I corrupt the fs before mounting it, then
> mount it, perform normal filesystem operations on it and unmount it.
  OK, thanks. Then we must somehow corrupt the group descriptor block during
the operation. Because I'm pretty sure it *is* corrupted - the oops
is: unable to handle kernel paging request at c7e95ffc. If we look into
the registers, we see ECX has c7e96000 (which is probably bh->b_data). In
the second oops it's exactly the same - ECX has c11e4000, the oops is at
address c11e3ffc. So in both cases it is ECX-4. So somehow we managed to
pass a negative offset into ext2_test_bit(). But as Andreas pointed out,
when we load descriptors into memory, we check both bitmaps and the
inode table in ext2_check_descriptors()... The other possibility
would be that we managed to corrupt s_first_data_block in the
superblock. Anyway, neither possibility looks very likely. I'll try
to reproduce the problem and maybe get more insight... How large is your
filesystem BTW?
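
  To make the ECX-4 arithmetic concrete, here is a small userspace sketch.
It is only a model under stated assumptions, not the kernel code: the
helper names (fault_delta, word_byte_offset, offset_in_group) are made up
for illustration, and it assumes the bitmap is addressed in 32-bit words
with a signed bit number, so that any bit number from -32 to -1 lands in
the word 4 bytes before the buffer.

```c
#include <stdint.h>

/* Hypothetical helper: signed distance from the buffer base (the ECX
 * value) to the faulting address, both taken from the oops output. */
int64_t fault_delta(uint32_t fault_addr, uint32_t base)
{
    return (int64_t)fault_addr - (int64_t)base;
}

/* Hypothetical model of a signed bit test on 32-bit words: byte offset,
 * relative to the buffer start, of the word that holds bit `nr`.
 * C division truncates toward zero, so round down by hand to keep
 * negative bit numbers in the word *before* the buffer. */
long word_byte_offset(long nr)
{
    long word = nr / 32;

    if (nr % 32 < 0)
        word -= 1;
    return word * 4;
}

/* Assumed form of the offset computation: position of a metadata block
 * relative to the first block of its group. A descriptor value below
 * group_first_block, or a too-large s_first_data_block feeding into
 * group_first_block, makes the result negative. */
long offset_in_group(long metadata_block, long group_first_block)
{
    return metadata_block - group_first_block;
}
```

  With the addresses from the two oopses, fault_delta(0xc7e95ffc,
0xc7e96000) and fault_delta(0xc11e3ffc, 0xc11e4000) both give -4, and
word_byte_offset() gives -4 for every bit number from -32 to -1. So under
this model a small negative offset would explain both faults.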

> Here's the most current script I use (zzuf is the fuzzer):
> 
> ------------------------------------------------------------
> #!/bin/sh
> 
> if [ "`hostname`" != "fstest" ]; then
>    echo "This is a dangerous script."
>    echo "Set your hostname to \`fstest\' if you want to use it."
>    exit 1
> fi
> 
> umount /dev/hdb
> umount /dev/hdc
> /etc/init.d/sysklogd stop
> /etc/init.d/klogd stop
> /etc/init.d/cron stop
> mount /dev/hda / -t ext3 -o remount,ro || exit 1
> 
> #ulimit -t 20
> 
> for ((s=$1; s<1000000000; s++)); do
>   umount /mnt
>   echo '***** zzuffing *****' seed $s
>   zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
>   mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
>   cd /mnt || continue
>   timeout 30 cp -r doc doc2 >&/dev/null
>   timeout 30 find -xdev >&/dev/null
>   timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
>   timeout 30 mkdir tmp >&/dev/null
>   timeout 30 echo whoah >tmp/filu 2>/dev/null
>   timeout 30 rm -rf /mnt/* >&/dev/null
>   cd /
> done
> ------------------------------------------------------------

								Honza
-- 
Jan Kara <jack@...e.cz>
SuSE CR Labs