linux-kernel - Re: NULL pointer dereference in ext4_ext_remove


Message-ID: <CANP3RGd62=voh5T6NACyFE-NqX=Huk1hkewSakPs67vC+uuTuw@mail.gmail.com>
Date:	Thu, 16 Aug 2012 14:40:53 -0700
From:	Maciej Żenczykowski <maze@...gle.com>
To:	"Theodore Ts'o" <tytso@....edu>,
	Maciej Żenczykowski <maze@...gle.com>,
	Fengguang Wu <fengguang.wu@...el.com>,
	Marti Raudsepp <marti@...fo.org>,
	Kernel hackers <linux-kernel@...r.kernel.org>,
	ext4 hackers <linux-ext4@...r.kernel.org>
Subject: Re: NULL pointer dereference in ext4_ext_remove_space on 3.5.1

> Maciej, you weren't able to reliably repro the crash were you?  I'm
> pretty sure this should fix the crash, but it would be really great to
> confirm things.
>
> I suspect creating a file system with a really small journal may make
> it easier to reproduce, but I haven't had time to try to create a
> reliable repro for this bug yet.

This happened to me twice while moving data off of a ~1TB ext4 partition.
The data portion was on a striped RAID across two ~500GB drives; the
journal was on a relatively large partition (500MB?) on an SSD
(crypto and LVM were also involved).
I've since emptied the partition and even deleted the RAID array.

Both times it happened during rm: the first time during rm -rf of a
directory tree, the second time during rm of a 250GB disk image
generated by dd (from a notebook drive).
Both rm's were manually run by me from a shell command line, and there
was pretty much nothing else happening on the machine at the time.

I'm not aware of there having been anything interesting (like:
holes/punch/sparseness, much r/w activity in the middle of files, etc)
on this filesystem, it was pretty much just a write-once data backup
that I had copied elsewhere and was deleting.  The 250GB disk image
was definitely just a sequentially written disk dump, and I think the
same thing holds true for the contents of the wiped directory tree
(although in many much smaller files).

I know i=1 in both cases (and disassembly pointed out the location
where the above debug patch is BUGing), but I don't think it's
possible to figure out what inode # it crashed on.

Perhaps just untarring a bunch of kernels onto an empty partition,
filling it up, then deleting those kernels should be sufficient to
repro this (untried).

Perhaps something like:
  create a 1TB filesystem
  untar a thousand kernel source trees onto it
  create 20GB files of junk until it is full
  rm -rf everything on it (the root of that filesystem, not the system /)
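
The steps above could be scripted roughly as follows. This is only a sketch in Python, and like the steps themselves it is untried against the actual bug. It assumes the scratch filesystem has already been created and mounted, and that a kernel source tarball is at hand. All function and parameter names (unpack_copies, fill_with_junk, remove_everything, run_repro, max_junk_files) are made up for illustration.

```python
import errno
import os
import shutil
import tarfile


def unpack_copies(tarball, mount_point, copies):
    """Extract the tarball into `copies` separate directories under mount_point."""
    for n in range(copies):
        dest = os.path.join(mount_point, f"tree-{n:04d}")
        os.makedirs(dest)
        with tarfile.open(tarball) as tar:
            tar.extractall(dest)


def fill_with_junk(mount_point, file_size, max_files=None, chunk=1 << 20):
    """Write sequential junk files until ENOSPC (or until max_files is reached).

    Returns the number of files that were written completely.
    """
    block = os.urandom(chunk)
    count = 0
    while max_files is None or count < max_files:
        path = os.path.join(mount_point, f"junk-{count:05d}.bin")
        try:
            # Unbuffered, so a full filesystem shows up at the write() call.
            with open(path, "wb", buffering=0) as f:
                remaining = file_size
                while remaining > 0:
                    written = f.write(block[:min(chunk, remaining)])
                    remaining -= written
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            break  # filesystem is full; the partial file is left in place
        count += 1
    return count


def remove_everything(mount_point):
    """Delete every entry under mount_point (the 'rm -rf' step)."""
    for entry in os.scandir(mount_point):
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.unlink(entry.path)


def run_repro(tarball, mount_point, copies=1000,
              junk_file_size=20 * 2**30, max_junk_files=None):
    """Untar many trees, fill the rest with junk, then delete it all.

    max_junk_files is an assumed safety cap; None means fill until ENOSPC.
    """
    unpack_copies(tarball, mount_point, copies)
    written = fill_with_junk(mount_point, junk_file_size, max_junk_files)
    remove_everything(mount_point)
    return written
```

A call might look like run_repro("linux-3.5.1.tar.bz2", "/mnt/scratch"), where both the tarball name and the mount point are placeholders. Point it only at a filesystem whose contents can be destroyed.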

- Maciej