Message-ID: <20120817060110.GA28786@localhost>
Date:	Fri, 17 Aug 2012 14:01:10 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Theodore Ts'o <tytso@....edu>, Marti Raudsepp <marti@...fo.org>,
	Kernel hackers <linux-kernel@...r.kernel.org>,
	ext4 hackers <linux-ext4@...r.kernel.org>, maze@...gle.com
Subject: Re: NULL pointer dereference in ext4_ext_remove_space on 3.5.1

On Thu, Aug 16, 2012 at 11:25:13AM -0400, Theodore Ts'o wrote:
> On Thu, Aug 16, 2012 at 07:10:51PM +0800, Fengguang Wu wrote:
> > 
> > Here is the dmesg. BTW, it seems 3.5.0 doesn't have this issue.
> 
> Fengguang,
> 
> It sounds like you have a (at least fairly) reliable reproduction for
> this problem?  Is it something you can share?  It would be good to get

Right, it can be easily reproduced here. I'm running these writeback
performance tests:

        https://github.com/fengguang/writeback-tests

They basically do N parallel dd writes to JBOD/RAID arrays on various
filesystems. The RAID test seems to trigger the problem reliably.
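
For readers without the test scripts at hand, the "N parallel dd writes"
workload can be roughly sketched as below. This is only an illustration, not
code from the writeback-tests repository: the function names, the file names,
and the 1 MiB block size are assumptions, and plain Python writes stand in for
`dd if=/dev/zero`. Point `target_dir` at a mount of the filesystem under test.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# 1 MiB of zeros, standing in for `dd if=/dev/zero bs=1M` (assumed block size).
BLOCK = b"\0" * (1 << 20)


def write_file(path, blocks):
    """Sequentially write `blocks` blocks of zeros to `path`; return bytes written."""
    written = 0
    with open(path, "wb") as f:
        for _ in range(blocks):
            written += f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device before returning
    return written


def parallel_writes(target_dir, n_writers, blocks_per_writer):
    """Run `n_writers` concurrent sequential writers, one file each (hypothetical names)."""
    paths = [os.path.join(target_dir, f"zero-{i}") for i in range(n_writers)]
    with ThreadPoolExecutor(max_workers=n_writers) as pool:
        return list(pool.map(write_file, paths, [blocks_per_writer] * n_writers))


if __name__ == "__main__":
    # Small demo in a temporary directory; a real run would use a directory
    # on the JBOD/RAID-backed filesystem and far more data.
    with tempfile.TemporaryDirectory() as d:
        print(parallel_writes(d, n_writers=4, blocks_per_writer=2))
```

A sketch like this only generates the write load. Whether it hits the truncate
path discussed in this thread depends on the filesystem, the storage setup,
and the kernel version.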

> this into our test suites, since it was _not_ something that was
> caught by xfstests, apparently.
> 
> Can you see if this patch addresses it?  (The first two patch hunks
> are the same debugging additions I had posted before.)
> 
> It looks like the responsible commit is 968dee7722: "ext4: fix hole
> punch failure when depth is greater than 0".  I had thought this patch
> was low risk if you weren't using the new punch ioctl, but it turns
> out it did make a critical change in the non-punch (i.e., truncate)
> code path, which is what the addition of "i = 0;" in the patch below
> addresses.

Yes, I'm sure the patch fixes the bug. With it applied, the writeback
tests have run flawlessly for a dozen hours.

Thanks,
Fengguang
