Date:	Mon, 17 Aug 2015 10:50:56 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Dave Chinner <david@...morbit.com>
Cc:	Ext4 Developers List <linux-ext4@...r.kernel.org>,
	Eryu Guan <eguan@...hat.com>
Subject: Re: [PATCH] ext4: ratelimit the file system mounted message

On Mon, Aug 17, 2015 at 11:12:15AM +1000, Dave Chinner wrote:
> On Sat, Aug 15, 2015 at 02:59:57PM -0400, Theodore Ts'o wrote:
> > The xfstests ext4/305 will mount and unmount the same file system over
> > 4,000 times, and each one of these will cause a system log message.
> > Ratelimit this message since if we are getting more than a few dozen
> > of these messages, they probably aren't going to be helpful.
> 
> Perhaps you should look at fixing the test or making it a more
> targeted reproducer. Tests that do "loop doing something basic
> while looping doing something else basic for X minutes to try to
> trip a race condition" aren't very good regression tests....

The problem we are specifically testing for is a race where one
process is reading from a proc fs file while the file system is being
unmounted:

commit f7922730727844c6dee837bd1a64221342fef1d1
Author: Eryu Guan <eguan@...hat.com>
Date:   Mon Apr 1 10:57:43 2013 +0000

    xfstests ext4 305: test read /proc/fs/ext4/<dev>/mb_groups while the fs is being unmounted
    
    Regression test for commit:
    9559996 ext4: remove mb_groups before tearing down the buddy_cache
    
    Signed-off-by: Eryu Guan <eguan@...hat.com>
    Reviewed-by: Rich Johnston <rjohnston@....com>
    [rjohnston@....com renumbered test to next in group sequence]
    Signed-off-by: Rich Johnston <rjohnston@....com>

I don't see a better way of doing the test off the top of my head,
though.... and to be honest I'm not sure how much value the test
really has, since it's the sort of thing that can easily be checked by
inspection, and it seems rather unlikely we would regress here.

BTW, out of curiosity I reverted 9559996 and tried running ext4/305
many times, on a variety of different VMs ranging from 1 to 8 CPUs,
and using both an SSD and a laptop HDD.

In all cases, ext4/305 reliably reproduced the failure within 30
mount/unmount cycles, and in most cases, under a dozen cycles.  (i.e.,
under two seconds, and usually in a fraction of a second).  So I'm not
entirely sure why the test was written to run the loop for 3 minutes
and thousands of mount/unmount cycles.

Eryu, you wrote the test; any thoughts?  At the very least I'd suggest
cutting the test down so that it runs for at most 2 seconds, if for no
other reason than to speed up regression test runs.
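
As a rough sketch of what a bounded loop could look like (written in
Python for illustration, not the shell the test actually uses), the
idea is to stop as soon as the failure shows up, or when either a time
limit or a cycle limit is reached.  The names run_bounded and cycle
are hypothetical, and the default limits are just the 2 seconds and 30
cycles mentioned above, not values taken from the test itself:

```python
import time


def run_bounded(cycle, max_seconds=2.0, max_cycles=30, clock=time.monotonic):
    """Run cycle() until it reports a failure or a limit is reached.

    cycle is a hypothetical callable that performs one mount/unmount
    iteration (with the concurrent reader running) and returns True if
    the failure was observed.  Returns (reproduced, cycles_run).
    """
    deadline = clock() + max_seconds
    cycles = 0
    # Stop on whichever limit is hit first: cycle count or wall-clock time.
    while cycles < max_cycles and clock() < deadline:
        cycles += 1
        if cycle():
            return True, cycles
    return False, cycles
```

With limits like these, a run that never hits the race costs at most
max_seconds, rather than the full three minutes.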

						- Ted
