Message-ID: <20150817145056.GC27202@thunk.org>
Date: Mon, 17 Aug 2015 10:50:56 -0400
From: Theodore Ts'o <tytso@....edu>
To: Dave Chinner <david@...morbit.com>
Cc: Ext4 Developers List <linux-ext4@...r.kernel.org>,
Eryu Guan <eguan@...hat.com>
Subject: Re: [PATCH] ext4: ratelimit the file system mounted message
On Mon, Aug 17, 2015 at 11:12:15AM +1000, Dave Chinner wrote:
> On Sat, Aug 15, 2015 at 02:59:57PM -0400, Theodore Ts'o wrote:
> > The xfstests ext4/305 will mount and unmount the same file system over
> > 4,000 times, and each one of these will cause a system log message.
> > Ratelimit this message since if we are getting more than a few dozen
> > of these messages, they probably aren't going to be helpful.
>
> Perhaps you should look at fixing the test or making it a more
> targetted reproducer. Tests that do "loop doing something basic
> while looping doing something else basic for X minutes to try to
> trip a race condition" aren't very good regression tests....
The problem we are specifically testing for is a race where one
process is reading from a proc fs file while the file system is being
unmounted:
commit f7922730727844c6dee837bd1a64221342fef1d1
Author: Eryu Guan <eguan@...hat.com>
Date:   Mon Apr 1 10:57:43 2013 +0000

    xfstests ext4 305: test read /proc/fs/ext4/<dev>/mb_groups while the fs is being unmounted

    Regression test for commit:
    9559996 ext4: remove mb_groups before tearing down the buddy_cache

    Signed-off-by: Eryu Guan <eguan@...hat.com>
    Reviewed-by: Rich Johnston <rjohnston@....com>
    [rjohnston@....com renumbered test to next in group sequence]
    Signed-off-by: Rich Johnston <rjohnston@....com>

I don't see a better way of doing the test off the top of my head,
though.... and to be honest I'm not sure how much value the test
really has, since it's the sort of thing that can easily be checked by
inspection, and it seems rather unlikely we would regress here.
BTW, out of curiosity I reverted 9559996 and tried running ext4/305
many times, on a variety of different VMs ranging from 1 to 8 CPUs,
and using both an SSD and a laptop HDD.
In all cases, ext4/305 reliably reproduced the failure within 30
mount/unmount cycles, and in most cases, under a dozen cycles. (i.e.,
under two seconds, and usually in a fraction of a second). So I'm not
entirely sure why the test was written to run the loop for 3 minutes
and thousands of mount/unmount cycles.
Eryu, you wrote the test; any thoughts? At the very least I'd suggest
cutting the test down so that it runs for at most 2 seconds, if for no
other reason than to speed up regression test runs.
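To make the suggestion concrete, here is a rough sketch of what a time-bounded version of the loop might look like. It is written in Python purely for illustration; it is not the actual test. The names run_bounded, mount_cycle and read_proc are hypothetical: mount_cycle stands in for one mount/unmount of the scratch file system, and read_proc for one read of /proc/fs/ext4/<dev>/mb_groups.

```python
import threading
import time


def run_bounded(mount_cycle, read_proc, max_seconds=2.0, max_cycles=None):
    """Call mount_cycle() in a loop while a second thread calls read_proc().

    Both callables are hypothetical stand-ins supplied by the caller:
    mount_cycle would mount and unmount the scratch file system once, and
    read_proc would read the mb_groups proc file once (ignoring the case
    where the file has already gone away).

    The loop stops once max_seconds have elapsed or, if max_cycles is
    given, after that many cycles, whichever comes first. Returns the
    number of completed mount/unmount cycles.
    """
    stop = threading.Event()

    def reader():
        # Keep reading until the mount/unmount loop has finished.
        while not stop.is_set():
            read_proc()

    thread = threading.Thread(target=reader)
    thread.start()

    deadline = time.monotonic() + max_seconds
    cycles = 0
    try:
        while time.monotonic() < deadline:
            if max_cycles is not None and cycles >= max_cycles:
                break
            mount_cycle()
            cycles += 1
    finally:
        # Always stop and reap the reader, even if mount_cycle() raised.
        stop.set()
        thread.join()
    return cycles
```

With max_seconds=2.0 and, say, max_cycles=30, this would cover the range in which the failure showed up in my runs above, assuming other configurations behave similarly.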
- Ted