Message-ID: <alpine.LSU.2.00.1107130350570.4184@sister.anvils>
Date: Wed, 13 Jul 2011 03:57:40 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Wu Fengguang <fengguang.wu@...el.com>
cc: Andrew Morton <akpm@...ux-foundation.org>, Jan Kara <jack@...e.cz>,
    Mel Gorman <mel@....ul.ie>, Dave Chinner <david@...morbit.com>,
    Christoph Hellwig <hch@...radead.org>,
    Christoph Lameter <cl@...ux.com>,
    Pekka Enberg <penberg@...nel.org>,
    linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 12/15] writeback: remove writeback_control.more_io
On Tue, 12 Jul 2011, Hugh Dickins wrote:
> On Tue, 12 Jul 2011, Hugh Dickins wrote:
> > On Mon, 11 Jul 2011, Wu Fengguang wrote:
> > >
> > > It's relatively easy to confirm, by reusing the below trace event to
> > > show the inode (together with its state) being requeued.
> > >
> > > If this is the root cause, it may equally be fixed by
> > >
> > > - requeue_io(inode, wb);
> > > + redirty_tail(inode, wb);
> > >
> > > which would be useful in case the bug is so deadly that it's no longer
> > > possible to do tracing.
> >
> > I checked again this morning that I could reproduce it on two machines,
> > one went in a few minutes, the other within the hour. Then I made that
> > patch changing the requeue_io to redirty_tail, and left home with them
> > running the test with the new kernel: we'll see at the end of the day
> > how they fared.
>
> I think that fixes it. The x86_64 is still running with that, but the
> ppc64 gave up fairly early, hitting freeze in __slab_free() instead.
>
> I've now, I believe, reconstituted what ChristophL intended from the
> mm_types.h struct page patch he posted (which applied neither to mmotm,
> nor to Pekka's for-next, so far as I could tell: maybe cl did some
> intermediate tidying of some of the random indentation). So now
> testing that with redirty_tail on ppc64: will report in 9 hours.

Same result as before. The x86_64 is still going fine, but the ppc64
again seized up in __slab_free() after two and a half hours of load.

I think we should assume that your -requeue_io +redirty_tail is a good
fix for the writeback freeze (provided you can reassure us that it does
not risk postponing some writes indefinitely), and I'll move over to the
other thread to pursue the struct page __slab_free() freeze.

Thanks!
Hugh