Message-ID: <20091008053335.GA19458@localhost>
Date: Thu, 8 Oct 2009 13:33:35 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Peter Staubach <staubach@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Theodore Tso <tytso@....edu>,
Christoph Hellwig <hch@...radead.org>,
Dave Chinner <david@...morbit.com>,
Chris Mason <chris.mason@...cle.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"Li, Shaohua" <shaohua.li@...el.com>,
Myklebust Trond <Trond.Myklebust@...app.com>,
"jens.axboe@...cle.com" <jens.axboe@...cle.com>,
Jan Kara <jack@...e.cz>, Nick Piggin <npiggin@...e.de>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/45] some writeback experiments
On Wed, Oct 07, 2009 at 11:18:22PM +0800, Wu Fengguang wrote:
> On Wed, Oct 07, 2009 at 09:47:14PM +0800, Peter Staubach wrote:
> >
> > > # vmmon -d 1 nr_writeback nr_dirty nr_unstable # (per 1-second samples)
> > > nr_writeback nr_dirty nr_unstable
> > > 11227 41463 38044
> > > 11227 41463 38044
> > > 11227 41463 38044
> > > 11227 41463 38044
>
> I guess in the above 4 seconds, either client or (more likely) server
> is blocked. A blocked server cannot send ACKs to knock down both
Yeah, the server side is blocked. The nfsd threads are mostly blocked in
generic_file_aio_write(), in particular on the i_mutex lock! I'm copying
one or two big files over NFS, so i_mutex is heavily contended.
I'm using the default wsize=4096 for NFS-root.
wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
329 329 TS - -5 24 1 0.0 S< worker_thread nfsiod
4690 4690 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4691 4691 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4692 4692 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4693 4693 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4694 4694 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4695 4695 TS - -5 24 1 0.1 D< generic_file_aio_write nfsd
4696 4696 TS - -5 24 1 0.0 D< log_wait_commit nfsd
4697 4697 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
329 329 TS - -5 24 1 0.0 S< worker_thread nfsiod
4690 4690 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4691 4691 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4692 4692 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4693 4693 TS - -5 24 0 0.0 D< sync_buffer nfsd
4694 4694 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4695 4695 TS - -5 24 1 0.1 D< generic_file_aio_write nfsd
4696 4696 TS - -5 24 1 0.0 D< generic_file_aio_write nfsd
4697 4697 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
329 329 TS - -5 24 1 0.0 S< worker_thread nfsiod
4690 4690 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4691 4691 TS - -5 24 0 0.1 D< get_request_wait nfsd
4692 4692 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4693 4693 TS - -5 24 0 0.1 S< svc_recv nfsd
4694 4694 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4695 4695 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4696 4696 TS - -5 24 0 0.1 S< svc_recv nfsd
4697 4697 TS - -5 24 1 0.1 D< generic_file_aio_write nfsd
wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
329 329 TS - -5 24 1 0.0 S< worker_thread nfsiod
4690 4690 TS - -5 24 1 0.1 D< get_write_access nfsd
4691 4691 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4692 4692 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4693 4693 TS - -5 24 1 0.1 D< generic_file_aio_write nfsd
4694 4694 TS - -5 24 1 0.1 D< get_write_access nfsd
4695 4695 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4696 4696 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4697 4697 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
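To make snapshots like these easier to compare, the wait channels can be tallied per snapshot. The following is only a rough sketch, not part of the original measurements: the helper name tally_wchan is made up for illustration, and it assumes the same 11-column ps format as above (pid, tid, class, rtprio, ni, pri, psr, pcpu, stat, wchan, comm). The sample text is the first snapshot.

```python
from collections import Counter


def tally_wchan(ps_output, comm="nfsd"):
    """Count wait channels of threads in uninterruptible sleep (stat 'D').

    Hypothetical helper; assumes the column order
    pid tid class rtprio ni pri psr pcpu stat wchan comm.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank lines, a possible header, and other commands (e.g. nfsiod).
        if len(fields) != 11 or fields[-1] != comm:
            continue
        stat, wchan = fields[8], fields[9]
        if stat.startswith("D"):
            counts[wchan] += 1
    return counts


# First snapshot from above.
sample = """\
329 329 TS - -5 24 1 0.0 S< worker_thread nfsiod
4690 4690 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4691 4691 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4692 4692 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4693 4693 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
4694 4694 TS - -5 24 0 0.1 D< generic_file_aio_write nfsd
4695 4695 TS - -5 24 1 0.1 D< generic_file_aio_write nfsd
4696 4696 TS - -5 24 1 0.0 D< log_wait_commit nfsd
4697 4697 TS - -5 24 0 0.0 D< generic_file_aio_write nfsd
"""

for wchan, n in tally_wchan(sample).most_common():
    print(n, wchan)
```

For the first snapshot this gives 7 threads in generic_file_aio_write and 1 in log_wait_commit. Note that wchan only names the function a thread sleeps in; that the sleep is on i_mutex is my reading, not something ps reports.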
Thanks,
Fengguang
> nr_writeback/nr_unstable. And the stuck nr_writeback will freeze
> nr_dirty as well, because the dirtying process is throttled until
> it receives enough "PG_writeback cleared" events. However, the bdi-flush
> thread is also blocked when trying to clear more PG_writeback, because
> the client side nr_writeback limit has been reached. In summary,
>
> server blocked => nr_writeback stuck => nr_writeback limit reached
> => bdi-flush blocked => no end_page_writeback() => dirtier blocked
> => nr_dirty stuck
>
> Thanks,
> Fengguang
>
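The "stuck" state in the chain above shows up as consecutive identical samples of the three counters. A minimal sketch of how one might detect that from /proc/vmstat-style text follows. The function names parse_vmstat and stuck_run are invented for illustration and are not how vmmon works; nr_unstable may also be missing on other kernel versions, in which case None is returned for it rather than a guess.

```python
def parse_vmstat(text, names=("nr_writeback", "nr_dirty", "nr_unstable")):
    """Extract the named counters from 'name value' lines (as in /proc/vmstat)."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in names:
            values[parts[0]] = int(parts[1])
    # Counters that are not present are reported as None.
    return tuple(values.get(n) for n in names)


def stuck_run(samples):
    """Return the length of the trailing run of identical samples."""
    if not samples:
        return 0
    run = 1
    for prev in reversed(samples[:-1]):
        if prev != samples[-1]:
            break
        run += 1
    return run


# (nr_writeback, nr_dirty, nr_unstable) per second, from the quoted output.
samples = [
    (11045, 53987, 6490),
    (11033, 53120, 8145),
    (11195, 52143, 10886),
    (11211, 52144, 10913),
    (11211, 52144, 10913),
    (11211, 52144, 10913),
]
print(stuck_run(samples))  # 3: the last three samples are identical
```

On a live system the samples would come from something like parse_vmstat(open("/proc/vmstat").read()) called once per second. A long run only says the counters are not moving; it does not say which side is blocked.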
> > > 11045 53987 6490
> > > 11033 53120 8145
> > > 11195 52143 10886
> > > 11211 52144 10913
> > > 11211 52144 10913
> > > 11211 52144 10913
> > >
> > > btrfs seems to maintain a private pool of writeback pages, which can go out of
> > > control:
> > >
> > > nr_writeback nr_dirty
> > > 261075 132
> > > 252891 195
> > > 244795 187
> > > 236851 187
> > > 228830 187
> > > 221040 218
> > > 212674 237
> > > 204981 237
> > >
> > > XFS has very interesting "bumpy writeback" behavior: it tends to wait to
> > > collect enough pages and then write the whole world.
> > >
> > > nr_writeback nr_dirty
> > > 80781 0
> > > 37117 37703
> > > 37117 43933
> > > 81044 6
> > > 81050 0
> > > 43943 10199
> > > 43930 36355
> > > 43930 36355
> > > 80293 0
> > > 80285 0
> > > 80285 0
> > >
> > > Thanks,
> > > Fengguang
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to majordomo@...r.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at http://www.tux.org/lkml/