Date:	Thu, 8 Oct 2009 13:33:35 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Peter Staubach <staubach@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Theodore Tso <tytso@....edu>,
	Christoph Hellwig <hch@...radead.org>,
	Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	Myklebust Trond <Trond.Myklebust@...app.com>,
	"jens.axboe@...cle.com" <jens.axboe@...cle.com>,
	Jan Kara <jack@...e.cz>, Nick Piggin <npiggin@...e.de>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/45] some writeback experiments

On Wed, Oct 07, 2009 at 11:18:22PM +0800, Wu Fengguang wrote:
> On Wed, Oct 07, 2009 at 09:47:14PM +0800, Peter Staubach wrote:
> > 
> > > # vmmon -d 1 nr_writeback nr_dirty nr_unstable      # (per 1-second samples)
> > >      nr_writeback         nr_dirty      nr_unstable
> > >             11227            41463            38044
> > >             11227            41463            38044
> > >             11227            41463            38044
> > >             11227            41463            38044
> 
> I guess in the above 4 seconds, either the client or (more likely) the
> server is blocked. A blocked server cannot send ACKs to knock down both

Yeah, the server side is blocked.  The nfsd threads are mostly blocked in
generic_file_aio_write(), in particular on the i_mutex lock! I'm copying
one or two big files over NFS, so the i_mutex lock is heavily contended.

I'm using the default wsize=4096 for NFS-root.

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS       -  -5  24   1  0.0 S<   worker_thread            nfsiod
 4690  4690 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4691  4691 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
 4692  4692 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
 4693  4693 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
 4694  4694 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4695  4695 TS       -  -5  24   1  0.1 D<   generic_file_aio_write   nfsd
 4696  4696 TS       -  -5  24   1  0.0 D<   log_wait_commit          nfsd
 4697  4697 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS       -  -5  24   1  0.0 S<   worker_thread            nfsiod
 4690  4690 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4691  4691 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
 4692  4692 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
 4693  4693 TS       -  -5  24   0  0.0 D<   sync_buffer              nfsd
 4694  4694 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4695  4695 TS       -  -5  24   1  0.1 D<   generic_file_aio_write   nfsd
 4696  4696 TS       -  -5  24   1  0.0 D<   generic_file_aio_write   nfsd
 4697  4697 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS       -  -5  24   1  0.0 S<   worker_thread            nfsiod
 4690  4690 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4691  4691 TS       -  -5  24   0  0.1 D<   get_request_wait         nfsd
 4692  4692 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4693  4693 TS       -  -5  24   0  0.1 S<   svc_recv                 nfsd
 4694  4694 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4695  4695 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4696  4696 TS       -  -5  24   0  0.1 S<   svc_recv                 nfsd
 4697  4697 TS       -  -5  24   1  0.1 D<   generic_file_aio_write   nfsd

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS       -  -5  24   1  0.0 S<   worker_thread            nfsiod
 4690  4690 TS       -  -5  24   1  0.1 D<   get_write_access         nfsd
 4691  4691 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4692  4692 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4693  4693 TS       -  -5  24   1  0.1 D<   generic_file_aio_write   nfsd
 4694  4694 TS       -  -5  24   1  0.1 D<   get_write_access         nfsd
 4695  4695 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4696  4696 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
 4697  4697 TS       -  -5  24   0  0.1 D<   generic_file_aio_write   nfsd
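
To summarize snapshots like the ones above without eyeballing them, the
wait channels can be tallied per snapshot. This is only a minimal sketch:
the helper name tally_wchan() is made up for illustration, and it assumes
exactly the 11-column layout produced by the ps -o list used above, with
wchan as the second-to-last field and comm as the last.

```python
from collections import Counter


def tally_wchan(ps_output, comm="nfsd"):
    """Count wait channels of threads whose command name equals `comm`.

    Assumes the layout of
        ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm
    i.e. 11 whitespace-separated fields per thread line.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip shell prompts, blank lines and other commands (e.g. nfsiod).
        if len(fields) != 11 or fields[-1] != comm:
            continue
        counts[fields[-2]] += 1
    return counts


# Three nfsd lines and the nfsiod line copied from the first snapshot.
SAMPLE = """\
  329   329 TS       -  -5  24   1  0.0 S<   worker_thread            nfsiod
 4695  4695 TS       -  -5  24   1  0.1 D<   generic_file_aio_write   nfsd
 4696  4696 TS       -  -5  24   1  0.0 D<   log_wait_commit          nfsd
 4697  4697 TS       -  -5  24   0  0.0 D<   generic_file_aio_write   nfsd
"""

if __name__ == "__main__":
    print(tally_wchan(SAMPLE).most_common())
    # [('generic_file_aio_write', 2), ('log_wait_commit', 1)]
```

Run over the four full snapshots, this makes the dominance of
generic_file_aio_write among the nfsd threads easy to see at a glance.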

Thanks,
Fengguang

> nr_writeback/nr_unstable. And the stuck nr_writeback will freeze
> nr_dirty as well, because the dirtying process is throttled until
> it receives enough "PG_writeback cleared" events. However, the bdi-flush
> thread is also blocked when trying to clear more PG_writeback, because
> the client-side nr_writeback limit has been reached. In summary,
> 
> server blocked => nr_writeback stuck => nr_writeback limit reached
> => bdi-flush blocked => no end_page_writeback() => dirtier blocked
> => nr_dirty stuck
> 
> Thanks,
> Fengguang
> 
> > >             11045            53987             6490
> > >             11033            53120             8145
> > >             11195            52143            10886
> > >             11211            52144            10913
> > >             11211            52144            10913
> > >             11211            52144            10913
> > > 
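The frozen rows in the NFS samples above can also be flagged mechanically.
The following is a rough sketch rather than how vmmon works: it assumes the
three counters are read from /proc/vmstat-style "name value" lines, and the
helper names parse_vmstat() and frozen_runs() are hypothetical.

```python
def parse_vmstat(text, names=("nr_writeback", "nr_dirty", "nr_unstable")):
    """Pick the named counters out of /proc/vmstat-style 'name value' lines.

    Returns a tuple in the order of `names`; a missing counter becomes None.
    """
    wanted = set(names)
    found = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in wanted:
            found[parts[0]] = int(parts[1])
    return tuple(found.get(name) for name in names)


def frozen_runs(samples, min_len=2):
    """Return (start_index, length) for each run of identical samples."""
    runs = []
    start = 0
    for i in range(1, len(samples) + 1):
        # A run ends at the end of the list or when the sample changes.
        if i == len(samples) or samples[i] != samples[start]:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = i
    return runs


# The six (nr_writeback, nr_dirty, nr_unstable) rows quoted just above.
NFS_SAMPLES = [
    (11045, 53987, 6490),
    (11033, 53120, 8145),
    (11195, 52143, 10886),
    (11211, 52144, 10913),
    (11211, 52144, 10913),
    (11211, 52144, 10913),
]

if __name__ == "__main__":
    print(frozen_runs(NFS_SAMPLES))  # [(3, 3)]
```

With 1-second samples, a run of length 3 starting at index 3 means all three
counters stayed unchanged for at least two consecutive intervals, which is
the "stuck" pattern discussed in this thread.
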
> > > btrfs seems to maintain a private pool of writeback pages, which can go out of
> > > control:
> > > 
> > >      nr_writeback         nr_dirty
> > >            261075              132
> > >            252891              195
> > >            244795              187
> > >            236851              187
> > >            228830              187
> > >            221040              218
> > >            212674              237
> > >            204981              237
> > > 
> > > XFS has very interesting "bumpy writeback" behavior: it tends to wait
> > > until it has collected enough pages and then write the whole world.
> > > 
> > >      nr_writeback         nr_dirty
> > >             80781                0
> > >             37117            37703
> > >             37117            43933
> > >             81044                6
> > >             81050                0
> > >             43943            10199
> > >             43930            36355
> > >             43930            36355
> > >             80293                0
> > >             80285                0
> > >             80285                0
> > > 
> > > Thanks,
> > > Fengguang
> > > 