

Message-Id: <1255458499.8967.711.camel@laptop>
Date:	Tue, 13 Oct 2009 20:28:19 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Jan Kara <jack@...e.cz>
Cc:	Wu Fengguang <fengguang.wu@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Theodore Tso <tytso@....edu>,
	Christoph Hellwig <hch@...radead.org>,
	Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	Myklebust Trond <Trond.Myklebust@...app.com>,
	"jens.axboe@...cle.com" <jens.axboe@...cle.com>,
	Nick Piggin <npiggin@...e.de>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Richard Kennedy <richard@....demon.co.uk>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 01/45] writeback: reduce calls to global_page_state in
 balance_dirty_pages()

On Tue, 2009-10-13 at 20:12 +0200, Jan Kara wrote:
> >       for (;;) {
> >               nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
> >                                global_page_state(NR_UNSTABLE_NFS);
> >               nr_writeback = global_page_state(NR_WRITEBACK) +
> >                              global_page_state(NR_WRITEBACK_TEMP);
> > 
> >               global_dirty_thresh(&background_thresh, &dirty_thresh);
> > 
> >               /*
> >                * Throttle it only when the background writeback cannot
> >                * catch-up. This avoids (excessively) small writeouts
> >                * when the bdi limits are ramping up.
> >                */
> >               if (nr_reclaimable + nr_writeback <
> >                   (background_thresh + dirty_thresh) / 2)
> >                       break;
> > 
> >               bdi_thresh = bdi_dirty_thresh(bdi, dirty_thresh);
> > 
> >               /*
> >                * In order to avoid the stacked BDI deadlock we need
> >                * to ensure we accurately count the 'dirty' pages when
> >                * the threshold is low.
> >                *
> >                * Otherwise it would be possible to get thresh+n pages
> >                * reported dirty, even though there are thresh-m pages
> >                * actually dirty; with m+n sitting in the percpu
> >                * deltas.
> >                */
> >               if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> >                       bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> >                       bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> >               } else {
> >                       bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> >                       bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> >               }
> > 
> >               /*
> >                * The bdi thresh is somehow "soft" limit derived from the
> >                * global "hard" limit. The former helps to prevent heavy IO
> >                * bdi or process from holding back light ones; The latter is
> >                * the last resort safeguard.
> >                */
> >               dirty_exceeded =
> >                       (bdi_nr_reclaimable + bdi_nr_writeback >= bdi_thresh)
> >                       || (nr_reclaimable + nr_writeback >= dirty_thresh);
> > 
> >               if (!dirty_exceeded)
> >                       break;
> > 
> >               bdi->dirty_exceed_time = jiffies;
> > 
> >               bdi_writeback_wait(bdi, write_chunk);
>   Hmm, probably you've discussed this in some other email but why do we
> cycle in this loop until we get below dirty limit? We used to leave the
> loop after writing write_chunk... So the time we spend in
> balance_dirty_pages() is no longer limited, right?

Wu was saying that without the loop nr_writeback wasn't limited, but
since bdi_writeback_wakeup() is driven from writeout completion, I'm not
sure how that could be so.

We can move all of bdi_dirty to bdi_writeout, if the bdi writeout queue
permits, but it cannot grow beyond the total limit, since we're actually
waiting for writeout completion.

Possibly the unstable (NR_UNSTABLE_NFS) pages are a special case here.
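
To make the bound argument concrete, here is a toy user-space model. It is
not kernel code, and every name in it (toy_bdi, toy_submit, toy_complete,
toy_try_dirty, toy_simulate) is made up for illustration. It assumes a task
may dirty a page only while dirty + writeback is below the limit and
otherwise has to wait for a writeout completion. Under that assumption the
writeback count is capped by both the queue size and the total limit:

```c
#include <assert.h>

/* Toy model only; these are hypothetical names, not kernel structures. */
struct toy_bdi {
	unsigned long dirty;     /* pages dirtied, not yet submitted */
	unsigned long writeback; /* pages submitted, awaiting completion */
	unsigned long queue_max; /* what the writeout queue permits */
};

/* Move dirty pages to writeback, as far as the queue permits. */
static void toy_submit(struct toy_bdi *b)
{
	unsigned long room, n;

	assert(b->writeback <= b->queue_max);
	room = b->queue_max - b->writeback;
	n = b->dirty < room ? b->dirty : room;
	b->dirty -= n;
	b->writeback += n;
}

/* A writeout completion retires up to n pages under writeback. */
static void toy_complete(struct toy_bdi *b, unsigned long n)
{
	if (n > b->writeback)
		n = b->writeback;
	b->writeback -= n;
}

/*
 * Assumed throttle rule: a page can be dirtied only while the
 * dirty + writeback total is below the limit. Returns 0 when the
 * caller would have to wait for a completion instead.
 */
static int toy_try_dirty(struct toy_bdi *b, unsigned long limit)
{
	if (b->dirty + b->writeback >= limit)
		return 0;
	b->dirty++;
	return 1;
}

/* Run the model for 'steps' rounds; return the peak writeback count. */
static unsigned long toy_simulate(unsigned long limit,
				  unsigned long queue_max, int steps)
{
	struct toy_bdi b = { 0, 0, queue_max };
	unsigned long peak = 0;
	int i, j;

	for (i = 0; i < steps; i++) {
		/* The writer dirties faster (8/round) than completions (3/round). */
		for (j = 0; j < 8; j++)
			if (!toy_try_dirty(&b, limit))
				break;
		toy_submit(&b);
		if (b.writeback > peak)
			peak = b.writeback;
		toy_complete(&b, 3);
	}
	return peak;
}
```

With a queue far larger than the limit, e.g. toy_simulate(100, 1000, 1000),
the peak writeback count still stays at or below 100, because new pages can
only enter once completions have made room. This says nothing about how
unstable pages are accounted; the model leaves them out.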


