linux-kernel - Re: [PATCH 5/5] writeback: introduce writeback_control.more

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20071004022133.GA6244@mail.ustc.edu.cn>
Date:	Thu, 4 Oct 2007 10:21:33 +0800
From:	Fengguang Wu <wfg@...l.ustc.edu.cn>
To:	David Chinner <dgc@....com>
Cc:	Andrew Morton <akpm@...l.org>, linux-kernel@...r.kernel.org,
	Ken Chen <kenchen@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michael Rubin <mrubin@...gle.com>
Subject: Re: [PATCH 5/5] writeback: introduce writeback_control.more_io to
	indicate more io

On Wed, Oct 03, 2007 at 12:41:19PM +1000, David Chinner wrote:
> On Wed, Oct 03, 2007 at 09:34:39AM +0800, Fengguang Wu wrote:
> > On Wed, Oct 03, 2007 at 07:47:45AM +1000, David Chinner wrote:
> > > On Tue, Oct 02, 2007 at 04:41:48PM +0800, Fengguang Wu wrote:
> > > >  		wbc.pages_skipped = 0;
> > > > @@ -560,8 +561,9 @@ static void background_writeout(unsigned
> > > >  		min_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
> > > >  		if (wbc.nr_to_write > 0 || wbc.pages_skipped > 0) {
> > > >  			/* Wrote less than expected */
> > > > -			congestion_wait(WRITE, HZ/10);
> > > > -			if (!wbc.encountered_congestion)
> > > > +			if (wbc.encountered_congestion || wbc.more_io)
> > > > +				congestion_wait(WRITE, HZ/10);
> > > > +			else
> > > >  				break;
> > > >  		}
> > > 
> > > Why do you call congestion_wait() if there is more I/O to issue?  If
> > > we have a fast filesystem, this might cause the device queues to
> > > fill, then drain on congestion_wait(), then fill again, etc. i.e. we
> > > will have trouble keeping the queues full, right?
> > 
> > You mean slow writers and fast RAID? That would be exactly the case
> > these patches try to improve.
> 
> I mean any writers and a fast block device (raid or otherwise).
> 
> > This patchset makes kupdate/background writeback more responsible,
> > so that if (avg-write-speed < device-capabilities), the dirty data are
> > synced timely, and we don't have to go for balance_dirty_pages().
> 
> Sure, but I'm asking about the effect of the patches on the
> (avg-write-speed == device-capabilities) case. I agree that
> they are necessary for timely syncing of data but I'm trying
> to understand what effect they have on the normal write case

> (i.e. keeping the disk at full write throughput).

OK, I guess it is the focus of all your questions: Why should we sleep
in congestion_wait() and possibly hurt the write throughput? I'll try
to summary it:

- congestion_wait() is necessary
Besides device congestions, there may be other blockades we have to
wait on, e.g. temporary page locks, NFS/journal issues(I guess).

- congestion_wait() is called only when necessary
congestion_wait() will only be called we saw blockades:
        if (wbc.nr_to_write > 0 || wbc.pages_skipped > 0) {
                congestion_wait(WRITE, HZ/10);
        }
So in normal case, it may well write 128MB data without any waiting.

- congestion_wait() won't hurt write throughput
When not congested, congestion_wait() will be wake up on each write
completion. Note that MAX_WRITEBACK_PAGES=1024 and
/sys/block/sda/queue/max_sectors_kb=512(for me),
which means we are gave the chance to sync 4MB on every 512KB written,
which means we are able to submit write IOs 8 times faster than the
device capability. congestion_wait() is a magical timer :-)

> > So for your question of queue depth, the answer is: the queue length
> > will not build up in the first place. 
> 
> Which queue are you talking about here? The block deivce queue?

Yes, the elevator's queues.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/