linux-kernel - Re: regression in page writeback

Message-ID: <20090925120608.GA15216@think>
Date:	Fri, 25 Sep 2009 08:06:08 -0400
From:	Chris Mason <chris.mason@...cle.com>
To:	Dave Chinner <david@...morbit.com>
Cc:	Wu Fengguang <fengguang.wu@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"richard@....demon.co.uk" <richard@....demon.co.uk>,
	"jens.axboe@...cle.com" <jens.axboe@...cle.com>
Subject: Re: regression in page writeback

On Fri, Sep 25, 2009 at 03:04:13PM +1000, Dave Chinner wrote:
> On Thu, Sep 24, 2009 at 08:38:20PM -0400, Chris Mason wrote:
> > On Fri, Sep 25, 2009 at 10:11:17AM +1000, Dave Chinner wrote:
> > > On Thu, Sep 24, 2009 at 11:15:08AM +0800, Wu Fengguang wrote:
> > > > On Wed, Sep 23, 2009 at 10:00:58PM +0800, Chris Mason wrote:
> > > > > The only place that actually honors the congestion flag is pdflush.
> > > > > It's trivial to get pdflush backed up and make it sit down without
> > > > > making any progress because once the queue congests, pdflush goes away.
> > > > 
> > > > Right. I guess that's more or less intentional - to give lowest priority
> > > > to periodic/background writeback.
> > > 
> > > IMO, this is the wrong design. Background writeback should
> > > have higher CPU/scheduler priority than normal tasks. If there are
> > > sufficient dirty pages in the system for background writeback to
> > > be active, it should be running *now* to start as much IO as it can
> > > without being held up by other, lower priority tasks.
> > 
> > I'd say that an fsync from mutt or vi should be done at a higher prio
> > than a background streaming writer.
> 
> I don't think you caught everything I said - synchronous IO is
> un-throttled. Background writeback should dump async IO to the
> elevator as fast as it can, then get the hell out of the way. If
> you've got a UP system, then the fsync can't be issued at the same
> time pdflush is running (same as right now), and if you've got a MP
> system then fsync can run at the same time. On the premise that sync
> IO is unthrottled and given that elevators queue and issue sync IO
> separately to async writes, fsync latency would be entirely derived
> from the elevator queuing behaviour, not the CPU priority of
> pdflush.

I think we've agreed for a long time on this in general.  The congestion
backoff comment was originally about IO priorities (I thought ;) so I
was trying to keep talking around IO priority and not CPU/scheduler
time.  When we get things tuned to the point that process scheduling
matters, I'll be a very happy boy.

The big change from the new code is that we will fill the queue
with async IO.

I think this is good, and I think the congestion backoff didn't really
consistently keep available requests in the queue all the time in a lot
of workloads.  But, it's still a change, and so we need to keep an eye on
it as we look at performance reports during .32.
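
A toy model may make the difference concrete. The sketch below is not
kernel code: simulate(), its parameters, and all the numbers (queue
capacity, congestion threshold, drain rate, backoff length) are
hypothetical values chosen only for illustration. It compares a flusher
that backs off once the queue looks congested with one that keeps the
queue full of async IO.

```python
def simulate(backoff, ticks=100, capacity=128, congest_on=113,
             drain_per_tick=32, backoff_ticks=10):
    """Return (requests completed, ticks where the device had spare capacity).

    All parameters are illustrative assumptions, not kernel defaults.
    """
    queued = 0
    completed = 0
    underused_ticks = 0
    sleep_until = 0  # only used in backoff mode

    for t in range(ticks):
        if backoff:
            # Assumed model of the old behaviour: submit until the queue
            # looks congested, then go away for backoff_ticks.
            if t >= sleep_until:
                queued = max(queued, congest_on)
                sleep_until = t + backoff_ticks
        else:
            # Assumed model of the new behaviour: the flusher blocks on
            # free requests, so the queue is topped up every tick.
            queued = capacity

        # The device completes up to drain_per_tick requests per tick.
        done = min(queued, drain_per_tick)
        if done < drain_per_tick:
            underused_ticks += 1
        queued -= done
        completed += done

    return completed, underused_ticks


if __name__ == "__main__":
    print("backoff:   ", simulate(backoff=True))
    print("fill queue:", simulate(backoff=False))
```

With these made-up numbers the backoff variant lets the queue run dry
on most ticks, while the fill-the-queue variant always has requests
available. The model says nothing about how sync IO fares against a
full async queue, which is the part that still needs watching.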

> 
> Look at it this way - it is the responsibility of pdflush to keep
> the elevator full of background IO. It is the responsibility of
> the elevator to ensure that background IO doesn't starve all other
> types of IO. If pdflush doesn't run because it can't get CPU time,
> then background IO does not get issued, and system performance
> suffers as a result.

Most of the time that pdflush didn't get to run in my benchmark, it was
because pdflush chose to give up the CPU, not because it was starving.

-chris
