Date:	Wed, 20 Jun 2007 01:58:26 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	davej@...hat.com, tim.c.chen@...ux.intel.com,
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org
Subject: Re: Change in default vm_dirty_ratio

> On Wed, 20 Jun 2007 10:35:36 +0200 Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, 2007-06-19 at 21:44 -0700, Andrew Morton wrote:
> 
> > Anyway, this is all arse-about.  What is the design?  What algorithms
> > do we need to implement to do this successfully?  Answer me that, then
> > we can decide upon these implementation details.
> 
> Building on the per BDI patches, how about integrating feedback from the
> full-ness of device queues. That is, when we are happily doing IO and we
> cannot possibly saturate the active devices (as measured by their queue
> never reaching 75%?) then we can safely increase the total dirty limit.
> 
> OTOH, when even with the per BDI dirty limit the device queue is
> constantly saturated (contended) we ought to lower the total dirty
> limit.
> 
> Lots of detail here to work out, but does this sound workable?

It's pretty easy to fill the queues - I'd expect that there are a lot of
not-very-heavy workloads which cause the kernel to shove a lot of little
writes into the queue when it visits the blockdev mapping: a shower of
inodes, directory entries, indirect blocks, etc., with very little dirty
memory associated with them.
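
As a rough illustration of the queue-feedback idea quoted above, and of why it
is easy to fool, here is a minimal sketch in C. Every name in it
(queue_sample, dirty_limit_feedback, the LIMIT_* actions) is hypothetical and
not kernel API. The 75% figure is the one floated in the quoted proposal;
treating "saturated" as "every sample in the window is completely full" is an
assumption made only for this sketch.

```c
#include <stddef.h>

/* Hypothetical snapshot of one device queue; not a kernel structure. */
struct queue_sample {
	unsigned int in_flight;	/* requests currently queued */
	unsigned int capacity;	/* maximum requests the queue can hold */
};

enum limit_action { LIMIT_RAISE, LIMIT_HOLD, LIMIT_LOWER };

/* Threshold taken from the quoted proposal ("never reaching 75%?"). */
#define BUSY_PCT 75UL

/* True if the queue is at or above BUSY_PCT of its capacity. */
static int queue_reached_busy(const struct queue_sample *s)
{
	return (unsigned long)s->in_flight * 100UL >=
	       (unsigned long)s->capacity * BUSY_PCT;
}

/* Assumption for this sketch: "saturated" means completely full. */
static int queue_is_full(const struct queue_sample *s)
{
	return s->in_flight >= s->capacity;
}

/*
 * Decide what to do with the total dirty limit from a window of samples:
 *   - queue never reached the busy threshold -> raise the limit
 *   - queue was full in every sample         -> lower the limit
 *   - anything in between                    -> leave it alone
 */
static enum limit_action
dirty_limit_feedback(const struct queue_sample *window, size_t n)
{
	size_t busy = 0, full = 0;

	if (n == 0)
		return LIMIT_HOLD;

	for (size_t i = 0; i < n; i++) {
		busy += queue_reached_busy(&window[i]) ? 1 : 0;
		full += queue_is_full(&window[i]) ? 1 : 0;
	}

	if (busy == 0)
		return LIMIT_RAISE;
	if (full == n)
		return LIMIT_LOWER;
	return LIMIT_HOLD;
}
```

Note that the decision looks only at request counts. A burst of tiny metadata
writes that fills every slot yields LIMIT_LOWER even though almost no dirty
memory is behind it, which is exactly the objection above.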

But back away further.

What do we actually want the kernel to *do*?  Stated in terms of "when the
dirty memory state is A, do B" and "when userspace does C, the kernel should
do D".

Top-level statement: "when userspace does anything, the kernel should not
suck" ;)  Some refinement is needed there.

I _think_ the problem is basically one of latency: a) writes starving reads,
b) dirty memory causing page reclaim to stall, and c) inter-device
contention on the global memory limits.

Hard.  If the device isn't doing anything else then we can shove data at it
freely.  If reads (or synchronous writes) come in then perhaps the VM
should back off and permit dirty memory to go higher.

The anticipatory scheduler(s) are supposed to fix this.

Perhaps our queues are too long - if the VFS _does_ back off, it'll take
some time for that to have an effect.

Perhaps the fact that the queue size knows nothing about the _size_ of the
requests in the queue is a problem.
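
To make the point about request sizes concrete, here is a small sketch in C
comparing fullness measured by request count with the number of bytes actually
queued. The request struct and both helper functions are hypothetical and
exist only for this illustration; they do not correspond to the block layer's
real data structures.

```c
#include <stddef.h>

/* Hypothetical queued request; only its size matters here. */
struct request {
	unsigned long bytes;
};

/* Fullness as the queue sees it: slots used, as a percentage. */
static unsigned int fill_pct_by_count(size_t nr_requests, size_t capacity)
{
	if (capacity == 0)
		return 0;
	return (unsigned int)(nr_requests * 100 / capacity);
}

/* What is actually pending: total bytes across all queued requests. */
static unsigned long queued_bytes(const struct request *q, size_t nr_requests)
{
	unsigned long total = 0;

	for (size_t i = 0; i < nr_requests; i++)
		total += q[i].bytes;
	return total;
}
```

With a 128-slot queue, 128 requests of 4 KiB and 128 requests of 512 KiB both
report 100% by count, yet the first holds 512 KiB and the second 64 MiB of
pending data.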


Back away even further here.

What user-visible problem(s) are we attempting to fix?