Message-ID: <65795E11DBF1E645A09CEC7EAEE94B9C3A30A295@USINDEVS02.corp.hds.com>
Date:	Fri, 7 Jan 2011 17:03:47 -0500
From:	Satoru Moriya <satoru.moriya@....com>
To:	"linux-mm@...ck.org" <linux-mm@...ck.org>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"mel@....ul.ie" <mel@....ul.ie>,
	"kosaki.motohiro@...fujitsu.com" <kosaki.motohiro@...fujitsu.com>,
	"rdunlap@...otime.net" <rdunlap@...otime.net>,
	"dle-develop@...ts.sourceforge.net" 
	<dle-develop@...ts.sourceforge.net>,
	Seiji Aguchi <seiji.aguchi@....com>
Subject: [RFC][PATCH 0/2] Tunable watermark

This patchset introduces a new knob to control each watermark
separately.

[Purpose]
To control when kswapd and direct reclaim start (and stop)
based on memory pressure and/or application characteristics,
because direct reclaim makes memory allocation/access latency worse.
(We'd like to avoid direct reclaim to keep latency low even
 under high memory pressure.)

[Problem]
The thresholds at which kswapd and direct reclaim start (and stop)
depend on watermark[min,low,high], and currently all watermarks
are set based on min_free_kbytes. min_free_kbytes is the minimum
amount of free memory that the Linux VM should keep.

This means the difference between the threshold at which kswapd
starts and the threshold at which direct reclaim starts depends
only on the amount of free memory to keep (min_free_kbytes).
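
As a rough illustration of that coupling: in kernels of this era the
low and high watermarks are derived from the min watermark as
min + min/4 and min + min/2. That ratio is not stated in this mail; it
is an assumption taken from setup_per_zone_wmarks() in mm/page_alloc.c,
and it agrees with the numbers in the example below. A minimal Python
sketch (the function name is hypothetical, and the formula is applied
to the system-wide total rather than per zone in pages as the kernel
does):

```python
def derived_watermarks(min_free_kbytes):
    """Return (min, low, high) in kB derived from min_free_kbytes.

    Assumption: low = min + min/4 and high = min + min/2, applied to the
    system-wide total instead of per zone.
    """
    wmark_min = min_free_kbytes
    wmark_low = wmark_min + wmark_min // 4
    wmark_high = wmark_min + wmark_min // 2
    return wmark_min, wmark_low, wmark_high


wmark_min, wmark_low, wmark_high = derived_watermarks(5752)
# The gap between low and min is all the headroom kswapd gets before
# allocations fall back to direct reclaim.
print(wmark_min, wmark_low, wmark_high, wmark_low - wmark_min)
```

Because all three values move together, the gap between low and min can
only be widened by raising min_free_kbytes itself.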

On the other hand, the amount of memory required depends on the
application. Therefore, when an application allocates/accesses more
memory than the difference between watermark[low] and watermark[min],
the kernel sometimes runs direct reclaim before the allocation,
which increases application latency.

[Solution]
To avoid the situation above, this patch set introduces new
tunables under /proc/sys/vm/: wmark_min_kbytes, wmark_low_kbytes
and wmark_high_kbytes. These entries control watermark[min],
watermark[low] and watermark[high], respectively, and each can be
set independently.
By using these parameters one can make the difference between
min and low bigger than the amount of memory which applications
require.

[Example]
This is an example of the problem and solution above.

- System Memory: 2GB
- High memory pressure

In this case, min_free_kbytes and watermarks are automatically
set as follows.
(Here, each watermark is the sum of the per-zone watermarks, in kbytes.)

min_free_kbytes: 5752
watermark[min] : 5752
watermark[low] : 7190
watermark[high]: 8628
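
For reference, summed values like these can be obtained from
/proc/zoneinfo, which lists min/low/high per zone in pages. The
following Python sketch is only an illustration: the sample text is
made up (it is not the 2GB system above), the 4 kB page size is an
assumption, and the helper name is hypothetical.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages

# Made-up /proc/zoneinfo excerpt for illustration only (values in pages).
SAMPLE_ZONEINFO = """\
Node 0, zone      DMA
  pages free     3000
        min      40
        low      50
        high     60
    pagesets
    cpu: 0
              count: 0
              high:  0
Node 0, zone   Normal
  pages free     90000
        min      1400
        low      1750
        high     2100
"""

# Matches the per-zone watermark lines, but not the per-cpu "high:" lines.
WMARK_LINE = re.compile(r"^\s+(min|low|high)\s+(\d+)\s*$")


def sum_watermarks_kb(zoneinfo_text, page_kb=PAGE_KB):
    """Sum min/low/high over all zones and convert pages to kB."""
    totals = {"min": 0, "low": 0, "high": 0}
    for line in zoneinfo_text.splitlines():
        match = WMARK_LINE.match(line)
        if match:
            totals[match.group(1)] += int(match.group(2)) * page_kb
    return totals


print(sum_watermarks_kb(SAMPLE_ZONEINFO))
```

On a live system the same function could be fed the contents of
/proc/zoneinfo instead of the sample string.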

If an application allocates/accesses 2000 kbytes of memory (more
than 1438 (= 7190 - 5752)), direct reclaim may occur.

With this patch, one can set watermark[low] to a value bigger
than 7752, which makes the difference between min and low bigger
than 2000. This avoids direct reclaim without changing
watermark[min].
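
The arithmetic behind that threshold can be sketched as follows. This
is a simplified model, not kernel code: it assumes the only condition
that matters is that the gap between low and min exceeds the burst
size, and both function names are hypothetical.

```python
def gap_covers_burst(wmark_min_kb, wmark_low_kb, burst_kb):
    """True if the low-min gap is strictly bigger than the burst."""
    return wmark_low_kb - wmark_min_kb > burst_kb


def smallest_sufficient_low_kb(wmark_min_kb, burst_kb):
    """Smallest watermark[low] (kB) whose gap above min exceeds burst_kb."""
    return wmark_min_kb + burst_kb + 1


# Default low watermark from the example: gap is 1438 kB, burst is 2000 kB.
print(gap_covers_burst(5752, 7190, 2000))
# Any low watermark above 7752 kB leaves a gap bigger than 2000 kB.
print(smallest_sufficient_low_kb(5752, 2000))
print(gap_covers_burst(5752, 8192, 2000))
```

Whether kswapd actually frees pages fast enough inside that gap still
depends on the workload, so the result is a lower bound on
watermark[low] rather than a guarantee.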

[Test]
I ran a simple test as follows:

System memory: 2GB

$ dd if=/dev/zero of=/tmp/tmp_file &
$ time mapped-file-stream 1 $((1024 * 1024 * 64))

The results are as follows.

                  | default |  case 1   |  case 2 |
----------------------------------------------------------
wmark_min_kbytes  |  5752   |    5752   |   5752  |
wmark_low_kbytes  |  7190   |   16384   |  32768  | (KB)
wmark_high_kbytes |  8628   |   20480   |  40960  |
----------------------------------------------------------
real              |   503   |    364    |    337  |
user              |     3   |      5    |      4  | (msec)
sys               |   153   |    149    |    146  |
----------------------------------------------------------
page fault        |  32768  |  32768    |  32768  |
kswapd_wakeup     |   1809  |    335    |    228  | (times)
direct reclaim    |      5  |      0    |      0  |

As you can see, in the default case direct reclaim was performed
5 times and the execution time was 503 msec. On the other hand,
in case 1 (a larger delta between min and low) no direct reclaim
was performed and the execution time was 364 msec.
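
To put the "real" row of the table in relative terms, the reduction
against the default case can be computed directly from the reported
values. A small Python sketch (the helper name is hypothetical; the
numbers are copied from the table above):

```python
# "real" times in msec, copied from the results table.
REAL_MSEC = {"default": 503, "case 1": 364, "case 2": 337}


def reduction_percent(baseline, value):
    """Relative reduction of value against baseline, in percent."""
    return (baseline - value) / baseline * 100


for name in ("case 1", "case 2"):
    pct = reduction_percent(REAL_MSEC["default"], REAL_MSEC[name])
    print(f"{name}: {pct:.1f}% shorter than default")
```

This is a single run per configuration, so the percentages describe
this measurement only.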

(*) mapped-file-stream
     This is a micro benchmark from Johannes Weiner that accesses a
     large sparse file through mmap().
     http://lkml.org/lkml/2010/8/30/226

Any comments or suggestions are welcome.


Satoru Moriya (2):
  Add explanation about min_free_kbytes to clarify its effect
  Make watermarks tunable separately

 Documentation/sysctl/vm.txt |   40 +++++++++++++++-
 include/linux/mmzone.h      |    6 ++
 kernel/sysctl.c             |   28 +++++++++++-
 mm/page_alloc.c             |  109 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 181 insertions(+), 2 deletions(-)
