linux-kernel - Re: [RFC] Limit the size of the pagecache

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <45B8EF74.6010704@in.ibm.com>
Date:	Thu, 25 Jan 2007 23:27:08 +0530
From:	Balbir Singh <balbir@...ibm.com>
To:	Rik van Riel <riel@...hat.com>
CC:	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Christoph Lameter <clameter@....com>,
	Aubrey Li <aubreylee@...il.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Robin Getz <rgetz@...ckfin.uclinux.org>,
	"Henn, erich, Michael" <Michael.Hennerich@...log.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] Limit the size of the pagecache

Rik van Riel wrote:
> Vaidyanathan Srinivasan wrote:
>> Rik van Riel wrote:
> 
>>> There are a few databases out there that mmap the whole
>>> thing.  Sleepycat for one...
>>
>> That is why my suggestion would be not to touch mmapped pagecache
>> pages in the current pagecache limit code.  The limit should concern
>> only unmapped pagecache pages.
> 
> So you want to limit how much data the kernel caches for mysql
> or postgresql, but not limit how much of the rpm database is
> cached ?!
> 
> IMHO your proposal does the exact opposite of what would be
> right for my systems :)
> 

<Jumping in late into the discussion>

One scenario I can think of is

A group of I/O intensive task can cause readahead and
dirty page I/O and make good forward progress, but
they'll hit another group of processes by swapping
their pages out. How do we make fair forward progress?
The system administrator can currently control the
amount of swappiness by setting it, but swappiness is
a reclaim time control parameter.

We can control dirty page I/O by setting vm_dirty_ratio.
Readahead is also tuneable with fadvise(), but not many
applications use fadvise.

The question now is, is it easier for the system administrator
to say, limit my page cache usage to say 30% of total memory available,
so that other allocations do not have to wait on disk I/O or page
reclaim (consider slab allocations, other kernel data structures).

A low priority task might run infrequently and end up spending all
it's time either swapping in pages or reclaiming memory and by
the time it runs again, it ends up doing the same thing.

I understand the swap token mitigates this problem to some extent,
but limiting the page cache will give the system administrator
control over system memory behaviour.

-- 
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/