linux-kernel - RE: [PATCH -v2 -mm] add extra free kbytes tunable

Message-ID: <alpine.DEB.2.00.1110131337580.24853@chino.kir.corp.google.com>
Date:	Thu, 13 Oct 2011 13:48:13 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Satoru Moriya <satoru.moriya@....com>
cc:	Con Kolivas <kernel@...ivas.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Randy Dunlap <rdunlap@...otime.net>,
	Satoru Moriya <smoriya@...hat.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	"lwoodman@...hat.com" <lwoodman@...hat.com>,
	Seiji Aguchi <saguchi@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	"hannes@...xchg.org" <hannes@...xchg.org>
Subject: RE: [PATCH -v2 -mm] add extra free kbytes tunable

On Thu, 13 Oct 2011, Satoru Moriya wrote:

> My test case is just a simple one (maybe too simple), and I tried
> to demonstrate the following issues that the current kernel has.
> 
> 1. Current kernel uses free memory as pagecache.
> 2. Applications may allocate memory in bursts, and when that happens
>    they may see latency issues because there is not enough free
>    memory. Also the amount of required memory is wide-ranging.

This is what the per-zone watermarks are intended to address, and I
understand that they're not doing a good enough job for your particular
workloads.  I'm trying to find a solution that mitigates that for all
threads that allocate faster than the kernel can reclaim, realtime or
otherwise, without requiring the admin to set those watermarks by hand,
which is really what extra_free_kbytes is eventually leading to.

> 3. Some users would like to control the amount of free memory
>    to avoid the situation above.

The only possible way to do that right now is with min_free_kbytes, and
that would increase the amount of memory that realtime threads have
exclusive access to.  Let's try not to add additional tunables that force
admins to find their own optimal watermarks for every kernel release.
I see no reason why we can't add logic for rt-threads triggering reclaim
to either reclaim faster (Con's patch) or reclaim more memory than normal
(an ALLOC_HARDER type bonus in the reclaim path to reclaim
1.25 * high_wmark, for example).  We've had an rt-thread bonus in the page
allocator for a long time; I'm not saying we don't need more elsewhere.
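
To make the "1.25 * high_wmark" idea concrete, here is a rough userspace
sketch of how a reclaim target could be picked.  It is not kernel code:
reclaim_target_pages() and its rt_task flag are hypothetical names used only
for illustration, and the 25% bonus is simply the example figure from the
paragraph above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, NOT an existing kernel function.
 * Given a zone's high watermark (in pages), return how many free pages
 * reclaim should aim for.  A realtime task gets a 25% bonus, i.e. the
 * "1.25 * high_wmark" example; everyone else stops at high_wmark.
 */
unsigned long reclaim_target_pages(unsigned long high_wmark, bool rt_task)
{
	if (rt_task)
		return high_wmark + high_wmark / 4;	/* 1.25 * high_wmark */
	return high_wmark;
}
```

The point of the sketch is that the bonus is keyed off the task that
triggered reclaim rather than off a global tunable, so workloads without
rt-threads would see no change in behavior.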

> 4. Users can't set the amount of free memory explicitly.
>    From the user's point of view, the amount of free memory is the
>    delta between the high watermark and the min watermark, because
>    below the min watermark user applications incur a penalty (direct
>    reclaim). The width of the delta depends on min_free_kbytes (it is
>    actually min watermark / 2), so if we want to make free memory
>    bigger, we must make min_free_kbytes bigger. That is not intuitive,
>    and it introduces another problem: the possibility of direct
>    reclaim is increased.
> 
> 

So you're saying that we need to increase the space between high_wmark and 
min_wmark anytime that min_free_kbytes changes?  That certainly may be 
true and would hopefully mitigate direct reclaim becoming too intrusive 
for your workload.
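
For reference, the relationship being discussed can be sketched in a few
lines of userspace C.  This is a single-zone simplification, not the
kernel's actual per-zone code: struct wmarks and
wmarks_from_min_free_kbytes() are made-up names, the high - min gap of
min / 2 is taken from the quoted description, and the low watermark at
min + min / 4 is my assumption about the current behavior.

```c
/* Illustrative only: watermarks in pages for one imaginary zone. */
struct wmarks {
	unsigned long min;
	unsigned long low;
	unsigned long high;
};

/*
 * Hypothetical helper: derive the three watermarks from min_free_kbytes,
 * pretending all of it lands in a single zone.  page_size_kb is the page
 * size in kilobytes (4 on most x86 configurations).
 */
struct wmarks wmarks_from_min_free_kbytes(unsigned long min_free_kbytes,
					  unsigned long page_size_kb)
{
	struct wmarks w;

	w.min = min_free_kbytes / page_size_kb;
	w.low = w.min + w.min / 4;	/* assumed: min + 25% */
	w.high = w.min + w.min / 2;	/* gap above min is min / 2 */
	return w;
}
```

With min_free_kbytes = 65536 and 4K pages this gives min = 16384 pages and
high = 24576 pages, so the window between them is 8192 pages (32MB), and the
only way to widen it today is to raise min itself.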

We _really_ don't want to cause regressions for others, though, which
extra_free_kbytes can easily do for cpu-intensive workloads if nothing
currently requires that extra burst of memory.  That happens because
extra_free_kbytes is a global tunable and is not tied to any specific
application (by testing for rt_task(), for example) that we can identify
when reclaiming.

> But my concern described above still stands, because whether a
> latency issue happens or not depends on how heavily workloads
> allocate memory in a short time. Of course we can say the same
> thing about extra_free_kbytes, but we can change it and test
> the effect easily.
> 

We'll never know the future and how much memory a latency-sensitive
application will require 100ms from now.  The only thing that we can do is
(i) identify the latency-sensitive app, (ii) reclaim more aggressively for
it, and (iii) reclaim additional memory in preparation for another
burst.  At some point, though, userspace needs to take responsibility for
not allocating enormous amounts of memory all at once, and there's room
for mitigation there too by preallocating ahead of what you actually need.
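
As one possible illustration of preallocating in userspace, a
latency-sensitive application could fault in and pin its working buffer at
startup, before the time-critical phase begins.  This is only a sketch under
assumptions: prealloc_buffer() is a made-up name, and whether mlock()
succeeds depends on RLIMIT_MEMLOCK, so it is treated as best effort here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate len bytes and touch one byte in every
 * page so the page faults (and any reclaim they trigger) happen now
 * rather than in the latency-sensitive path.  Returns NULL on failure.
 */
void *prealloc_buffer(size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	void *buf;
	size_t off;

	if (page <= 0)
		return NULL;

	buf = malloc(len);
	if (!buf)
		return NULL;

	/* volatile writes so the compiler cannot drop the page touches */
	p = buf;
	for (off = 0; off < len; off += (size_t)page)
		p[off] = 0;

	/* Best effort: keep the pages resident; may fail without privilege. */
	(void)mlock(buf, len);

	return buf;
}
```

The caller releases the buffer with free() as usual.  The cost of faulting
the memory in is paid once, up front, instead of during an allocation burst.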