linux-kernel - Re: [PATCH 1/2] Add mempressure cgroup

Date:	Tue, 08 Jan 2013 17:24:32 +0900
From:	Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Anton Vorontsov <anton.vorontsov@...aro.org>
CC:	David Rientjes <rientjes@...gle.com>,
	Pekka Enberg <penberg@...nel.org>,
	Mel Gorman <mgorman@...e.de>,
	Glauber Costa <glommer@...allels.com>,
	Michal Hocko <mhocko@...e.cz>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Luiz Capitulino <lcapitulino@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Thelen <gthelen@...gle.com>,
	Leonid Moiseichuk <leonid.moiseichuk@...ia.com>,
	KOSAKI Motohiro <kosaki.motohiro@...il.com>,
	Minchan Kim <minchan@...nel.org>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	John Stultz <john.stultz@...aro.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
	patches@...aro.org, kernel-team@...roid.com
Subject: Re: [PATCH 1/2] Add mempressure cgroup

(2013/01/08 16:29), Anton Vorontsov wrote:
> On Mon, Jan 07, 2013 at 05:51:46PM +0900, Kamezawa Hiroyuki wrote:
> [...]
>> I'm just curious..
>
> Thanks for taking a look! :)
>
> [...]
>>> +/*
>>> + * The window size is the number of scanned pages before we try to analyze
>>> + * the scanned/reclaimed ratio (or difference).
>>> + *
>>> + * It is used as a rate-limit tunable for the "low" level notification,
>>> + * and for averaging medium/oom levels. Using small window sizes can cause
>>> + * lot of false positives, but too big window size will delay the
>>> + * notifications.
>>> + */
>>> +static const uint vmpressure_win = SWAP_CLUSTER_MAX * 16;
>>> +static const uint vmpressure_level_med = 60;
>>> +static const uint vmpressure_level_oom = 99;
>>> +static const uint vmpressure_level_oom_prio = 4;
>>> +
>>
>> Hmm... isn't this window size too small ?
>> If vmscan cannot find a reclaimable page while scanning 2M of pages in a zone,
>> oom notify will be returned. Right ?
>
> Yup, you are right, if we were not able to find anything within the window
> size (which is 2M, but see below), then it is effectively the "OOM level".
> The thing is, the vmpressure reports... the pressure. :) Or, the
> allocation cost, and if the cost becomes high, it is no good.
>
> The 2M is, of course, not ideal. And the "ideal" depends on many factors,
> alike to vmstat. And, actually I dream about deriving the window size from
> zone->stat_threshold, which would make the window automatically adjustable
> for different "machine sizes" (as we do in calculate_normal_threshold(),
> in vmstat.c).
>
> But again, this is all "implementation details"; tunable stuff that we can
> either adjust ourselves as needed, or try to be smart, i.e. apply some
> heuristics, again, as in vmstat.
>

Hmm, I like automatic adjustment for things like this (though it may also need to be
tunable by the user). My concern is, for example, that if a qemu-kvm guest with PCI
passthrough is running on a node and using most of that node's memory, the interface
will tell users "Hey, it's near OOM". We may need some complicated heuristics ;)

Anyway, your approach seems interesting to me, but it seems peaky for ordinary users.
Users should know what to check (vmstat, zoneinfo, malloc latency?) when they get a
notification, before raising a real alarm. (This is not explained in the doc.)
For example, a user who cares about swap usage should check that.

I'd be glad if you explained in the doc that this interface only gives a hint: it reports
the status of _recent_ vmscans over some window, which reflects the latency of recent
memory allocations. Users should confirm the real status and make the final judgment
themselves. The point is that this notification is important because it is quick and tied
to ongoing memory allocation latency, but the kernel cannot be sure that there is
long-standing, heavy VM pressure.

I'm sorry if I misunderstand the concept.

Thank you,
-Kame