Date:	Wed, 28 Sep 2011 16:48:44 -0700
From:	Michel Lespinasse <walken@...gle.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>,
	Balbir Singh <bsingharora@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Johannes Weiner <jweiner@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Hugh Dickins <hughd@...gle.com>,
	Michael Wolf <mjwolf@...ibm.com>
Subject: Re: [PATCH 2/9] kstaled: documentation and config option.

On Tue, Sep 27, 2011 at 11:53 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@...fujitsu.com> wrote:
> On Tue, 27 Sep 2011 17:49:00 -0700
> Michel Lespinasse <walken@...gle.com> wrote:
>> +* idle_2_clean, idle_2_dirty_file, idle_2_dirty_swap: same definitions as
>> +  above, but for pages that have been untouched for at least two scan cycles.
>> +* these fields repeat up to idle_240_clean, idle_240_dirty_file and
>> +  idle_240_dirty_swap, allowing one to observe idle pages over a variety
>> +  of idle interval lengths. Note that the accounting is cumulative:
>> +  pages counted as idle for a given interval length are also counted
>> +  as idle for smaller interval lengths.
>
> I'm sorry if you've answered already.
>
> Why 240 ? and above means we have idle_xxx_clean/dirty/ xxx is 'seq 2 240' ?
> Isn't it messy ? Anyway, idle_1_clean etc should be provided.

We don't have all values - we export values for 1, 2, 5, 15, 30, 60,
120 and 240 idle scan intervals.
In our production setup, the scan interval is set at 120 seconds.
The exported histogram values are chosen so that each is approximately
double the previous one, and they align with human units, e.g. 30 scan
intervals == 1 hour.
We use one byte per page to track the number of idle cycles, which is
why we don't export anything over 255 scan intervals.
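
To make the arithmetic concrete, here is a minimal Python sketch (not the
kstaled code; the names BINS, bin_minutes, bump_idle and
cumulative_idle_histogram are made up for this example, and the saturating
increment is an assumption about how a one-byte counter would behave). It
converts each exported bin to wall-clock time at a 120-second scan interval
and shows the cumulative accounting described in the documentation.

```python
# Illustrative sketch only, not the kstaled implementation.
BINS = [1, 2, 5, 15, 30, 60, 120, 240]   # exported idle scan-interval counts
SCAN_INTERVAL_SECS = 120                 # production setting mentioned above
MAX_IDLE_CYCLES = 255                    # largest value that fits in one byte


def bin_minutes(scan_interval_secs=SCAN_INTERVAL_SECS):
    """Map each exported bin to the idle time it represents, in minutes."""
    return {b: b * scan_interval_secs / 60 for b in BINS}


def bump_idle(count):
    """Assumed saturating increment of a one-byte per-page idle counter."""
    return min(count + 1, MAX_IDLE_CYCLES)


def cumulative_idle_histogram(idle_cycles):
    """For each exported N, count pages idle for at least N scan cycles."""
    return {b: sum(1 for c in idle_cycles if c >= b) for b in BINS}


if __name__ == "__main__":
    for b, minutes in bin_minutes().items():
        print(f"idle_{b}: at least {minutes:g} minutes idle")
    # Five hypothetical pages with these per-page idle cycle counts.
    print(cumulative_idle_histogram([0, 1, 3, 40, 255]))
```

Because the accounting is cumulative, a page that has been idle for 40
cycles shows up in the 1, 2, 5, 15 and 30 bins, but not in the 60 bin.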

> Hmm, I don't like the idea very much...
>
> IIUC, there is no kernel interface which shows histgram rather than load_avg[].
> Is there any other interface and what histgram is provided ?
> And why histgram by kernel is required ?

I don't think exporting per-page statistics is very useful given that
userspace doesn't have a way to select individual pages to reclaim
(and if it did, we would have to expose LRU lists to userspace for it
to make good choices, and I don't think we want to go there). So, we
want to expose summary statistics instead. Histograms are a good way
to do that.

I don't think averages would work well for this application - the
distribution of idle page ages varies a lot between applications and
can't be assumed to be even close to a gaussian.
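
As a rough illustration of why an average hides what matters here, the
Python sketch below compares two made-up jobs (the page counts and idle ages
are invented for the example, not measurements) that have the same mean idle
age but very different cumulative histograms.

```python
# Illustrative sketch only; the two workloads are hypothetical.
BINS = [1, 2, 5, 15, 30, 60, 120, 240]


def mean(values):
    """Arithmetic mean of per-page idle cycle counts."""
    return sum(values) / len(values)


def cumulative_histogram(idle_cycles):
    """For each exported N, count pages idle for at least N scan cycles."""
    return {b: sum(1 for c in idle_cycles if c >= b) for b in BINS}


# Every page has been idle for exactly 30 scan cycles.
uniform_job = [30] * 8
# Seven hot pages touched every cycle, plus one page idle for 240 cycles.
bimodal_job = [0] * 7 + [240]

if __name__ == "__main__":
    for name, job in (("uniform", uniform_job), ("bimodal", bimodal_job)):
        print(name, "mean:", mean(job), "histogram:", cumulative_histogram(job))
```

Both jobs average 30 idle cycles per page, yet the first has eight pages
idle for 30 cycles or more and the second has only one, which is the
number that matters when deciding how much memory could be reclaimed.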

> BTW, can't this information be exported by /proc/<pid>/smaps or somewhere ?
> I guess per-proc will be wanted finally.

The problem with per-proc is that it only works for things that are
mapped in at the time you look at the report. It does not take into
consideration ephemeral mappings (e.g. if there is this thing you run
every 5 minutes and it needs 1G of memory) or files you access with
read() instead of mmap().

> Hm, do you use params other than idle_clean for your scheduling ?

The management software currently looks at only one bin of the
histogram - for each job, we can configure which bin it will look at.
Humans look at the complete picture when looking into performance
issues, and we're always thinking about teaching the management
software to do that as well :)

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.