Message-ID: <CALWz4iyKXx+q5uKVOFqDs3Xx7ZGOertJ-ZWkwO=Z0Ynr4qsm2A@mail.gmail.com>
Date: Thu, 1 Sep 2011 00:04:24 -0700
From: Ying Han <yinghan@...gle.com>
To: Johannes Weiner <jweiner@...hat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <bsingharora@...il.com>,
Michal Hocko <mhocko@...e.cz>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch] Revert "memcg: add memory.vmscan_stat"
On Wed, Aug 31, 2011 at 11:40 PM, Johannes Weiner <jweiner@...hat.com> wrote:
> On Wed, Aug 31, 2011 at 11:05:51PM -0700, Ying Han wrote:
>> On Tue, Aug 30, 2011 at 1:42 AM, Johannes Weiner <jweiner@...hat.com> wrote:
>> > You want to look at A and see whether its limit was responsible for
>> > reclaim scans in any children. IMO, that is asking the question
>> > backwards. Instead, there is a cgroup under reclaim and one wants to
>> > find out the cause for that. Not the other way round.
>> >
>> > In my original proposal I suggested differentiating reclaim caused by
>> > internal pressure (due to own limit) and reclaim caused by
>> > external/hierarchical pressure (due to limits from parents).
>> >
>> > If you want to find out why C is under reclaim, look at its reclaim
>> > statistics. If the _limit numbers are high, C's limit is the problem.
>> > If the _hierarchical numbers are high, the problem is B, A, or
>> > physical memory, so you check B for _limit and _hierarchical as well,
>> > then move on to A.
>> >
>> > Implementing this would be as easy as passing not only the memcg to
>> > scan (victim) to the reclaim code, but also the memcg /causing/ the
>> > reclaim (root_mem):
>> >
>> > root_mem == victim -> account to victim as _limit
>> > root_mem != victim -> account to victim as _hierarchical
>> >
>> > This would make things much simpler and more natural, both the code
>> > and the way of tracking down a problem, IMO.
>>
>> These are pretty much the stats I am currently using for debugging the
>> reclaim patches. For example:
>>
>> scanned_pages_by_system 0
>> scanned_pages_by_system_under_hierarchy 50989
>>
>> scanned_pages_by_limit 0
>> scanned_pages_by_limit_under_hierarchy 0
>>
>> "_system" is count under global reclaim, and "_limit" is count under
>> per-memcg reclaim.
>> "_under_hiearchy" is set if memcg is not the one triggering pressure.
>
> I don't get this distinction between _system and _limit. How is it
> orthogonal to _limit vs. _hierarchy, i.e. internal vs. external?
Something like:
+enum mem_cgroup_scan_context {
+ SCAN_BY_SYSTEM,
+ SCAN_BY_SYSTEM_UNDER_HIERARCHY,
+ SCAN_BY_LIMIT,
+ SCAN_BY_LIMIT_UNDER_HIERARCHY,
+ NR_SCAN_CONTEXT,
+};
	if (global_reclaim(sc))
		context = SCAN_BY_SYSTEM;
	else
		context = SCAN_BY_LIMIT;
	/* each *_UNDER_HIERARCHY entry directly follows its base entry */
	if (target != mem)
		context++;
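
A minimal sketch of how that context selection could feed the four
counters; mem_cgroup_record_scan() and the scanned_pages[] array are
made-up names for illustration, not the existing memory.vmscan_stat code:

struct mem_cgroup_scan_stat {
	/* indexed by enum mem_cgroup_scan_context above */
	unsigned long scanned_pages[NR_SCAN_CONTEXT];
};

/*
 * Illustrative only: "is_global_reclaim" stands for global_reclaim(sc)
 * and "mem_is_target" for (target == mem) in the snippet above.
 */
static void mem_cgroup_record_scan(struct mem_cgroup_scan_stat *stat,
				   int is_global_reclaim, int mem_is_target,
				   unsigned long nr_scanned)
{
	enum mem_cgroup_scan_context context;

	context = is_global_reclaim ? SCAN_BY_SYSTEM : SCAN_BY_LIMIT;
	/* relies on *_UNDER_HIERARCHY directly following its base entry */
	if (!mem_is_target)
		context++;

	stat->scanned_pages[context] += nr_scanned;
}

The context++ trick only works as long as each *_UNDER_HIERARCHY entry
stays right behind its base entry, so the ordering is worth a comment in
the enum.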
>
> If the system scans memcgs then no limit is at fault. It's just
> external pressure.
>
> For example, what is the distinction between scanned_pages_by_system
> and scanned_pages_by_system_under_hierarchy?
You are right about this; there is not much difference between the two,
since both count global reclaim and every memcg is under_hierarchy
except the root cgroup. For the root cgroup, the scans are counted in
"_system" (internal).
The reason for having scanned_pages_by_system at all would be, per your
definition, that it counts pages scanned neither due to
> the limit (_by_system -> global reclaim) nor not due to the limit
> (!_under_hierarchy -> memcg is the one triggering pressure)
This value, "scanned_pages_by_system", only makes sense for the root
cgroup, where it can be read as "# of pages scanned in the root lru
under global reclaim".
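
Just to compare, a minimal sketch of the two-counter accounting from
your proposal above (root_mem == victim accounts as _limit, otherwise
as _hierarchical); the struct and helper names are made up for
illustration:

struct victim_scan_stat {
	unsigned long scanned_limit;		/* victim's own limit at fault */
	unsigned long scanned_hierarchical;	/* pressure from an ancestor or global */
};

static void account_victim_scan(struct victim_scan_stat *stat,
				const struct mem_cgroup *root_mem,
				const struct mem_cgroup *victim,
				unsigned long nr_scanned)
{
	if (root_mem == victim)
		stat->scanned_limit += nr_scanned;
	else
		stat->scanned_hierarchical += nr_scanned;
}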
--Ying