[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170706164312.4nnjsrpzv5vbtbkm@techsingularity.net>
Date: Thu, 6 Jul 2017 17:43:12 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: Debabrata Banerjee <dbavatar@...il.com>
Cc: Roman Gushchin <guro@...com>, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Rik van Riel <riel@...hat.com>, kernel-team@...com,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: make allocation counters per-order
On Thu, Jul 06, 2017 at 12:12:47PM -0400, Debabrata Banerjee wrote:
> On Thu, Jul 6, 2017 at 11:51 AM, Mel Gorman <mgorman@...hsingularity.net> wrote:
> >
> > These counters do not actually help you solve that particular problem.
> > Knowing how many allocations happened since the system booted doesn't tell
> > you much about how many failed or why they failed. You don't even know
> > what frequency they occured at unless you monitor it constantly so you're
> > back to square one whether this information is available from proc or not.
> > There even is a tracepoint that can be used to track information related
> > to events that degrade fragmentation (trace_mm_page_alloc_extfrag) although
> > the primary thing it tells you is that "the probability that an allocation
> > will fail due to fragmentation in the future is potentially higher".
>
> I agree these counters don't have enough information, but there a
> start to a first order approximation of the current state of memory.
That incurs a universal cost on the off-chance of debugging and ultimately
the debugging is only useful in combination with developing kernel patches
in which case it could be behind a kconfig option.
> buddyinfo and pagetypeinfo basically show no information now, because
They can be used to calculate a fragmentation index at a given point in
time. Admittedly, building a bigger picture requires a full scan of memory
(and that's what was required when fragmentation avoidance was first
being implemented).
> they only involve the small amount of free memory under the watermark
> and all our machines are in this state. As second order approximation,
> it would be nice to be able to get answers like: "There are
> reclaimable high order allocations of at least this order" and "None
> of this order allocation can become available due to unmovable and
> unreclaimable allocations"
Which this patch doesn't provide as what you are looking for requires
a full scan of memory to determine. I've done it in the past using a
severe abuse of systemtap to load a module that scans all of memory with
a variation of PAGE_OWNER to identify stack traces of pages that "don't
belonw" within a pageblock.
Even *with* that information, your options for tuning an unmodified kernel
are basically limited to increasing min_free_kbytes, altering THP's level
of aggression when compacting or brute forcing with either drop_caches,
compact_node or both. All other options after that require kernel patches
-- altering annotations, altering fallback mechanisms, altering compaction,
improving support for pages that can be migrated etc.
--
Mel Gorman
SUSE Labs
Powered by blists - more mailing lists