Message-ID: <alpine.DEB.2.10.1601271507510.1248@chino.kir.corp.google.com>
Date: Wed, 27 Jan 2016 15:13:12 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Joonsoo Kim <iamjoonsoo.kim@....com>
cc: Christoph Lameter <cl@...ux.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
On Thu, 26 Nov 2015, Joonsoo Kim wrote:
> I understand the design decision, but it is better to get as accurate a
> value as possible if there is no performance problem. My patch would not
> cause much performance degradation because it just adds one
> this_cpu_read().
>
> Consider the following example. The current implementation returns
> interesting output if someone does the following:
>
> v1 = zone_page_state(XXX);
> mod_zone_page_state(XXX, 1);
> v2 = zone_page_state(XXX);
>
> v2 would be the same as v1 in most cases even though we have already
> updated it.
>
> This situation can occur in the page allocation path and elsewhere. If
> a task tries to allocate many pages, the watermark check returns the
> same value until vmstat is updated, even though some free pages have
> been allocated. There are some adjustments for this imprecision, but
> why not make it accurate? I think this change is a reasonable trade-off.
>
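
To make the quoted example concrete, here is a minimal userspace sketch of a
counter with a per-cpu delta that is folded into the global value only once it
crosses a threshold. It is not the kernel code: counter_sim, the *_sim
functions and the threshold of 32 are assumptions made for illustration, and
only a single CPU is modelled.

```c
#include <stdint.h>

/* Assumed threshold; the kernel computes its own per-zone value. */
#define STAT_THRESHOLD_SIM 32

/* Hypothetical stand-in for one vmstat item as seen from one CPU. */
struct counter_sim {
    long   global;    /* the value a plain read returns */
    int8_t cpu_diff;  /* this CPU's pending, unfolded delta */
};

/* Accumulate into the local delta; fold into global past the threshold. */
static void mod_state_sim(struct counter_sim *c, int delta)
{
    int x = c->cpu_diff + delta;

    if (x > STAT_THRESHOLD_SIM || x < -STAT_THRESHOLD_SIM) {
        c->global += x;
        x = 0;
    }
    c->cpu_diff = (int8_t)x;
}

/* Read that ignores the pending delta, so it can be stale. */
static long read_state_sim(const struct counter_sim *c)
{
    return c->global;
}

/* Read that also adds this CPU's pending delta, as the patch proposes. */
static long read_state_with_local_diff_sim(const struct counter_sim *c)
{
    return c->global + c->cpu_diff;
}
```

With these stand-ins, a read, a +1 update and a second read return the same
value from read_state_sim(), while read_state_with_local_diff_sim() sees the
update immediately. Deltas pending on other CPUs would still be invisible to
both.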
I'm not sure that NR_ISOLATED_* should be vmstats in the first place. The
most important callers that depend on their accuracy are
zone_reclaimable_pages() and the too_many_isolated() loops in both
shrink_inactive_list() and memory compaction. If zlc's are updated every
1s, the HZ/10 timeout in those loops doesn't really matter; it may as well
be HZ/2.

I think memory compaction updates the counters in the most appropriate
way, by incrementing a local counter and then finally doing a single
mod_zone_page_state() for that counter. The other updaters are THP
collapse and page migration.
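
The batching pattern described above can be sketched in userspace roughly as
follows. This is an illustration under assumptions, not the compaction code:
zone_batch_sim, mod_zone_state_sim() and isolate_batch_sim() are hypothetical
names, and the shared counter is modelled with a C11 atomic.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone's shared counter. */
struct zone_batch_sim {
    atomic_long nr_isolated;
};

/* Stand-in for mod_zone_page_state(): one update of the shared counter. */
static void mod_zone_state_sim(struct zone_batch_sim *z, long delta)
{
    atomic_fetch_add_explicit(&z->nr_isolated, delta, memory_order_relaxed);
}

/*
 * Walk a batch of candidate pages, count the isolated ones in a local
 * variable, and publish the total with a single shared update at the end.
 * isolatable[i] != 0 means page i could be isolated.
 */
static long isolate_batch_sim(struct zone_batch_sim *z,
                              const int *isolatable, size_t n)
{
    long nr_isolated = 0;

    for (size_t i = 0; i < n; i++)
        if (isolatable[i])
            nr_isolated++;

    if (nr_isolated)
        mod_zone_state_sim(z, nr_isolated);

    return nr_isolated;
}
```

The shared counter is touched once per batch instead of once per page, which
is the property that makes this updater cheap.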

I discount user-visible vmstats here because the trade-off has already
been made that they may be stale for up to 1s and userspace isn't
affected.

So what happens if we simply convert NR_ISOLATED_* into per-zone
atomic64_t counters?
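
As a rough userspace sketch of what that conversion would mean for readers,
assuming C11 atomics as a stand-in for atomic64_t (zone_atomic_sim and the
*_sim helpers are hypothetical names, not kernel API):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-zone counters kept as shared 64-bit atomics. */
struct zone_atomic_sim {
    _Atomic int64_t nr_isolated_anon;
    _Atomic int64_t nr_isolated_file;
};

/* Every update goes straight to the shared counter; no per-cpu delta. */
static void isolated_add_sim(_Atomic int64_t *ctr, int64_t delta)
{
    atomic_fetch_add(ctr, delta);
}

/* Every read sees all completed updates, so no value is ever stale. */
static int64_t isolated_read_sim(_Atomic int64_t *ctr)
{
    return atomic_load(ctr);
}
```

A read after an update always reflects that update, at the price of every
updater writing to the same shared cache line. Whether that price is
acceptable depends on how well the updaters batch.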