[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200720202021.GE139672@carbon.dhcp.thefacebook.com>
Date: Mon, 20 Jul 2020 13:20:21 -0700
From: Roman Gushchin <guro@...com>
To: Michal Hocko <mhocko@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>, <linux-mm@...ck.org>,
<kernel-team@...com>, <linux-kernel@...r.kernel.org>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating
false warnings
On Mon, Jul 20, 2020 at 10:03:49AM +0200, Michal Hocko wrote:
> On Tue 14-07-20 10:39:20, Roman Gushchin wrote:
> > I've noticed a number of warnings like "vmstat_refresh: nr_free_cma
> > -5" or "vmstat_refresh: nr_zone_write_pending -11" on our production
> > hosts. The numbers of these warnings were relatively low and stable,
> > so it didn't look like we are systematically leaking the counters.
> > The corresponding vmstat counters also looked sane.
> >
> > These warnings are generated by the vmstat_refresh() function, which
> > assumes that atomic zone and numa counters can't go below zero.
> > However, on a SMP machine it's not quite right: due to per-cpu
> > caching it can in theory be as low as -(zone threshold) * NR_CPUs.
> >
> > For instance, let's say all cma pages are in use and NR_FREE_CMA_PAGES
> > reached 0. Then we've reclaimed a small number of cma pages on each
> > CPU except CPU0, so that most percpu NR_FREE_CMA_PAGES counters are
> > slightly positive (the atomic counter is still 0). Then somebody on
> > CPU0 consumes all these pages. The number of pages can easily exceed
> > the threshold and a negative value will be committed to the atomic
> > counter.
> >
> > To fix the problem and avoid generating false warnings, let's just
> > relax the condition and warn only if the value is less than minus
> > the maximum theoretically possible drift value, which is 125 *
> > number of online CPUs. It will still allow to catch systematic leaks,
> > but will not generate bogus warnings.
> >
> > Signed-off-by: Roman Gushchin <guro@...com>
> > Cc: Hugh Dickins <hughd@...gle.com>
>
> Acked-by: Michal Hocko <mhocko@...e.com>
>
> One minor nit which can be handled by a separate patch but now that you
> are touching this code...
Thank you!
>
> > ---
> > Documentation/admin-guide/sysctl/vm.rst | 4 ++--
> > mm/vmstat.c | 30 ++++++++++++++++---------
> > 2 files changed, 21 insertions(+), 13 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index 4b9d2e8e9142..95fb80d0c606 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -822,8 +822,8 @@ e.g. cat /proc/sys/vm/stat_refresh /proc/meminfo
> >
> > As a side-effect, it also checks for negative totals (elsewhere reported
> > as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
> > -(At time of writing, a few stats are known sometimes to be found negative,
> > -with no ill effects: errors and warnings on these stats are suppressed.)
> > +(On a SMP machine some stats can temporarily become negative, with no ill
> > +effects: errors and warnings on these stats are suppressed.)
> >
> >
> > numa_stat
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index a21140373edb..8f0ef8aaf8ee 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -169,6 +169,8 @@ EXPORT_SYMBOL(vm_node_stat);
> >
> > #ifdef CONFIG_SMP
> >
> > +#define MAX_THRESHOLD 125
>
> This would deserve a comment. 88f5acf88ae6a didn't really explain why
> this specific value has been selected but the specific value shouldn't
> really matter much. I would go with the following at least.
> "
> Maximum sync threshold for per-cpu vmstat counters.
> "
Agree. Below is the diff to be squashed into the original patch.
Thanks!
--
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 08e415e0a15d..ddc59b533599 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -167,6 +167,9 @@ EXPORT_SYMBOL(vm_zone_stat);
EXPORT_SYMBOL(vm_numa_stat);
EXPORT_SYMBOL(vm_node_stat);
+/*
+ * Maximum sync threshold for per-cpu vmstat counters.
+ */
#ifdef CONFIG_SMP
#define MAX_THRESHOLD 125
#else
Powered by blists - more mailing lists