[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55FC24C2.8020501@intel.com>
Date: Fri, 18 Sep 2015 07:50:42 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Greg Thelen <gthelen@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>, Tejun Heo <tj@...nel.org>,
Jens Axboe <axboe@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>,
"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
<cgroups@...r.kernel.org>,
"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
<linux-mm@...ck.org>, open list <linux-kernel@...r.kernel.org>
Subject: Re: 4.3-rc1 dirty page count underflow (cgroup-related?)
On 09/17/2015 11:09 PM, Greg Thelen wrote:
> I'm not denying the issue, bug the WARNING splat isn't necessarily
> catching a problem. The corresponding code comes from your debug patch:
> + WARN_ONCE(__this_cpu_read(memcg->stat->count[MEM_CGROUP_STAT_DIRTY]) > (1UL<<30), "MEM_CGROUP_STAT_DIRTY bogus");
>
> This only checks a single cpu's counter, which can be negative. The sum
> of all counters is what matters.
> Imagine:
> cpu1) dirty page: inc
> cpu2) clean page: dec
> The sum is properly zero, but cpu2 is -1, which will trigger the WARN.
>
> I'll look at the code and also see if I can reproduce the failure using
> mem_cgroup_read_stat() for all of the new WARNs.
D'oh. I'll replace those with the proper mem_cgroup_read_stat() and
test with your patch to see if anything still triggers.
> Did you notice if the global /proc/meminfo:Dirty count also underflowed?
It did not underflow. It was one of the first things I looked at and it
looked fine, went down near 0 at 'sync', etc...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists