Message-ID: <20200706023614.GA1231@lca.pw>
Date: Sun, 5 Jul 2020 22:36:14 -0400
From: Qian Cai <cai@....pw>
To: Feng Tang <feng.tang@...el.com>
Cc: kernel test robot <rong.a.chen@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Kees Cook <keescook@...omium.org>,
Luis Chamberlain <mcgrof@...nel.org>,
Iurii Zaikin <yzaikin@...gle.com>, andi.kleen@...el.com,
tim.c.chen@...el.com, dave.hansen@...el.com, ying.huang@...el.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, lkp@...ts.01.org
Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail
On Mon, Jul 06, 2020 at 09:43:13AM +0800, Feng Tang wrote:
> On Sun, Jul 05, 2020 at 11:52:32AM -0400, Qian Cai wrote:
> > On Sun, Jul 05, 2020 at 08:58:54PM +0800, Feng Tang wrote:
> > > On Sun, Jul 05, 2020 at 08:15:03AM -0400, Qian Cai wrote:
> > > >
> > > >
> > > > > On Jul 5, 2020, at 12:45 AM, Feng Tang <feng.tang@...el.com> wrote:
> > > > >
> > > > > I did reproduce the problem, and from the debugging, this should
> > > > > be the same root cause as lore.kernel.org/lkml/20200526181459.GD991@....pw/
> > > > > that loosening the batch causes some accuracy problems, and the solution of
> > > > > adding some sync is still needed, which is discussed in
> > > >
> > > > Well, before taking any of those patches now to fix the regression,
> > > > we will need some performance data first. If it turns out the
> > > > original performance gain is no longer relevant due to this
> > > > regression fix on top, it is best to drop this patchset and restore
> > > > that VM_WARN_ONCE, so you can retry later once you have found a better
> > > > way to optimize.
> > >
> > > The fix of adding sync only happens when the memory policy is being
> > > changed to OVERCOMMIT_NEVER, which is not a frequent operation in
> > > normal cases.
> > >
> > > The performance improvement data in both the commit log and the 0day report
> > > https://lore.kernel.org/lkml/20200622132548.GS5535@shao2-debian/
> > > is for will-it-scale's mmap testcase, which does not change the
> > > memory overcommit policy at runtime, so the data should still be valid
> > > with this fix.
> >
> > Well, I would expect it is perfectly reasonable for people to use
> > OVERCOMMIT_NEVER for some workloads, making it a more frequent operation.
>
> In my last email, I was not saying OVERCOMMIT_NEVER is not a normal case,
> but I don't think users will change the overcommit policy at runtime
> very frequently. And the fix patch of syncing 'vm_committed_as' is only
> called when the user runs 'sysctl -w vm.overcommit_memory=2'.
>
> > The question now is whether any of those regression fixes would regress
> > the performance of OVERCOMMIT_NEVER workloads, or be just on par with the
> > data before the patchset?
>
> The original patchset keeps the vm_committed_as batch unchanged for the
> OVERCOMMIT_NEVER policy and enlarges it for the other 2 loose policies,
> OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS, so I don't expect the performance
> of "OVERCOMMIT_NEVER workloads" to be impacted. If you have suggestions for
> this kind of benchmark, I can test them to better verify the patchset, thanks!
Then, please capture that information in a proper commit log when you
submit the regression fix on top of the patchset, and CC the PER-CPU MEMORY
ALLOCATOR maintainers, so they might be able to review it properly.
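
For readers following the thread, here is a toy Python model of the trade-off
being argued about: a larger per-CPU batch means fewer updates to the shared
counter, but a cheap read can lag the true total until the per-CPU remainders
are folded back in. This is only a sketch under stated assumptions. It is not
the kernel's percpu_counter code and not the proposed fix; the names
BatchedCounter, sync and worst_case_drift are hypothetical and exist only for
illustration.

```python
class BatchedCounter:
    """Toy model of a per-CPU counter with a fold threshold (the batch)."""

    def __init__(self, nr_cpus, batch):
        self.count = 0               # shared total, cheap to read, may lag
        self.local = [0] * nr_cpus   # per-CPU deltas not yet folded in
        self.batch = batch

    def add(self, cpu, amount):
        # Accumulate locally; touch the shared total only once the local
        # delta reaches the batch threshold.
        self.local[cpu] += amount
        if abs(self.local[cpu]) >= self.batch:
            self.count += self.local[cpu]
            self.local[cpu] = 0

    def read(self):
        # Fast, approximate read: ignores unfolded per-CPU deltas.
        return self.count

    def sum(self):
        # Slow, precise read: shared total plus every per-CPU delta.
        return self.count + sum(self.local)

    def sync(self):
        # Fold all per-CPU deltas into the shared total (hypothetical
        # stand-in for the "adding some sync" idea from the thread).
        self.count += sum(self.local)
        self.local = [0] * len(self.local)


def worst_case_drift(nr_cpus, batch):
    """Gap between precise and approximate reads when every CPU sits
    just below the fold threshold."""
    counter = BatchedCounter(nr_cpus, batch)
    for cpu in range(nr_cpus):
        counter.add(cpu, batch - 1)
    return counter.sum() - counter.read()
```

In this model the gap is bounded by nr_cpus * (batch - 1), so 4 CPUs with a
batch of 32 can hide up to 4 * 31 = 124 units from the approximate read, and
the bound grows in proportion to the batch. Calling sync() once, at the moment
a strict policy is selected, brings read() back in line with sum() without
adding cost to every add().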
>
> - Feng
>
> >
> > Given that this patchset has had so much churn recently, I would think
> > "should be still valid" is not really the answer we are looking for.
> >
> > >
> > > Thanks,
> > > Feng
> > >
> > >