[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOH5QeB==iASFSeHWZXT0ESBCN5W6mJ8cugJih1uKpqJZbtgqQ@mail.gmail.com>
Date: Fri, 17 Sep 2021 23:38:31 +0800
From: yong w <yongw.pur@...il.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Tejun Heo <tj@...nel.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
wang.yong12@....com.cn, Cgroups <cgroups@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>, yang.yang29@....com.cn
Subject: Re: [PATCH v2] vmpressure: wake up work only when there is
registration event
Michal Hocko <mhocko@...e.com> 于2021年9月15日周三 下午8:42写道:
>
> On Tue 14-09-21 09:05:51, yongw.pur@...il.com wrote:
> > From: wangyong <wang.yong12@....com.cn>
> >
> > Use the global variable num_events to record the number of vmpressure
> > events registered by the system, and wake up work only when there is
> > registration event.
> > Usually, the vmpressure event is not registered in the system, this patch
> > can avoid waking up work and doing nothing.
>
> I have asked in the previous version and this changelog doesn't that
> explain again. Why don't you simply bail out early in vmpressure()
> entry?
Thanks for your reply.
Because the else branch will modify the socket_pressure, and will not wake up
the work. It is necessary to judge the tree parameters at the same
time, like this
if (tree && !static_branch_unlikely(&num_events))
return;
It's not good to judge the tree twice parameters in the function.
>
> > Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
> > Consume cgroup memory until it is about to be reclaimed, then execute
> > "perf stat -I 2000 malloc.out" command to trigger memory reclamation
> > and get performance results.
> > The context-switches is reduced by about 20 times.
>
> Is this test somewhere available so that it can be reproduced by
> others. Also while the number of context switches can be an interesting
> it is not really clear from this evaluation whether that actually
> matters or not. E.g. what does an increase of task-clock and twice as
> many instructions recorded tell us?
>
The test program is a simple malloc process, which allocate memory
and fill some data.
I think it may be that more instructions can be executed per unit time.
> > unpatched:
> > Average of 10 test results
> > 582.4674048 task-clock(msec)
> > 19910.8 context-switches
> > 0 cpu-migrations
> > 1292.9 page-faults
> > 414784733.1 cycles
>
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
>
> Why is this a part of the data?
>
> > 580070698.4 instructions
> > 125572244.7 branches
> > 2073541.2 branch-misses
> >
> > patched
> > Average of 10 test results
> > 973.6174796 task-clock(msec)
> > 988.6 context-switches
> > 0 cpu-migrations
> > 1785.2 page-faults
> > 772883602.4 cycles
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
> > 1360280911 instructions
> > 290519434.9 branches
> > 3378378.2 branch-misses
> >
> > Tested-by: Zeal Robot <zealci@....com.cn>
> > Signed-off-by: wangyong <wang.yong12@....com.cn>
> > ---
> >
> [...]
> > @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> > return;
> >
> > if (tree) {
> > + if (!static_branch_unlikely(&num_events))
> > + return;
>
> We usually hide the change behind a static inline helper (e.g.
> vmpressure_disabled()). I would also put it to the beginning of
> vmpressure or put an explanation why it makes sense only in this branch.
> --
Because only this branch needs to wake up work.
Yes, static inline helper is more easier to read and understand.
Powered by blists - more mailing lists