[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BANLkTinwmtgh+p=aeZux3NuC2ftbR5OMgQ@mail.gmail.com>
Date: Sat, 21 May 2011 09:41:50 +0900
From: Hiroyuki Kamezawa <kamezawa.hiroyuki@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"nishimura@....nes.nec.co.jp" <nishimura@....nes.nec.co.jp>,
"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
Ying Han <yinghan@...gle.com>, hannes@...xchg.org,
Michal Hocko <mhocko@...e.cz>
Subject: Re: [PATCH 8/8] memcg asyncrhouns reclaim workqueue
2011/5/21 Andrew Morton <akpm@...ux-foundation.org>:
> On Fri, 20 May 2011 12:48:37 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
>
>> workqueue for memory cgroup asynchronous memory shrinker.
>>
>> This patch implements the workqueue of async shrinker routine. each
>> memcg has a work and only one work can be scheduled at the same time.
>>
>> If shrinking memory doesn't goes well, delay will be added to the work.
>>
>
> When this code explodes (as it surely will), users will see large
> amounts of CPU consumption in the work queue thread. We want to make
> this as easy to debug as possible, so we should try to make the
> workqueue's names mappable back onto their memcg's. And anything else
> we can think of to help?
>
I had a patch for showing per-memcg reclaim latency stats. It will be help.
I'll add it again to this set. I just dropped it because there are many patches
onto memory.stat in flight..
>>
>> ...
>>
>> +static void mem_cgroup_async_shrink(struct work_struct *work)
>> +{
>> + struct delayed_work *dw = to_delayed_work(work);
>> + struct mem_cgroup *mem = container_of(dw,
>> + struct mem_cgroup, async_work);
>> + bool congested = false;
>> + int delay = 0;
>> + unsigned long long required, usage, limit, shrink_to;
>
> There's a convention which is favored by some (and ignored by the
> clueless ;)) which says "one definition per line".
>
> The reason I like one-definition-per-line is that it leaves a little
> room on the right where the programmer can explain the role of the
> local.
>
> Another advantage is that one can initialise it. eg:
>
> unsigned long limit = res_counter_read_u64(&mem->res, RES_LIMIT);
>
> That conveys useful information: the reader can see what it's
> initialised with and can infer its use.
>
> A third advantage is that it can now be made const, which conveys very
> useful informtation and can prevent bugs.
>
> A fourth advantage is that it makes later patches to this function more
> readable and easier to apply when there are conflicts.
>
ok, I will fix.
>
>> + limit = res_counter_read_u64(&mem->res, RES_LIMIT);
>> + shrink_to = limit - MEMCG_ASYNC_MARGIN - PAGE_SIZE;
>> + usage = res_counter_read_u64(&mem->res, RES_USAGE);
>> + if (shrink_to <= usage) {
>> + required = usage - shrink_to;
>> + required = (required >> PAGE_SHIFT) + 1;
>> + /*
>> + * This scans some number of pages and returns that memory
>> + * reclaim was slow or now. If slow, we add a delay as
>> + * congestion_wait() in vmscan.c
>> + */
>> + congested = mem_cgroup_shrink_static_scan(mem, (long)required);
>> + }
>> + if (test_bit(ASYNC_NORESCHED, &mem->async_flags)
>> + || mem_cgroup_async_should_stop(mem))
>> + goto finish_scan;
>> + /* If memory reclaim couldn't go well, add delay */
>> + if (congested)
>> + delay = HZ/10;
>
> Another magic number.
>
> If Moore's law holds, we need to reduce this number by 1.4 each year.
> Is this good?
>
not good. I just used the same magic number now used with wait_iff_congested.
Other than timer, I can use pagein/pageout event counter. If we have
dirty_ratio,
I may able to link this to dirty_ratio and wait until dirty_ratio is enough low.
Or, wake up again hit limit.
Do you have suggestion ?
>> + queue_delayed_work(memcg_async_shrinker, &mem->async_work, delay);
>> + return;
>> +finish_scan:
>> + cgroup_release_and_wakeup_rmdir(&mem->css);
>> + clear_bit(ASYNC_RUNNING, &mem->async_flags);
>> + return;
>> +}
>> +
>> +static void run_mem_cgroup_async_shrinker(struct mem_cgroup *mem)
>> +{
>> + if (test_bit(ASYNC_NORESCHED, &mem->async_flags))
>> + return;
>
> I can't work out what ASYNC_NORESCHED does. Is its name well-chosen?
>
how about BLOCK/STOP_ASYNC_RECLAIM ?
>> + if (test_and_set_bit(ASYNC_RUNNING, &mem->async_flags))
>> + return;
>> + cgroup_exclude_rmdir(&mem->css);
>> + /*
>> + * start reclaim with small delay. This delay will allow us to do job
>> + * in batch.
>
> Explain more?
>
yes, or I'll change this logic. I wanted to do low/high watermark
without "low" watermark...
>> + */
>> + if (!queue_delayed_work(memcg_async_shrinker, &mem->async_work, 1)) {
>> + cgroup_release_and_wakeup_rmdir(&mem->css);
>> + clear_bit(ASYNC_RUNNING, &mem->async_flags);
>> + }
>> + return;
>> +}
>> +
>>
>> ...
>>
>
Thank you for review. I realized I need some amount of works. I'll add texts to
explain behavior and make codes simpler.
Thanks,
-Kame
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists