[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20131028151746.GA7548@weiyang.vnet.ibm.com>
Date: Mon, 28 Oct 2013 23:17:46 +0800
From: Wei Yang <weiyang@...ux.vnet.ibm.com>
To: Tejun Heo <tj@...nel.org>
Cc: Wei Yang <weiyang@...ux.vnet.ibm.com>, cl@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] percpu: stop the loop when a cpu belongs to a new
group
On Mon, Oct 28, 2013 at 07:31:20AM -0400, Tejun Heo wrote:
>Hello,
>
>On Mon, Oct 28, 2013 at 11:00:55AM +0800, Wei Yang wrote:
>> >Does this actually matter? If so, it'd probably make a lot more sense
>> >to start inner loop at @cpu + 1 so that it becomes O(N).
>>
>> One of the worst case in my mind:
>>
>> CPU: 0 1 2 3 4 ...
>> Group: 0 1 2 3 4 ...
>> (sounds it is impossible in the real world)
>
>I was wondering whether you had an actual case where this actually
>matters or it's just something you thought of while reading the code.
Tejun,
Thanks for your comments.
I found this just in code review. :-)
>
>> Every time, when we encounter a new CPU and try to assign it to a group, we
>> found it belongs to a new group. The original logic will iterate on all old
>> CPUs again, while the new logic could skip this and assign it to a new group.
>>
>> Again, this is a tiny change, which doesn't matters a lot.
>
>I think it *could* matter because the current implementation is O(N^2)
>where N is the number of CPUs. On machines, say, with 4k CPU, it's
>gonna loop 16M times but then again even that takes only a few
>millisecs on modern machines.
I am not familiar with the real cases of the CPU numbers. Thanks for leting me
know there could be 4K CPUs.
Yep, a few millisecs sounds not a big a mount.
>
>> BTW, I don't get your point for "start inner loop at @cpu+1".
>>
>> The original logic is:
>> loop 1: 0 - nr_cpus
>> loop 2: 0 - (cpu - 1)
>>
>> If you found one better approach to improve the logic, I believe all the users
>> will appreciate your efforts :-)
>
>Ooh, right, I forgot about the break and then I thought somehow that
>would make it O(N). Sorry about that. I blame jetlag. :)
>
>Yeah, I don't know. The function is quite hairy which makes me keep
>things simpler and reluctant to make changes unless it actually makes
>non-trivial difference. The change looks okay to me but it seems
>neither necessary or substantially beneficial and if my experience is
>anything to go by, *any* change involves some risk of brekage no
>matter how innocent it may look, so given the circumstances, I'd like
>to keep things the way they are.
Yep, I really agree with you. If no big improvement, it is really not
necessary to change the code, which will face some risk.
Here I have another one, which in my mind will improve it in one case. Looking
forward to your comments :-) If I am not correct, please let me know. :-)
>From bd70498b9df47b25ff20054e24bb510c5430c0c3 Mon Sep 17 00:00:00 2001
From: Wei Yang <weiyang@...ux.vnet.ibm.com>
Date: Thu, 10 Oct 2013 09:42:14 +0800
Subject: [PATCH] percpu: optimize group assignment when cpu_distance_fn is
NULL
When cpu_distance_fn is NULL, all CPUs belongs to group 0. The original logic
will continue to go through each CPU and its predecessor. cpu_distance_fn is
always NULL when pcpu_build_alloc_info() is called from pcpu_page_first_chunk().
By applying this patch, the time complexity will drop to O(n) form O(n^2) in
case cpu_distance_fn is NULL.
Signed-off-by: Wei Yang <weiyang@...ux.vnet.ibm.com>
---
mm/percpu.c | 23 ++++++++++++-----------
1 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/mm/percpu.c b/mm/percpu.c
index f79c807..8e6034f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1481,20 +1481,21 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
for_each_possible_cpu(cpu) {
group = 0;
next_group:
- for_each_possible_cpu(tcpu) {
- if (cpu == tcpu)
- break;
- if (group_map[tcpu] == group && cpu_distance_fn &&
- (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
- cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
- group++;
- if (group == nr_groups) {
- nr_groups++;
+ if (cpu_distance_fn)
+ for_each_possible_cpu(tcpu) {
+ if (cpu == tcpu)
break;
+ if (group_map[tcpu] == group &&
+ (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
+ cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
+ group++;
+ if (group == nr_groups) {
+ nr_groups++;
+ break;
+ }
+ goto next_group;
}
- goto next_group;
}
- }
group_map[cpu] = group;
group_cnt[group]++;
}
--
1.7.5.4
BTW, this one is based on my previous patch.
>
>Thanks a lot!
>
>--
>tejun
--
Richard Yang
Help you, Help me
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists