linux-kernel - Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5126FC92.9010606@linux.vnet.ibm.com>
Date:	Fri, 22 Feb 2013 13:05:22 +0800
From:	Michael Wang <wangyun@...ux.vnet.ibm.com>
To:	Alex Shi <alex.shi@...el.com>
CC:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	LKML <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>, Paul Turner <pjt@...gle.com>,
	Mike Galbraith <efault@....de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ram Pai <linuxram@...ibm.com>,
	"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
	Namhyung Kim <namhyung@...nel.org>
Subject: Re: [RFC PATCH v3 1/3] sched: schedule balance map foundation

On 02/22/2013 12:46 PM, Alex Shi wrote:
> On 02/22/2013 12:19 PM, Michael Wang wrote:
>>
>>>> Why not seek other way to change O(n^2) to O(n)?
>>>>
>>>> Access 2G memory is unbelievable performance cost.
>> Not access 2G memory, but (2G / 16K) memory, the sbm size is O(N).
>>
>> And please notice that on 16k cpus system, topology will be deep if NUMA
>> enabled (O(log N) as Peter said), and that's really a good stage for
>> this idea to perform on, we could save lot's of recursed 'for' cycles.
>>
> 
> CPU execute part is very very fast compare to the memory access, the
> 'for' cycles cost is most on the memory access for many domain/groups
> data, not instruction execution.
> 
> In a hot patch, several KB memory access will cause clear cpu cache
> pollution then make kernel slowly.

Hmm...that's a good catch.

Comparison between memory access and cpu execution, no doubt the latter
will win, you are right.

But that was same in the old world when access the struct sched_domain,
isn't it?

                for_each_domain(cpu, tmp) {
                        if (weight <= tmp->span_weight)
                                break;
                        if (tmp->flags & sd_flag)
                                sd = tmp;
                }

Both old and new may access data across nodes, but the old one will
access several times more, isn't it?

Regards,
Michael Wang

> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/