Message-ID: <d63d3449-514b-151e-571b-01c8292bff26@oracle.com>
Date:   Thu, 8 Jun 2017 15:06:39 -0700
From:   subhra mazumdar <subhra.mazumdar@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org
Subject: Re: [RFC PATCH] sched: select_idle_core should select least utilized
 core



On 06/08/2017 12:59 PM, Peter Zijlstra wrote:
> On Thu, Jun 08, 2017 at 03:26:32PM -0400, Subhra Mazumdar wrote:
>> Current select_idle_core tries to find a fully idle core and if it fails
>> select_idle_cpu next returns any idle cpu in the llc domain. This is not optimal
>> for architectures with many (more than 2) hyperthreads in a core. This patch
>> changes select_idle_core to find the core with least number of busy
>> hyperthreads and return an idle cpu in that core.
> Yeah, I think not. That makes select_idle_siblings _vastly_ more
> expensive.
I am not sure the cost will increase vastly. Firstly, I removed
select_idle_cpu for archs that have SMT; for them select_idle_core
(called from select_idle_sibling) should return the final cpu. For archs
w/o SMT there is no select_idle_core and select_idle_cpu will return it.
If there are 8 hyperthreads per core (as on some existing archs) it is
worth paying some extra cost to find the most idle core, since threads
can run for longer than the cost paid to search for it. Also, in the
case where almost all cpus are busy, the current select_idle_cpu will
pay almost the same cost as the new select_idle_core (both will iterate
almost all cpus). Only for 2 threads/core do I see the cost increasing
somewhat when the system is partially utilized, and in that case
iterating all cores will not find anything better anyway. Do you suggest
keeping the old way for 2 threads/core and finding the least utilized
core for archs with more hyperthreads? I ran hackbench at a few load
points on an x86 socket with 18 cores and didn't see any statistically
significant change in performance or sys/usr %.

Thanks,
Subhra
