Date:	Wed, 26 Mar 2014 14:34:22 +0100
From:	Alexander Gordeev <agordeev@...hat.com>
To:	linux-kernel@...r.kernel.org
Cc:	Alexander Gordeev <agordeev@...hat.com>,
	Kent Overstreet <kmo@...erainc.com>,
	Jens Axboe <axboe@...nel.dk>, Shaohua Li <shli@...nel.org>,
	Nicholas Bellinger <nab@...ux-iscsi.org>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: [PATCH RFC 0/2] percpu_ida: Take into account CPU topology when stealing tags

Hello,

This series is against 3.14.0-rc7.

It is aimed at further improving the locality of 'percpu_ida' tags by
taking the system's CPU topology into account when stealing tags. That
is, try to steal from the CPU that is 'closest' to the stealing one.
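
To illustrate the idea only (this is not the code from the patches), here
is a minimal userspace sketch. It assumes each CPU has a list of
topology-level masks ordered from narrowest to widest, similar to the
sched-domain spans shown further down. The names cpumask32_t, first_cpu()
and pick_victim() are hypothetical, and taking the lowest-numbered
candidate is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 32 CPUs. */
typedef uint32_t cpumask32_t;

/* Return the lowest-numbered CPU set in mask, or -1 if mask is empty. */
static int first_cpu(cpumask32_t mask)
{
	for (int cpu = 0; cpu < 32; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Pick a CPU to steal tags from.
 *
 * levels[]       topology masks of CPU 'self', narrowest level first
 * cpus_have_tags CPUs that currently hold free tags
 *
 * Each level is searched in turn, skipping 'self' and every CPU already
 * covered by a narrower level, so a closer CPU always wins.
 * Returns -1 if no other CPU has tags.
 */
static int pick_victim(int self, const cpumask32_t levels[], int nr_levels,
		       cpumask32_t cpus_have_tags)
{
	assert(self >= 0 && self < 32);

	cpumask32_t seen = UINT32_C(1) << self;

	for (int i = 0; i < nr_levels; i++) {
		cpumask32_t candidates = levels[i] & cpus_have_tags & ~seen;

		if (candidates)
			return first_cpu(candidates);
		seen |= levels[i];
	}
	return -1;
}
```

With the masks { 0x3F, 0xFF } for CPU5 (MC span 0-5, NUMA span 0-7, as
in the second system below), a tag holder inside 0-5 is preferred and
CPUs 6-7 are only considered when nobody in 0-5 has tags.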

I would not have bothered posting this, since on several systems the
change did not show any improvement, e.g. on this one:

CPU0 attaching sched-domain:
 domain 0: span 0,8 level SIBLING
  groups: 0 (cpu_power = 588) 8 (cpu_power = 588)
  domain 1: span 0-3,8-11 level MC
   groups: 0,8 (cpu_power = 1176) 1,9 (cpu_power = 1176) 2,10 (cpu_power = 1176) 3,11 (cpu_power = 1176)
   domain 2: span 0-15 level NUMA
    groups: 0-3,8-11 (cpu_power = 4704) 4-7,12-15 (cpu_power = 4704)


But other systems (more dense?) showed a cache-hit rate increase of
up to 20%, e.g. this one:

CPU5 attaching sched-domain:
 domain 0: span 0-5 level MC
  groups: 5 (cpu_power = 1023) 0 (cpu_power = 1023) 1 (cpu_power = 1023) 2 (cpu_power = 1023) 3 (cpu_power = 1023) 4 (cpu_power = 1023)
  domain 1: span 0-7 level NUMA
   groups: 0-5 (cpu_power = 6138) 6-7 (cpu_power = 2046)
CPU6 attaching sched-domain:
 domain 0: span 6-7 level MC
  groups: 6 (cpu_power = 1023) 7 (cpu_power = 1023)
  domain 1: span 0-7 level NUMA
   groups: 6-7 (cpu_power = 2046) 0-5 (cpu_power = 6138)


I tested using a 'null_blk' device with the number of threads equal
to the number of CPUs, both with each thread affined to one CPU and
with threads not affined, with no difference between the two.

Suggestions are welcome :)

Thanks!

Cc: Kent Overstreet <kmo@...erainc.com>
Cc: Jens Axboe <axboe@...nel.dk>
Cc: Shaohua Li <shli@...nel.org>
Cc: Nicholas Bellinger <nab@...ux-iscsi.org>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>

Alexander Gordeev (2):
  sched: Introduce topology level masks and for_each_tlm() macro
  percpu_ida: Use for_each_tlm() macro for CPU lookup in steal_tags()

 include/linux/percpu_ida.h |    1 -
 include/linux/sched.h      |    5 ++
 kernel/sched/core.c        |   89 ++++++++++++++++++++++++++++++++++++++++++++
 lib/percpu_ida.c           |   46 +++++++++-------------
 4 files changed, 113 insertions(+), 28 deletions(-)

-- 
1.7.7.6

