Message-ID: <0ac6a3ec-84af-1868-936c-1ccc0d401af8@oracle.com>
Date:   Mon, 5 Nov 2018 15:08:16 -0500
From:   Steven Sistare <steven.sistare@...cle.com>
To:     Subhra Mazumdar <subhra.mazumdar@...cle.com>, mingo@...hat.com,
        peterz@...radead.org
Cc:     dhaval.giani@...cle.com, daniel.m.jordan@...cle.com,
        pavel.tatashin@...rosoft.com, matt@...eblueprint.co.uk,
        umgwanakikbuti@...il.com, riel@...hat.com, jbacik@...com,
        juri.lelli@...hat.com, linux-kernel@...r.kernel.org,
        valentin.schneider@....com, vincent.guittot@...aro.org,
        quentin.perret@....com
Subject: Re: [PATCH 00/10] steal tasks to improve CPU utilization

On 11/2/2018 7:39 PM, Subhra Mazumdar wrote:
> On 10/22/18 7:59 AM, Steve Sistare wrote:
>> When a CPU has no more CFS tasks to run, and idle_balance() fails to
>> find a task, then attempt to steal a task from an overloaded CPU in the
>> same LLC. Maintain and use a bitmap of overloaded CPUs to efficiently
>> identify candidates.  To minimize search time, steal the first migratable
>> task that is found when the bitmap is traversed.  For fairness, search
>> for migratable tasks on an overloaded CPU in order of next to run.
>>
>> This simple stealing yields a higher CPU utilization than idle_balance()
>> alone, because the search is cheap, so it may be called every time the CPU
>> is about to go idle.  idle_balance() does more work because it searches
>> widely for the busiest queue, so to limit its CPU consumption, it declines
>> to search if the system is too busy.  Simple stealing does not offload the
>> globally busiest queue, but it is much better than running nothing at all.
>>
>> The bitmap of overloaded CPUs is a new type of sparse bitmap, designed to
>> reduce cache contention vs the usual bitmap when many threads concurrently
>> set, clear, and visit elements.
>>
> Is the bitmask saving much? I tried a simple stealing that just starts
> searching the domain from the current cpu and steals a thread from the
> first cpu that has more than one runnable thread. It seems to perform
> similar to your patch.
> 
> hackbench on X6-2: 2 sockets * 22 cores * 2 hyperthreads = 88 CPUs
>                 baseline  %stdev  patch             %stdev
> 1(40 tasks)     0.5524    2.36    0.5522 (0.045%)   3.82
> 2(80 tasks)     0.6482    11.4    0.7241 (-11.7%)   20.34
> 4(160 tasks)    0.9756    0.95    0.8276 (15.1%)    5.8
> 8(320 tasks)    1.7699    1.62    1.6655 (5.9%)     1.57
> 16(640 tasks)   3.1018    0.77    2.9858 (3.74%)    1.4
> 32(1280 tasks)  5.565     0.62    5.3388 (4.1%)     0.72
> 
> X6-2: 2 sockets * 22 cores * 2 hyperthreads = 88 CPUs
> Oracle database OLTP, logging _enabled_
> 
> Users  %speedup
> 20     1.2
> 40     -0.41
> 60     0.83
> 80     2.37
> 100    1.54
> 120    3.0
> 140    2.24
> 160    1.82
> 180    1.94
> 200    2.23
> 220    1.49

Hi Subhra,
  The bitset is a few percent faster than iterating over CPUs in the
tests I ran on the X6-2 with 44 CPUs per node.  If we extend stealing to 
RT, folks care even more about a few percent. The difference should be 
greater on systems with more CPUs per socket, and greater if we extend 
stealing to steal across NUMA nodes, and greater if Valentin adds another
bitset for misfits. Lastly, there is no measurable downside in maintaining 
the overloaded CPUs bitset.  I ran experiments where I set and cleared the 
bits in overload_set and overload_clear, but disabled stealing itself, and 
saw no significant difference versus the baseline.

- Steve
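
For readers following the thread, the two search strategies being compared can be sketched in user-space C. This is only an illustration under stated assumptions, not code from the patch series: NR_CPUS, nr_running[], overloaded_mask, set_nr_running(), find_victim_scan() and find_victim_bitmap() are all hypothetical names, the mask is a plain single 64-bit word rather than the sparse bitmap the series introduces, and nothing here is safe for concurrent use. The "scan" variant mirrors the simple approach Subhra describes (walk CPUs starting from the current one and take the first with more than one runnable task); the "bitmap" variant keeps a bit per overloaded CPU so the search becomes a rotate plus a find-first-set.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64  /* hypothetical LLC size, chosen to fit one 64-bit word */

/* Hypothetical per-CPU runnable-task counts; not the kernel's struct rq. */
static unsigned int nr_running[NR_CPUS];

/* One bit per CPU, set while that CPU has more than one runnable task. */
static uint64_t overloaded_mask;

/* Update a CPU's task count and keep the overloaded bit in sync. */
static void set_nr_running(int cpu, unsigned int n)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    nr_running[cpu] = n;
    if (n > 1)
        overloaded_mask |= UINT64_C(1) << cpu;
    else
        overloaded_mask &= ~(UINT64_C(1) << cpu);
}

/* Linear scan: visit the other CPUs in wrapping order after this_cpu. */
static int find_victim_scan(int this_cpu)
{
    for (int i = 1; i < NR_CPUS; i++) {
        int cpu = (this_cpu + i) % NR_CPUS;
        if (nr_running[cpu] > 1)
            return cpu;
    }
    return -1;  /* no overloaded CPU found */
}

/* Bitmap search: same visiting order, but no per-CPU loads. */
static int find_victim_bitmap(int this_cpu)
{
    /* Never pick ourselves. */
    uint64_t m = overloaded_mask & ~(UINT64_C(1) << this_cpu);
    if (m == 0)
        return -1;

    /* Rotate right so that bit 0 corresponds to CPU this_cpu + 1. */
    unsigned int start = (unsigned int)(this_cpu + 1) % NR_CPUS;
    uint64_t rot = start ? (m >> start) | (m << (NR_CPUS - start)) : m;

    /* __builtin_ctzll is a GCC/Clang builtin: index of the lowest set bit. */
    return (int)((start + (unsigned int)__builtin_ctzll(rot)) % NR_CPUS);
}
```

Both functions return the same victim for a given state. The difference is that the scan touches a per-CPU counter for every CPU it visits, while the bitmap search reads one word. The cost of the bitmap is the extra update in set_nr_running() on every transition, which is the overhead reported above as not measurable for the real overload_set and overload_clear.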
