Date:   Wed, 1 Nov 2017 01:08:59 -0500
From:   Atish Patra <atish.patra@...cle.com>
To:     Mike Galbraith <efault@....de>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, joelaf@...gle.com,
        brendan.jackman@....com, jbacik@...com, mingo@...hat.com
Subject: Re: [PATCH RFC 1/2] sched: Minimize the idle cpu selection race
 window.



On 10/31/2017 03:48 AM, Mike Galbraith wrote:
> On Tue, 2017-10-31 at 09:20 +0100, Peter Zijlstra wrote:
>> On Tue, Oct 31, 2017 at 12:27:41AM -0500, Atish Patra wrote:
>>> Currently, multiple tasks can wake up on the same cpu from the
>>> select_idle_sibling() path if they wake up simultaneously and
>>> last ran on the same LLC. This happens because an idle cpu's state
>>> is not updated until the idle task is scheduled out. Any task
>>> waking during that period may select that cpu as a wakeup
>>> candidate.
>>>
>>> Introduce a per-cpu variable that is set as soon as a cpu is
>>> selected for wakeup for any task. This prevents other tasks from
>>> selecting the same cpu again. Note: this does not close the race
>>> window, but narrows it to the access of the per-cpu variable. If
>>> two wakee tasks access the per-cpu variable at the same time, they
>>> may still select the same cpu, but the race window is reduced
>>> considerably.
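To make the mechanism described in the quoted commit message concrete, here is a
minimal userspace model of the per-cpu claim idea. It is only a sketch, not the
actual kernel patch: the names (claimed_for_wakeup, try_claim_cpu,
release_cpu_claim) and NR_CPUS are made up for illustration.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

/* One flag per cpu: set when a waking task picks that cpu, cleared
 * once the cpu has actually scheduled the woken task. */
static atomic_bool claimed_for_wakeup[NR_CPUS];

/* Wakeup path: check the flag, then set it.  The check and the set are
 * deliberately not a single atomic operation, mirroring the plain
 * per-cpu variable in the patch: two wakers can still race in the tiny
 * window between load and store, so the race is minimized, not closed. */
static bool try_claim_cpu(int cpu)
{
	if (atomic_load_explicit(&claimed_for_wakeup[cpu],
				 memory_order_relaxed))
		return false;		/* someone already picked this cpu */
	atomic_store_explicit(&claimed_for_wakeup[cpu], true,
			      memory_order_relaxed);
	return true;
}

/* Called once the claimed cpu runs the woken task, making the cpu
 * eligible for selection again. */
static void release_cpu_claim(int cpu)
{
	atomic_store_explicit(&claimed_for_wakeup[cpu], false,
			      memory_order_relaxed);
}

int main(void)
{
	printf("%d\n", try_claim_cpu(3));	/* 1: first waker claims cpu 3 */
	printf("%d\n", try_claim_cpu(3));	/* 0: second waker must look elsewhere */
	release_cpu_claim(3);
	printf("%d\n", try_claim_cpu(3));	/* 1: claimable again after release */
	return 0;
}
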
>> The most important question: does it actually help? Which
>> benchmarks give what numbers?
Here are the numbers from one of the OLTP configurations on an 8-socket
x86 machine:

kernel      txn/minute (normalized)    user/sys
baseline    1.0                        80/5
pcpu        1.021                      84/5
The throughput gain is modest and close to the run-to-run variation.
The schedstat data (added for testing in patch 2/2) indicates that many
instances of the race condition were addressed, but perhaps not enough
to produce a significant throughput change.

None of the other benchmarks I tested (TPCC, hackbench, schbench,
swingbench) showed any regression.

I will let Joel post numbers from Android benchmarks.
> I played with something ~similar (cmpxchg() idle cpu reservation)
I had an atomic version earlier as well. Peter's suggestion of a
per-cpu variable seems to perform slightly better than the atomic one,
so this patch uses the per-cpu version.
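For contrast, this is roughly what a cmpxchg()-style reservation like the one
Mike mentions could look like, again as a userspace sketch with made-up names
(cpu_reserved, reserve_cpu_cmpxchg), not the code from either patch. The
compare-and-exchange makes the check-and-set a single atomic operation, so only
one waker can win.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

static atomic_bool cpu_reserved[NR_CPUS];

/* Atomically flip the flag from false to true.  Exactly one of any
 * number of concurrent wakers succeeds, which closes the race window
 * completely but adds an atomic read-modify-write (and potential
 * cache-line contention) to every wakeup. */
static bool reserve_cpu_cmpxchg(int cpu)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&cpu_reserved[cpu],
					      &expected, true);
}

int main(void)
{
	printf("%d\n", reserve_cpu_cmpxchg(5));	/* 1: reservation succeeds */
	printf("%d\n", reserve_cpu_cmpxchg(5));	/* 0: cpu already reserved */
	return 0;
}

That extra atomic per wakeup is presumably why the plain per-cpu variable
measured slightly better here.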
>   a
> while back in the context of schbench, and it did help that,
Do you have the schbench configuration somewhere so that I can test it?
I tried various configurations but did not see any improvement or
regression.
> but for
> generic fast mover benchmarks, the added overhead had the expected
> effect, it shaved throughput a wee bit (rob Peter, pay Paul, repeat).
Which benchmark? Is it hackbench or something else?
I have not found any regression yet in my testing. I would be happy to
test any other benchmark or a different hackbench configuration.

Regards,
Atish
> I still have the patch lying about in my rubbish heap, but didn't
> bother to save any of the test results.
>
> 	-Mike
>
>
