Message-ID: <bc5730d4-b3ee-1fcc-7f57-824db606734f@oracle.com>
Date: Wed, 1 Nov 2017 01:08:59 -0500
From: Atish Patra <atish.patra@...cle.com>
To: Mike Galbraith <efault@....de>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, joelaf@...gle.com,
brendan.jackman@....com, jbacik@...com, mingo@...hat.com
Subject: Re: [PATCH RFC 1/2] sched: Minimize the idle cpu selection race
window.
On 10/31/2017 03:48 AM, Mike Galbraith wrote:
> On Tue, 2017-10-31 at 09:20 +0100, Peter Zijlstra wrote:
>> On Tue, Oct 31, 2017 at 12:27:41AM -0500, Atish Patra wrote:
>>> Currently, multiple tasks can wake up on the same cpu via the
>>> select_idle_sibling() path if they wake up simultaneously and last
>>> ran on the same LLC. This happens because an idle cpu is not marked
>>> busy until the idle task is scheduled out, so any task waking during
>>> that period may select that cpu as its wakeup candidate.
>>>
>>> Introduce a per cpu variable that is set as soon as a cpu is
>>> selected for the wakeup of any task. This prevents other tasks from
>>> selecting the same cpu again. Note: this does not close the race
>>> window, but narrows it to the access of the per-cpu variable. If two
>>> wakee tasks read the per cpu variable at the same time, they may
>>> still select the same cpu, but the window is considerably smaller.
>> The very most important question; does it actually help? What
>> benchmarks, give what numbers?
Here are the numbers from one of the OLTP configurations on an 8-socket
x86 machine:

kernel     txn/minute (normalized)   user/sys
baseline   1.0                       80/5
pcpu       1.021                     84/5

The throughput gain is small, close to the run-to-run variation.
The schedstat data (added for testing in patch 2/2) indicates there are
many instances of the race condition being addressed, but perhaps not
enough to produce a significant throughput change.
All other benchmarks I tested (TPCC, hackbench, schbench, swingbench)
showed no regression.
I will let Joel post numbers from Android benchmarks.
> I played with something ~similar (cmpxchg() idle cpu reservation)
I had an atomic version earlier as well. Peter's suggestion of a per cpu
variable seems to perform slightly better than the atomic one, so this
patch uses the per cpu version.
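The cmpxchg reservation Mike mentions can be sketched the same way (again a hedged user-space analogue with hypothetical names; C11 atomics stand in for the kernel's cmpxchg()). The compare-and-swap closes the window entirely, since only one contender can flip the flag, but the atomic RMW is the extra overhead being weighed against the cheap per-cpu store:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical reservation word, one per cpu. */
static atomic_bool reserved[NR_CPUS];

/* Only one of several simultaneous wakers can win the
 * false -> true transition; the losers must look elsewhere. */
static bool try_reserve_cpu(int cpu)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&reserved[cpu],
                                          &expected, true);
}

/* Released once the woken task is running on the cpu. */
static void release_cpu(int cpu)
{
    atomic_store(&reserved[cpu], false);
}
```

The design trade-off the thread is circling: the cmpxchg variant is race-free but pays for an atomic on every wakeup, while the per-cpu store is nearly free but only shrinks the window.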
> a
> while back in the context of schbench, and it did help that,
Do you have the schbench configuration somewhere so that I can test it?
I tried various configurations but did not see any improvement or
regression.
> but for
> generic fast mover benchmarks, the added overhead had the expected
> effect, it shaved throughput a wee bit (rob Peter, pay Paul, repeat).
Which benchmark? Is it hackbench or something else?
I have not found any regression yet in my testing. I would be happy to
test any other benchmark, or a different configuration for hackbench.
Regards,
Atish
> I still have the patch lying about in my rubbish heap, but didn't
> bother to save any of the test results.
>
> -Mike
>
>