lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJWu+ooYjFK3+cLnzKgmchOL_bwdYjvHkOL0de=SixVT_CLy=Q@mail.gmail.com>
Date:   Sat, 4 Nov 2017 17:58:03 -0700
From:   Joel Fernandes <joelaf@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Atish Patra <atish.patra@...cle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Brendan Jackman <brendan.jackman@....com>,
        Josef Bacik <jbacik@...com>, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH RFC 1/2] sched: Minimize the idle cpu selection race window.

Hi Peter,

On Tue, Oct 31, 2017 at 1:20 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, Oct 31, 2017 at 12:27:41AM -0500, Atish Patra wrote:
>> Currently, multiple tasks can wakeup on same cpu from
>> select_idle_sibiling() path in case they wakeup simulatenously
>> and last ran on the same llc. This happens because an idle cpu
>> is not updated until idle task is scheduled out. Any task waking
>> during that period may potentially select that cpu for a wakeup
>> candidate.
>>
>> Introduce a per cpu variable that is set as soon as a cpu is
>> selected for wakeup for any task. This prevents from other tasks
>> to select the same cpu again. Note: This does not close the race
>> window but minimizes it to accessing the per-cpu variable. If two
>> wakee tasks access the per cpu variable at the same time, they may
>> select the same cpu again. But it minimizes the race window
>> considerably.
>
> The very most important question; does it actually help? What
> benchmarks, give what numbers?

I collected some numbers with an Android benchmark called Jankbench.
Most tests didn't show an improvement or degradation with the patch.
However, one of the tests called "list view",  consistently shows an
improvement. Particularly striking is the improvement at mean and 25
percentile.

For list_view test, Jankbench pulls up a list of text and scrolls the
list, this exercises the display pipeline in Android to render and
display the animation as the scroll happens. For Android, lower frame
times is considered quite important as that means we are less likely
to drop frames and give the user a good experience vs a perceivable
poor experience.

For each frame, Jankbench measures the total time a frame takes and
stores it in a DB (the time from which the app starts drawing, to when
the rendering completes and the frame is submitted for display).
Following is the distribution of frame times in ms.

count    16304   (@60 fps, 4.5 minutes)

        Without patch   With patch
mean         5.196633   4.429641 (+14.75%)
std          2.030054   2.310025
25%          5.606810   1.991017 (+64.48%)
50%          5.824013   5.716631 (+1.84%)
75%          5.987102   5.932751 (+0.90%)
95%          6.461230   6.301318 (+2.47%)
99%          9.828959   9.697076 (+1.34%)

Note that although Android uses energy aware scheduling patches, I
turned those off to bring the test as close to mainline as possible. I
also backported Vincent's and Brendan's slow path fixes to the 4.4
kernel that the Pixel 2 uses.

Personally I am in favor of this patch considering this test data but
also that in the past, I remember that our teams had to deal with the
same race issue and used cpusets to avoid it (although they probably
tested with "energy aware" CPU selection kept on).

thanks,

- Joel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ