Date:	Thu, 26 Sep 2013 04:39:30 -0700
From:	Paul Turner <pjt@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Mike Galbraith <bitbucket@...ine.de>,
	Ingo Molnar <mingo@...nel.org>, Rik van Riel <riel@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] sched: Avoid select_idle_sibling() for wake_affine(.sync=true)

On Thu, Sep 26, 2013 at 4:16 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Thu, Sep 26, 2013 at 03:55:55AM -0700, Paul Turner wrote:
>> > +               /*
>> > +                * Don't bother with select_idle_sibling() in the case of a sync wakeup
>> > +                * where we know the only running task will soon go-away. Going
>> > +                * through select_idle_sibling will only lead to pointless ping-pong.
>> > +                */
>> > +               if (sync && prev_cpu == cpu && cpu_rq(cpu)->nr_running == 1 &&
>>
>> I've long thought of trying something like this.
>>
>> I like the intent but I'd go a step further in that I think we want to
>> also implicitly extract WF_SYNC itself.
>
> I have vague memories of actually trying something like that a good
> number of years ago... sadly that's all I remember about it.
>
>> What we really then care about is predicting the overlap associated
>> with userspace synchronization objects, typically built on top of
>> futexes.  Unfortunately the existence/use of per-thread futexes
>> reduces how much state you could usefully associate with the futex.
>> One approach might be to hash (with some small saturating counter)
>> against rip.  But this gets more complicated quite quickly.
>
> Why would you need per-object storage? To further granulate the
> predicted overlap? Instead of having one per task, you have one per
> object?

It is my intuition that there are a few common object types with fairly
polarized behavior: for condition variables and producer-consumer
queues, a wakeup strongly predicts that the waker will block, whereas
locks protecting objects, e.g. a mutex, would be expected to show the
opposite behavior.
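
To make that concrete, here is a rough userspace illustration of the
two patterns; it is purely illustrative and not taken from the patch
under discussion:

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int item_ready;

/*
 * Pattern 1: producer/consumer.  The signal is effectively a sync
 * wakeup: right after waking the consumer, the producer blocks (the
 * usleep() stands in for waiting on the next request), so running the
 * wakee on this CPU is cheap and keeps the queued data cache-hot.
 */
static void *producer(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	item_ready = 1;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	usleep(1000);			/* waker blocks almost immediately */
	return NULL;
}

/*
 * Pattern 2: a mutex protecting an object.  The unlock may wake a
 * contender, but this thread keeps running, so pulling the wakee onto
 * the same CPU just creates a ping-pong.
 */
static void *mutex_user(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	item_ready++;			/* touch the shared object */
	pthread_mutex_unlock(&lock);
	for (volatile int i = 0; i < 1000000; i++)
		;			/* waker stays runnable */
	return NULL;
}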

For this hint to be beneficial you have to get it right frequently:
getting it wrong in the first case hurts cache locality, and in the
second it hurts parallelism.  Today we always err on the side of
hurting locality, since the cost of getting that wrong is better
bounded.  These object types are sufficiently common, and likely to be
interspersed, that I suspect letting them interact on a single
thread-wide counter would give a mush result as an input (or even be
an anti-predictor, since it would strongly favor the last observation).
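
To sketch what hashing against rip could look like (none of the names
or tables below exist in the kernel today; this is only the shape of
the idea, not a proposed patch):

/*
 * Hypothetical per-call-site overlap predictor: a small table of
 * 2-bit saturating counters indexed by a hash of the waker's rip.
 */
#define OVERLAP_TABLE_BITS	10
#define OVERLAP_TABLE_SIZE	(1 << OVERLAP_TABLE_BITS)

static unsigned char overlap_ctr[OVERLAP_TABLE_SIZE];	/* values 0..3 */

static inline unsigned int overlap_hash(unsigned long rip)
{
	/* cheap multiplicative hash of the wakeup call site */
	return (rip * 0x9E3779B97F4A7C15ULL) >> (64 - OVERLAP_TABLE_BITS);
}

/* At wakeup: should we treat this as a sync-style wakeup? */
static inline int predict_waker_blocks(unsigned long rip)
{
	return overlap_ctr[overlap_hash(rip)] >= 2;
}

/* Later, once we know whether the waker actually blocked soon. */
static inline void update_overlap(unsigned long rip, int waker_blocked)
{
	unsigned int idx = overlap_hash(rip);

	if (waker_blocked) {
		if (overlap_ctr[idx] < 3)
			overlap_ctr[idx]++;
	} else {
		if (overlap_ctr[idx] > 0)
			overlap_ctr[idx]--;
	}
}

The point of keying on the call site rather than the task is that
condvar-style sites and mutex-style sites get separate counters,
instead of smearing their opposite behaviors into one per-thread value.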
