Message-ID: <5dbc627c-15a7-1c48-cb88-cf60b445dd0b@oracle.com>
Date: Mon, 29 Oct 2018 15:52:51 -0700
From: Subhra Mazumdar <subhra.mazumdar@...cle.com>
To: Jan H. Schönherr <jschoenh@...zon.de>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC 00/60] Coscheduling for Linux
On 10/26/18 4:44 PM, Jan H. Schönherr wrote:
> On 19/10/2018 02.26, Subhra Mazumdar wrote:
>> Hi Jan,
> Hi. Sorry for the delay.
>
>> On 9/7/18 2:39 PM, Jan H. Schönherr wrote:
>>> The collective context switch from one coscheduled set of tasks to another
>>> -- while fast -- is not atomic. If a use-case needs the absolute guarantee
>>> that all tasks of the previous set have stopped executing before any task
>>> of the next set starts executing, an additional hand-shake/barrier needs to
>>> be added.
>>>
>> Do you know how large the delay is? I.e., what is the overlap time when
>> a thread of the new group starts executing on one HT while a thread of
>> another group is still running on the other HT?
> The delay is roughly equivalent to the IPI latency, if we're just talking
> about coscheduling at SMT level: one sibling decides to schedule another
> group, sends an IPI to the other sibling(s), and may already start
> executing a task of that other group, before the IPI is received on the
> other end.
Can you point to where the leader is sending the IPI to the other siblings?
I did some experiments and the delay seems to be sub-microsecond. I ran 2
threads that just loop, in one cosched group, affinitized to the 2 HTs of
a core. Then another thread in a different cosched group starts running,
affinitized to the first HT of the same core. I timestamped just before
context_switch() in __schedule() for the threads switching from one group
to the other and from one group to idle. Following is what I get on cpus 1
and 45, which are siblings; cpu 1 is where the other thread preempts:
[ 403.216625] cpu:45 sub1->idle:403216624579
[ 403.238623] cpu:1 sub1->sub2:403238621585
[ 403.238624] cpu:45 sub1->idle:403238621787
[ 403.260619] cpu:1 sub1->sub2:403260619182
[ 403.260620] cpu:45 sub1->idle:403260619413
[ 403.282617] cpu:1 sub1->sub2:403282617157
[ 403.282618] cpu:45 sub1->idle:403282617317
..
Not sure why the first switch to idle on cpu 45 happened. But from then
onwards the difference in timestamps is less than a microsecond. This is
just a crude way to get a sense of the delay, so it may not be exact.
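Roughly, the kind of timestamping I mean looks like this (a minimal sketch,
not the exact debug hack; sched_clock() as the timestamp source and the
printk format are just for illustration):

    /*
     * Called just before context_switch() in __schedule(), with
     * preemption disabled, so smp_processor_id() is safe here.
     * Prints the comm of prev and next together with a per-cpu
     * ns-resolution timestamp from sched_clock().
     */
    static void stamp_switch(struct task_struct *prev,
                             struct task_struct *next)
    {
            printk(KERN_DEBUG "cpu:%d %s->%s:%llu\n",
                   smp_processor_id(), prev->comm, next->comm,
                   sched_clock());
    }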
Thanks,
Subhra
>
> Now, there are some things that may delay processing an IPI, but in those
> cases the target CPU isn't executing user code.
>
> I've yet to produce some current numbers for SMT-only coscheduling. An
> older ballpark number I have is about 2 microseconds for the collective
> context switch of one hierarchy level, but take that with a grain of salt.
>
> Regards
> Jan
>