linux-kernel - Re: [Discussion v2] Usecases for the per-task latency-nice attribute

Message-Id: <133845e8-bdad-c0c0-a14b-a511ebb88fd4@linux.ibm.com>
Date:   Mon, 7 Oct 2019 14:16:31 +0530
From:   Parth Shah <parth@...ux.ibm.com>
To:     David Laight <David.Laight@...LAB.COM>,
        "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "patrick.bellasi@...bug.net" <patrick.bellasi@...bug.net>,
        "valentin.schneider@....com" <valentin.schneider@....com>,
        "qais.yousef@....com" <qais.yousef@....com>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "pavel@....cz" <pavel@....cz>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "morten.rasmussen@....com" <morten.rasmussen@....com>,
        "pjt@...gle.com" <pjt@...gle.com>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "tj@...nel.org" <tj@...nel.org>,
        "rafael.j.wysocki@...el.com" <rafael.j.wysocki@...el.com>,
        "daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
        "dhaval.giani@...cle.com" <dhaval.giani@...cle.com>,
        "quentin.perret@....com" <quentin.perret@....com>,
        subhra mazumdar <subhra.mazumdar@...cle.com>,
        "ggherdovich@...e.cz" <ggherdovich@...e.cz>,
        "viresh.kumar@...aro.org" <viresh.kumar@...aro.org>,
        Doug Smythies <dsmythies@...us.net>
Subject: Re: [Discussion v2] Usecases for the per-task latency-nice attribute



On 10/2/19 9:41 PM, David Laight wrote:
> From: Parth Shah
>> Sent: 30 September 2019 11:44
> ...
>> 5> Separating AVX512 tasks and latency sensitive tasks on separate cores
>> ( -Tim Chen )
>> ===========================================================================
>> Another usecase we are considering is to segregate those workloads that will
>> pull down core cpu frequency (e.g. AVX512) from workloads that are latency
>> sensitive. There are certain tasks that need to provide a fast response
>> time (latency sensitive) and they are best scheduled on a cpu that has a
>> lighter load and does not have other tasks running on the sibling cpu that
>> could pull down the cpu core frequency.
>>
>> Some users are running machine learning batch tasks with AVX512, and have
>> observed that these tasks affect the tasks needing a fast response.  They
>> have to rely on manual CPU affinity to separate these tasks.  With an
>> appropriate latency hint on each task, the scheduler can be taught to
>> separate them.
> 
> Has this been diagnosed properly?
> I can't really see how the frequency drop from AVX512 significantly affects latency.
> Most tasks that require low latency probably don't do a lot of work.
> It is much more likely that the latency issues happen because the AVX512 tasks
> are doing very few system calls so can't be pre-empted even by a high priority task.
>
> This 'feature' is hinted by this:
>> 2> TurboSched
>> ( -Parth Shah )
>> ====================
>> TurboSched [2] tries to minimize the number of active cores in a socket by
>> packing an un-important and low-utilization (named jitter) task on an
>> already active core and thus refrains from waking up of a new core if
>> possible.
> 

You are correct that the two approaches contradict each other in some sense.
But what TurboSched tries to achieve is task packing only for the tasks
classified by the user as *latency in-sensitive*. Whereas, IIUC, what Tim
proposes here is to not pack *latency sensitive* tasks, and I guess that
aligns with the TurboSched approach as well, doesn't it?

Perhaps @Tim can shed some light on this?

> Consider this example of a process that requires low latency (sub 1ms would be good):
> - A hardware interrupt (or timer interrupt) wakes up one thread.
> - When that thread wakes it wakes up other threads that are sleeping.
> - All the threads 'beaver away' for a few ms (processing RTP and other audio).
> - They all sleep for the rest of a 10ms period.
> 
> The affinities are set so each thread runs on a separate cpu, and all are SCHED_RR.
> Now loop all the cpus in userspace (run: while :; do :; done) and see what happens to the latencies.
> You really want the SCHED_RR threads to immediately pre-empt the running processes.
> But I suspect nothing happens until a timer interrupt to the target cpu.
> 

This is a good corner case where the scheduler can be optimized further, and
a per-task attribute like latency-nice can be of some help. Maybe we
can reduce the vslice of a task that has no latency constraints while
any RR/RT tasks are present.
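
For anyone who wants to try the scenario quoted above, here is a rough
userspace sketch of one way to measure it. It is not taken from this thread:
the helper names (make_rr_pinned, worst_wakeup_latency_ns) and the use of a
timer wakeup via clock_nanosleep() as a stand-in for the interrupt-driven
wakeup chain are my assumptions. Only the 10ms period, the per-cpu affinity
and SCHED_RR come from the description above.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <time.h>

#define PERIOD_NS 10000000L /* 10 ms period, as in the quoted example */

/* Signed difference a - b in nanoseconds. */
int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * 1000000000LL
           + (a->tv_nsec - b->tv_nsec);
}

/* Advance an absolute deadline by ns, keeping tv_nsec below one second. */
void ts_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec++;
    }
}

/*
 * Hypothetical helper: pin the calling thread to `cpu` and request SCHED_RR
 * at priority `prio`. Returns 0 on success, -1 if either step fails (for
 * example when the caller lacks the privilege to use a realtime policy).
 */
int make_rr_pinned(int cpu, int prio)
{
    cpu_set_t set;
    struct sched_param sp = { .sched_priority = prio };

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;
    if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
        return -1;
    return 0;
}

/*
 * Hypothetical helper: sleep until each periodic deadline and return the
 * worst observed lateness of the wakeup, in nanoseconds.
 */
int64_t worst_wakeup_latency_ns(int iterations)
{
    struct timespec next, now;
    int64_t worst = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < iterations; i++) {
        ts_add_ns(&next, PERIOD_NS);
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);

        int64_t late = ts_diff_ns(&now, &next);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Each measuring thread would call make_rr_pinned() for its own cpu and then
worst_wakeup_latency_ns(), once on an idle system and once with the busy
loops (while :; do :; done) running on every cpu. Comparing the two numbers
should show whether the SCHED_RR threads really pre-empt immediately.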

> 	David