Message-ID: <10f42efa-3750-491a-74fe-d84c9c4924e3@oracle.com>
Date: Wed, 19 Feb 2020 12:16:59 -0500
From: chris hyser <chris.hyser@...cle.com>
To: David Laight <David.Laight@...LAB.COM>,
Parth Shah <parth@...ux.ibm.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"patrick.bellasi@...bug.net" <patrick.bellasi@...bug.net>,
"valentin.schneider@....com" <valentin.schneider@....com>,
"dhaval.giani@...cle.com" <dhaval.giani@...cle.com>,
"dietmar.eggemann@....com" <dietmar.eggemann@....com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"qais.yousef@....com" <qais.yousef@....com>,
"pavel@....cz" <pavel@....cz>,
"qperret@...rret.net" <qperret@...rret.net>,
"pjt@...gle.com" <pjt@...gle.com>, "tj@...nel.org" <tj@...nel.org>
Subject: Re: [PATCH v3 0/3] Introduce per-task latency_nice for scheduler hints
On 2/19/20 6:18 AM, David Laight wrote:
> From: chris hyser
>> Sent: 18 February 2020 23:00
> ...
>> All, I was asked to take a look at the original latency_nice patchset.
>> First, to clarify objectives, Oracle is not interested in trading
>> throughput for latency. What we found is that the DB has specific tasks
>> which do very little but need to do this as absolutely quickly as
>> possible, ie extreme latency sensitivity. Second, the key to latency
>> reduction in the task wakeup path seems to be limiting variations of
>> "idle cpu" search. The latter particularly interests me as an example of
>> "platform size based latency" which I believe to be important given all
>> the varying size VMs and containers.
>
> From my experiments there are a few things that seem to affect latency
> of waking up real time (sched fifo) tasks on a normal kernel:
Sorry. I was only ever talking about SCHED_OTHER tasks, as per the original patchset. I realize the term "extreme latency
sensitivity" may have caused confusion. What that means to DB people is no doubt different from what it means to audio people. :-)
>
> 1) The time taken for the (intel x86) cpu to wake up from monitor/mwait.
> If the cpu is allowed to enter deeper sleep states this can take 900us.
> Any changes to this are system-wide, not process specific.
>
> 2) If the cpu an RT process last ran on (ie the one it is woken on) is
> running in kernel, the process switch won't happen until cond_resched()
> is called.
> On my system the code to flush the display frame buffer takes 3.3ms.
> Compiling a kernel with CONFIG_PREEMPT=y will reduce this.
>
> 3) If a hardware interrupt happens just after the process is woken
> then you have to wait until it finishes and any 'softint' work
> that is scheduled on the same cpu finishes.
> The ethernet driver transmit completions and receive ring filling
> can easily take 1ms.
> Booting with 'threadirqs' might help this.
>
> 4) If you need to acquire a lock/futex then you need to allow for the
> process that holds it being delayed by a hardware interrupt (etc).
> So even if the lock is only held for a few instructions it can take
> a long time to acquire.
> (I need to change some linked lists to arrays indexed by an atomically
> incremented global index.)
>
> FWIW I can't imagine how a database can have anything that is that
> latency sensitive.
> We are doing lots of channels of audio processing and have a lot of work
> to do within 10ms to avoid audible errors.
There are existing internal numbers, which I will ultimately have to duplicate, showing that simply short-cutting these
idle cpu searches has a significant benefit on DB performance on large hardware. However, that was for a different
patchset involving things I don't like, so I'm still exploring how to achieve similar results within the latency_nice
framework.
-chrish
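
For readers following the thread, the idea of short-cutting the idle cpu
search can be pictured with a toy model. This is only a sketch and not kernel
code: the function names search_limit() and pick_cpu(), the linear mapping from
latency_nice to a scan budget, and the [-20, 19] range check are all
assumptions made for illustration. They are not taken from the patchset
discussed here.

```python
def search_limit(latency_nice, nr_cpus):
    """Hypothetical mapping from a latency_nice value to a scan budget.

    Assumption for illustration: -20 (most latency sensitive) allows
    checking a single CPU, 19 allows checking every CPU.
    """
    if not -20 <= latency_nice <= 19:
        raise ValueError("latency_nice must be in [-20, 19]")
    # Scale linearly from 1 CPU up to nr_cpus.
    return 1 + (latency_nice + 20) * (nr_cpus - 1) // 39


def pick_cpu(idle, prev_cpu, limit):
    """Scan at most `limit` CPUs starting at prev_cpu.

    `idle` is a list of booleans, one per CPU.
    Returns (chosen_cpu, cpus_scanned).
    """
    nr_cpus = len(idle)
    budget = min(limit, nr_cpus)
    for i in range(budget):
        cpu = (prev_cpu + i) % nr_cpus
        if idle[cpu]:
            return cpu, i + 1
    # No idle CPU found within the budget: stay on the previous CPU.
    return prev_cpu, budget


if __name__ == "__main__":
    idle = [False] * 64
    idle[40] = True  # the only idle CPU is far from prev_cpu
    for nice in (-20, 0, 19):
        limit = search_limit(nice, len(idle))
        print(nice, limit, pick_cpu(idle, 0, limit))
```

In this model the scan cost grows with the number of CPUs unless the budget
caps it, which is one way to read "platform size based latency": a small
budget bounds the wakeup-path work at the price of sometimes missing an idle
cpu further away.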