Message-ID: <d755d515-e5d6-b3fc-f7f5-9f8aebcf913a@amd.com>
Date: Wed, 27 Sep 2023 14:06:41 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Chen Yu <yu.c.chen@...el.com>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org,
peterz@...radead.org, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com, tj@...nel.org,
roman.gushchin@...ux.dev, gautham.shenoy@....com,
aaron.lu@...el.com, wuyun.abel@...edance.com, kernel-team@...a.com
Subject: Re: [RFC PATCH 3/3] sched/fair: Add a per-shard overload flag
Hello Chenyu,
On 9/27/2023 12:29 PM, Chen Yu wrote:
> Hi Prateek,
>
> On 2023-09-27 at 09:53:13 +0530, K Prateek Nayak wrote:
>> Hello David,
>>
>> Some more test results (although this might be slightly irrelevant with
>> the next version around the corner)
>>
>> On 9/1/2023 12:41 AM, David Vernet wrote:
>>> On Thu, Aug 31, 2023 at 04:15:08PM +0530, K Prateek Nayak wrote:
>>>
>> -> With EEVDF
>>
>> o tl;dr
>>
>> - Same as what was observed without EEVDF, but shared_runq now shows a
>> serious regression with several more variants of tbench and
>> netperf.
>>
>> o Kernels
>>
>> eevdf : tip:sched/core at commit b41bbb33cf75 ("Merge branch 'sched/eevdf' into sched/core")
>> shared_runq : eevdf + correct time accounting with v3 of the series without any other changes
>> shared_runq_idle_check : shared_runq + move the rq->avg_idle check before peeking into the shared_runq
>> (the rd->overload check still remains below the shared_runq access)
>>
>
> I did not see any obvious regression on a Sapphire Rapids server, but
> the results on your platform suggest that client/server (C/S) workloads could be impacted
> by shared_runq. Meanwhile, some individual workloads like HHVM in David's environment
> (no shared resource between tasks, if I understand correctly) could benefit a lot from
> shared_runq.
Yup, that would be my guess too, since HHVM seems to benefit purely from
more aggressive work conservation (unless it leads to some second-order
effect).
> This makes me wonder if we can let shared_runq skip the C/S tasks.
> The question would be how to define C/S tasks. At first thought:
> if A only wakes up B, and B only wakes up A, then they could be regarded as a
> C/S pair:
> (A->last_wakee == B && B->last_wakee == A &&
> A->wakee_flips <= 1 && B->wakee_flips <= 1)
> But for netperf/tbench, this does not apply, because the netperf client leverages a kernel
> thread (workqueue) to wake up the netserver, that is, A wakes up kthread T, then T
> wakes up B. Unless we have a chain, we cannot detect this wakeup behavior.
Yup, unless we have a notion of a chain/flow, or until we can somehow
attribute the kthread's wakeup of the server to the client, this will
be hard to detect.
I can give it a try with the SIS_PAIR condition you shared above. Let
me know.
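For reference, here is a minimal userspace sketch of the pair check quoted
above, just to make the condition and its blind spot concrete. Only the
field names last_wakee and wakee_flips come from the quoted condition; the
struct, both helper functions, and the bookkeeping (which ignores any
time-based decay of wakee_flips) are hypothetical simplifications, not the
kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two task fields named in the
 * quoted condition; this is not the kernel's task_struct. */
struct sketch_task {
	struct sketch_task *last_wakee;  /* task this one woke most recently */
	unsigned int wakee_flips;        /* times the wakeup target changed  */
};

/* Simplified wakee bookkeeping (assumption: no decay of wakee_flips):
 * count a flip whenever the waker switches to a different target. */
static void sketch_record_wakeup(struct sketch_task *waker,
				 struct sketch_task *wakee)
{
	assert(waker != NULL && wakee != NULL);
	if (waker->last_wakee != wakee) {
		waker->last_wakee = wakee;
		waker->wakee_flips++;
	}
}

/* The pair condition from the quoted mail, written as a predicate. */
static bool sketch_is_cs_pair(const struct sketch_task *a,
			      const struct sketch_task *b)
{
	return a->last_wakee == b && b->last_wakee == a &&
	       a->wakee_flips <= 1 && b->wakee_flips <= 1;
}
```

With direct wakeups (A wakes B, B wakes A) the predicate holds. Once a
kthread T relays the wakeups (A wakes T, T wakes B, and back), both
A->last_wakee and B->last_wakee point at T, so the predicate is false for
(A, B), which is the netperf/tbench case described above.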
>
> thanks,
> Chenyu
--
Thanks and Regards,
Prateek