Message-ID: <218a0365-e933-48fd-a930-ee277d416eac@mailbox.org>
Date: Mon, 16 Oct 2023 15:33:58 +0200
From: Tor Vic <torvic9@...lbox.org>
To: Peter Zijlstra <peterz@...radead.org>,
Youssef Esmat <youssefesmat@...omium.org>
Cc: LKML <linux-kernel@...r.kernel.org>, bsegall@...gle.com,
mingo@...nel.org, vincent.guittot@...aro.org,
juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, mgorman@...e.de, bristot@...hat.com,
corbet@....net, qyousef@...alina.io, chris.hyser@...cle.com,
patrick.bellasi@...bug.net, pjt@...gle.com, pavel@....cz,
qperret@...gle.com, tim.c.chen@...ux.intel.com, joshdon@...gle.com,
timj@....org, kprateek.nayak@....com, yu.c.chen@...el.com,
joel@...lfernandes.org, efault@....de, tglx@...utronix.de,
wuyun.abel@...edance.com
Subject: Re: [PATCH] sched/eevdf: Toggle eligibility through sched_feat

On 10/15/23 12:44, Peter Zijlstra wrote:
> On Thu, Oct 12, 2023 at 10:02:13PM -0500, Youssef Esmat wrote:
>> Interactive workloads see performance gains by disabling eligibility
>> checks (EEVDF->EVDF). Disabling the checks reduces the number of
>> context switches and delays less important work (higher deadlines/nice
>> values) in favor of more important work (lower deadlines/nice values).
>>
>> That said, disabling the checks can add large latencies for some
>> workloads, so the default keeps eligibility on while allowing it to be
>> turned off when beneficial.
>>
>> Signed-off-by: Youssef Esmat <youssefesmat@...omium.org>
>> Link: https://lore.kernel.org/lkml/CA+q576MS0-MV1Oy-eecvmYpvNT3tqxD8syzrpxQ-Zk310hvRbw@mail.gmail.com/
>> ---
>>  kernel/sched/fair.c     | 3 +++
>>  kernel/sched/features.h | 1 +
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index a751e552f253..16106da5a354 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -728,6 +728,9 @@ int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>  	s64 avg = cfs_rq->avg_vruntime;
>>  	long load = cfs_rq->avg_load;
>>  
>> +	if (!sched_feat(ENFORCE_ELIGIBILITY))
>> +		return 1;
>> +
>>  	if (curr && curr->on_rq) {
>>  		unsigned long weight = scale_load_down(curr->load.weight);
>>  
>
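(Background for readers new to the EEVDF code: entity_eligible() is the
eligibility test: an entity i is eligible iff its lag is non-negative,
i.e. its vruntime has not run ahead of the load-weighted average
vruntime V of the runqueue:

    eligible(i)  <=>  lag_i = V - v_i >= 0,
    where V = sum_j(w_j * v_j) / sum_j(w_j)

Making the function return 1 unconditionally marks every entity as
eligible, so the pick degenerates to earliest-virtual-deadline-first,
i.e. EVDF.)
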
> Right.. could you pretty please try:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/eevdf
>
> as of yesterday or so.
>
> It defaults to (EEVDF relevant features):
>
> SCHED_FEAT(PLACE_LAG, true)
> SCHED_FEAT(PLACE_DEADLINE_INITIAL, true)
> SCHED_FEAT(PREEMPT_SHORT, true)
> SCHED_FEAT(PLACE_SLEEPER, false)
> SCHED_FEAT(GENTLE_SLEEPER, true)
> SCHED_FEAT(EVDF, false)
> SCHED_FEAT(DELAY_DEQUEUE, true)
> SCHED_FEAT(GENTLE_DELAY, true)
>
> If that doesn't do well enough, could you please try, in order of
> preference:
>
> 2) NO_GENTLE_DELAY
> 3) NO_DELAY_DEQUEUE, PLACE_SLEEPER
> 4) NO_DELAY_DEQUEUE, PLACE_SLEEPER, NO_GENTLE_SLEEPER
I'm very interested in this scheduler stuff, but I know nothing about
the code.
Still, I ran some very quick benchmarks on a dual-core Skylake laptop
running 6.6-rc6.
Base slice is 5 ms.
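For reference, the slice can be read and changed via debugfs, assuming
CONFIG_SCHED_DEBUG is enabled; 5 ms shows up as:

$ cat /sys/kernel/debug/sched/base_slice_ns
5000000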
1) Without the recent patches from Peter's tree
2) With patches, default features
3) With patches, NO_GENTLE_DELAY
4) With patches, NO_DELAY_DEQUEUE + PLACE_SLEEPER
5) With patches, like 4) + NO_GENTLE_SLEEPER
6) With patches, like 5) + EVDF
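For anyone reproducing these configurations: sched_feat flags can be
flipped at runtime through debugfs (as root, assuming CONFIG_SCHED_DEBUG);
writing a name prefixed with NO_ clears that feature, e.g.:

$ cat /sys/kernel/debug/sched/features                      # list current flags
$ echo NO_GENTLE_DELAY > /sys/kernel/debug/sched/features   # config 3)
$ echo NO_DELAY_DEQUEUE > /sys/kernel/debug/sched/features  # configs 4)-6)
$ echo PLACE_SLEEPER > /sys/kernel/debug/sched/features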
$ perf stat -r 7 -e cs,migrations,cache-misses,branch-misses -- \
    perf bench sched messaging -g 20 -l 1000 -p
 test | seconds | cs   | migrations | cache misses | branch misses |
------|---------|------|------------|--------------|---------------|
 1)   | 2.90    | 192K | 6.7K       | 99M          | 60M           |
 2)   | 2.97    | 226K | 6.9K       | 102M         | 61M           |
 3)   | 3.00    | 247K | 6.9K       | 108M         | 62M           |
 4)   | 2.92    | 182K | 7.2K       | 101M         | 60M           |
 5)   | 2.94    | 203K | 6.8K       | 101M         | 60M           |
 6)   | 2.79    | 84K  | 6.4K       | 94M          | 57M           |
$ stress-ng --bsearch 2 --matrix 2 --matrix-method prod --timeout 30 \
    --metrics-brief

[results in bogo ops/s]
test | bsearch | matrix |
------|---------|--------|
1) | 392 | 588 |
2) | 512 | 688 |
3) | 512 | 663 |
4) | 512 | 688 |
5) | 511 | 686 |
6) | 510 | 655 |
--
I don't know if this info is useful enough for you scheduler people, but
I hope it helps.
Cheers,
Tor
>
> I really don't like the EVDF option, and I think you'll end up
> regretting using it sooner rather than later, just to make this one
> benchmark you have happy.
>
> I'm hoping the default is enough, but otherwise any of the above should
> be a *much* better scheduler.
>
> Also, bonus points if you can create us a standalone benchmark that
> captures your metric (a la facebook's schbench) without the whole
> chrome nonsense; that'd be epic.
>
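For context, schbench is facebook's standalone wakeup-latency benchmark;
it reports request latency percentiles. An illustrative invocation
(parameters are an example only, not taken from this thread):

$ schbench -m 2 -t 16 -r 30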