linux-kernel - Re: [RFC PATCH 3/3] sched/fair: Add a per-shard overload flag

Message-ID: <ZSC+FLLbia6YpSLt@chenyu5-mobl2.ccr.corp.intel.com>
Date:   Sat, 7 Oct 2023 10:10:28 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     David Vernet <void@...ifault.com>
CC:     K Prateek Nayak <kprateek.nayak@....com>,
        <linux-kernel@...r.kernel.org>, <peterz@...radead.org>,
        <mingo@...hat.com>, <juri.lelli@...hat.com>,
        <vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
        <rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
        <bristot@...hat.com>, <vschneid@...hat.com>, <tj@...nel.org>,
        <roman.gushchin@...ux.dev>, <gautham.shenoy@....com>,
        <aaron.lu@...el.com>, <wuyun.abel@...edance.com>,
        <kernel-team@...a.com>
Subject: Re: [RFC PATCH 3/3] sched/fair: Add a per-shard overload flag

Hi David,

On 2023-10-03 at 16:05:11 -0500, David Vernet wrote:
> On Wed, Sep 27, 2023 at 02:59:29PM +0800, Chen Yu wrote:
> > Hi Prateek,
> 
> Hi Chenyu,
> 
> > On 2023-09-27 at 09:53:13 +0530, K Prateek Nayak wrote:
> > > Hello David,
> > > 
> > > Some more test results (although this might be slightly irrelevant with
> > > next version around the corner)
> > > 
> > > On 9/1/2023 12:41 AM, David Vernet wrote:
> > > > On Thu, Aug 31, 2023 at 04:15:08PM +0530, K Prateek Nayak wrote:
> > > > 
> > > -> With EEVDF
> > > 
> > > o tl;dr
> > > 
> > > - Same as what was observed without EEVDF, but shared_runq now shows
> > >   a serious regression with several more variants of tbench and
> > >   netperf.
> > > 
> > > o Kernels
> > > 
> > > eevdf			: tip:sched/core at commit b41bbb33cf75 ("Merge branch 'sched/eevdf' into sched/core")
> > > shared_runq		: eevdf + correct time accounting with v3 of the series without any other changes
> > > shared_runq_idle_check	: shared_runq + move the rq->avg_idle check before peeking into the shared_runq
> > > 			  (the rd->overload check still remains below the shared_runq access)
> > >
> > 
> > I did not see any obvious regression on a Sapphire Rapids server and it seems that
> > the result on your platform suggests that C/S workload could be impacted
> > by shared_runq. Meanwhile some individual workloads like HHVM in David's environment
> > (no shared resource between tasks if I understand correctly) could benefit from
> 
> Correct, hhvmworkers are largely independent, though they do sometimes
> synchronize, and they also sometimes rely on I/O happening in other
> tasks.
> 
> > shared_runq a lot. This makes me wonder if we can let shared_runq skip the C/S tasks.
> 
> I'm also open to this possibility, but I worry that we'd be going down
> the same rabbit hole as what fair.c does already, which is use
> heuristics to determine when something should or shouldn't be migrated,
> etc. I really do feel that there's value in SHARED_RUNQ providing
> consistent and predictable work conservation behavior.
> 
> On the other hand, it's clear that there are things we can do to improve
> performance for some of these client/server workloads that hammer the
> runqueue on larger CCXs / sockets. If we can avoid those regressions
> while still having reasonably high confidence that work conservation
> won't disproportionately suffer, I'm open to us making some tradeoffs
> and/or adding a bit of complexity to avoid some of this unnecessary
> contention.
> 

Since I did not observe any regression (although I have not tested hackbench
yet) on the latest version you sent to me, I'm OK with postponing the
client/server optimization to keep the patchset simple, and Prateek
has another proposal to deal with the regression.

> I think it's probably about time for v4 to be sent out. What do you
> folks think about including:
>

That works for me, and I can launch the tests once the latest version is released.

thanks,
Chenyu