linux-kernel - Re: [PATCH v2][RFC] sched/fair: Change SIS_PROP to search idle CPU based on sum of util


Message-ID: <87541edf7b46c1475f73cf464a9edca932f65da5.camel@linux.intel.com>
Date:   Mon, 14 Mar 2022 10:34:30 -0700
From:   Tim Chen <tim.c.chen@...ux.intel.com>
To:     Chen Yu <yu.c.chen@...el.com>, Abel Wu <wuyun.abel@...edance.com>
Cc:     linux-kernel@...r.kernel.org, Tim Chen <tim.c.chen@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mel Gorman <mgorman@...e.de>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Barry Song <21cnbao@...il.com>,
        Barry Song <song.bao.hua@...ilicon.com>,
        Yicong Yang <yangyicong@...ilicon.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Len Brown <len.brown@...el.com>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Aubrey Li <aubrey.li@...el.com>,
        K Prateek Nayak <kprateek.nayak@....com>
Subject: Re: [PATCH v2][RFC] sched/fair: Change SIS_PROP to search idle CPU
 based on sum of util_avg

On Mon, 2022-03-14 at 20:56 +0800, Chen Yu wrote:
> 
> > 
> > So nr_scan will probably be updated at llc-domain-lb-interval, which
> > is llc_size milliseconds. Since load can vary a lot during such
> > a period, would this bring accuracy issues?
> > 
> I agree there might be a delay in reflecting the latest utilization.
> The sum_util calculated by periodic load balance would, after 112ms, have
> decayed to about 0.5 * 0.5 * 0.5 * 0.7 = 8.75%.
> But considering that this is a server platform, my impression is that
> CPU utilization jitter over a short period of time is not a common
> scenario? It seems to be a trade-off. Checking the util_avg in the newidle
> load balance path would be more frequent, but it also brings overhead:
> multiple CPUs would read and write the per-LLC shared variable, which
> introduces cache false sharing. To make this more robust, maybe we can
> add time interval control in newidle load balance too.
> 
> 
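
The 8.75% figure quoted above follows from the PELT half-life: a
contribution that is 32 periods old counts half as much as a fresh one,
and a period is 1024us, treated here as 1ms. The sketch below redoes that
arithmetic. The function name pelt_decay is hypothetical and is not a
kernel symbol; the real kernel uses fixed-point lookup tables rather than
floating point.

```python
# Rough model of PELT geometric decay, assuming a 32 ms half-life
# (1024 us periods approximated as 1 ms). Illustrative only.
PELT_HALF_LIFE_MS = 32.0


def pelt_decay(elapsed_ms: float) -> float:
    """Fraction of a utilization sample still counted after elapsed_ms."""
    return 0.5 ** (elapsed_ms / PELT_HALF_LIFE_MS)


if __name__ == "__main__":
    # 112 ms = 3 half-lives (96 ms) + half of a half-life (16 ms),
    # i.e. 0.5 * 0.5 * 0.5 * sqrt(0.5).
    print(f"remaining after 112 ms: {pelt_decay(112):.2%}")
```

With sqrt(0.5) taken as roughly 0.707 instead of 0.7, the result is about
8.84%, so the estimate in the quoted text is a slight round-down.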

Also, the idea is that we allow ourselves to be non-optimal in terms of
scheduling for short-term variations.  But we want to make sure that if
there's a long-term trend in the load behavior, the scheduler
adjusts for it.  I think if you see high utilization and CPUs are
all close to fully busy for quite a while, that is a long-term trend
that overwhelms any short-term load jitter.

Tim
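
The distinction drawn in the reply, between short jitter and a sustained
trend, can be illustrated with a simplified geometric average. This is a
sketch under stated assumptions: a 32 ms half-life, one sample per
millisecond, and load expressed as a fraction between 0 and 1. The names
track, short_burst and sustained are hypothetical, and the code is not
the kernel's PELT implementation.

```python
# Simplified decaying average with a 32 ms half-life, one sample per ms.
# Illustrative only; the kernel tracks fixed-point sums per period.
DECAY_PER_MS = 0.5 ** (1 / 32)


def track(load_per_ms, start=0.0):
    """Return the decayed average after feeding one load sample per ms."""
    avg = start
    for load in load_per_ms:
        avg = avg * DECAY_PER_MS + load * (1 - DECAY_PER_MS)
    return avg


# A 4 ms burst of full load followed by 28 ms of idle time.
short_burst = [1.0] * 4 + [0.0] * 28
# Full load held for 320 ms (ten half-lives).
sustained = [1.0] * 320

if __name__ == "__main__":
    print(f"after short burst:    {track(short_burst):.3f}")
    print(f"after sustained load: {track(sustained):.3f}")
```

Under these assumptions the burst leaves the average below 0.1, while the
sustained load drives it above 0.99, which is the kind of signal the
reply argues a periodically sampled sum of util_avg should still capture.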