Date:   Mon, 10 Jul 2023 23:37:06 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     K Prateek Nayak <kprateek.nayak@....com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Tim Chen <tim.c.chen@...el.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Abel Wu <wuyun.abel@...edance.com>,
        "Gautham R. Shenoy" <gautham.shenoy@....com>,
        Len Brown <len.brown@...el.com>,
        Chen Yu <yu.chen.surf@...il.com>,
        Yicong Yang <yangyicong@...ilicon.com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 3/4] sched/fair: Calculate the scan depth for idle
 balance based on system utilization

Hi Prateek,

Thanks for testing this patch.
On 2023-07-10 at 16:36:47 +0530, K Prateek Nayak wrote:
> Hello Chenyu,
> 
> Thank you for sharing this extended version. Sharing the results from
> testing below.
> 
> tl;dr
> 
> - tbench, netperf and unixbench-spawn see an improvement with ILB_UTIL.
> 
> - schbench (old) sees a regression in tail latency once the system is heavily
>   loaded. DeathStarBench and SPECjbb also see a small drop under those
>   conditions.
> 
> - Rest of the benchmark results do not vary much.
> 
>
[...] 
> I have a couple of theories:
> 
Thanks for the insights. I agree the risks you mentioned below could impact
performance. Some thoughts below:
> o Either new_idle_balance is failing to find an overloaded busy rq as a
>   result of the limit.
> 
If the system is overloaded, ILB_UTIL finds a relatively busy rq and pulls from it.
There might not be much difference between a relatively busy rq and the busiest one,
because all rqs are quite busy, I suppose?
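
To make the idea concrete, here is a rough sketch of what I have in mind
(illustrative only, not the posted patch: the ilb_nr_scan field and the
helper name are made up for this example, while cpu_util_cfs(), capacity_of()
and sched_group_span() are the existing helpers in kernel/sched/fair.c):

/*
 * Illustrative sketch only: derive a per-domain scan-depth hint from the
 * domain's utilization, so that newidle balance scans fewer groups as the
 * domain gets busier.
 */
static void update_ilb_scan_depth(struct sched_domain *sd)
{
        struct sched_domain_shared *sds = sd->shared;
        struct sched_group *sg = sd->groups;
        unsigned long util = 0, cap = 0, scan;
        int nr_groups = 0, cpu;

        if (!sds)
                return;

        /* Sum CFS utilization and capacity over all groups in the domain. */
        do {
                for_each_cpu(cpu, sched_group_span(sg)) {
                        util += cpu_util_cfs(cpu);
                        cap  += capacity_of(cpu);
                }
                nr_groups++;
                sg = sg->next;
        } while (sg != sd->groups);

        /* Busier domain -> shallower scan; nearly idle -> scan every group. */
        if (util >= cap)
                scan = 1;
        else
                scan = nr_groups - nr_groups * util / cap;

        /* ilb_nr_scan is a hypothetical field for this sketch. */
        WRITE_ONCE(sds->ilb_nr_scan, scan);
}

When the whole domain is near saturation, the hint collapses to scanning a
single group, which matches the argument above: any group we look at is
already busy enough to pull from.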
> o Or, there is a chain reaction where pulling from a loaded rq which is not
>   the most loaded will lead to more new_idle_balancing attempts, which
>   degrades performance.
Yeah, it could be that ILB_UTIL finds a relatively busy rq, but the
imbalance is not high, so idle load balance decides not to pull from it. However,
the busiest rq is still waiting for someone to help, and this could trigger idle
load balance more frequently.
> 
> I'll go back and get some data to narrow down the cause. Meanwhile if
> there is any specific benchmark you would like me to run on the test
> system, please do let me know.
> 
Another hint might be that, as Gautham and Peter suggested, we should apply ILB_UTIL
only to non-NUMA domains. In the above patch all the domains have sd_share, which could
bring a negative impact when accessing/writing cross-node data.
Sorry I did not post the latest version with the NUMA domains excluded earlier, as
I was trying to create a prototype to further speed up idle load balance by
reusing the statistics suggested by Peter. I will send it out once it is stable.
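
For reference, the NUMA exclusion would look roughly like this (a sketch, not
the code I plan to post: ilb_util_applies() is a made-up helper, while the
sched_feat() check, SD_NUMA and sd->shared are the existing mechanisms, and
ILB_UTIL is the feature bit introduced by this RFC):

static inline bool ilb_util_applies(struct sched_domain *sd)
{
        /* ILB_UTIL is the sched feature bit from this RFC. */
        if (!sched_feat(ILB_UTIL))
                return false;

        /*
         * Skip NUMA domains: reading/writing the scan hint in sd->shared
         * there would mean cross-node accesses on every newidle balance.
         */
        if (sd->flags & SD_NUMA)
                return false;

        return sd->shared != NULL;
}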

Thanks again,
Chenyu
> --
> Thanks and Regards,
> Prateek
