linux-kernel - Re: [PATCH] sched/fair: Rate limit calls to update_blocked

Message-ID: <366aa93b-ecbf-ac0f-cd9e-3376b20d4929@linux.intel.com>
Date:   Fri, 18 Jun 2021 09:14:07 -0700
From:   Tim Chen <tim.c.chen@...ux.intel.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Qais Yousef <qais.yousef@....com>,
        Joel Fernandes <joel@...lfernandes.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Paul McKenney <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Dietmar Eggeman <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Mel Gorman <mgorman@...e.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        "Uladzislau Rezki (Sony)" <urezki@...il.com>,
        Neeraj upadhyay <neeraj.iitr10@...il.com>,
        Aubrey Li <aubrey.li@...ux.intel.com>
Subject: Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages()
 for NOHZ



On 6/18/21 3:28 AM, Vincent Guittot wrote:

>>
>> The current logic is that when a CPU becomes idle, next_balance occurs very
>> shortly (usually in the next jiffy), as get_sd_balance_interval returns
>> the next_balance in the next jiffy if the CPU is idle.  However, in
>> reality, I saw that most CPUs are 95% busy on average for my workload and
>> a task will wake up on an idle CPU shortly.  So frequent idle
>> balancing towards briefly idle CPUs is counterproductive: it simply
>> increases overhead and does not improve performance.
> 
> Just to make sure that I understand your problem correctly, your problem is:
> - that we have an ilb happening on the idle CPU and consuming cycles

That's right.  The cycles are consumed heavily in update_blocked_averages()
when cgroups are enabled.
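
To illustrate the general idea of rate limiting an expensive update, here
is a minimal user-space sketch.  It is not the patch under discussion and
does not use kernel APIs: the struct, the function name and the tick-based
interval are all hypothetical, chosen only to show the "skip if called
again too soon" pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; this is NOT the kernel's struct rq. */
struct blocked_update_state {
	uint64_t last_update;	/* tick at which the expensive update last ran */
	uint64_t min_interval;	/* assumed minimum number of ticks between runs */
	bool ever_updated;	/* false until the first update has run */
};

/*
 * Return true if the expensive update should run at tick 'now', and
 * record the run.  Return false if the previous run was less than
 * min_interval ticks ago, so the caller can skip the work.
 */
static bool should_update_blocked(struct blocked_update_state *s, uint64_t now)
{
	if (s->ever_updated && now - s->last_update < s->min_interval)
		return false;	/* too soon since the last run: skip */

	s->last_update = now;
	s->ever_updated = true;
	return true;
}
```

A caller would wrap the expensive call in
"if (should_update_blocked(&state, now)) { ... }".  The first call always
runs; later calls run only once min_interval ticks have elapsed.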

> - or that the ilb will pull a task onto an idle CPU on which a task will
> shortly wake up, which ends up with 2 tasks competing for the same CPU.
> 

Because for the OLTP workload I'm looking at, we have tasks that sleep 
for a short while and wake again very shortly (i.e. the CPU actually
is ~95% busy on average), pulling tasks to such a CPU is really not
helpful for improving overall CPU utilization in the system.  So my
intuition is that for such an almost fully busy CPU, we should defer load
balancing to it (see prototype patch 3).
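
As a rough illustration of that intuition only (this is not prototype
patch 3, and none of these names exist in the kernel), one could keep a
decaying estimate of how busy a CPU has been and defer idle balancing
towards it while the estimate stays above a threshold.  The 1/8 decay
weight and the fixed-point scale below are assumptions made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUSY_SCALE 1024u	/* fixed point: 1024 means 100% busy (assumed) */

/* Hypothetical per-CPU estimate of recent busyness. */
struct cpu_busy_est {
	uint32_t busy_avg;	/* 0..BUSY_SCALE */
};

/*
 * Fold one sampling period into the estimate:
 *   new = 7/8 * old + 1/8 * sample
 * where sample is the busy fraction of the period, scaled to BUSY_SCALE.
 */
static void busy_est_update(struct cpu_busy_est *e,
			    uint32_t busy_ticks, uint32_t total_ticks)
{
	uint32_t sample;

	if (total_ticks == 0)
		return;		/* nothing measured in this period */

	sample = (uint32_t)(((uint64_t)busy_ticks * BUSY_SCALE) / total_ticks);
	e->busy_avg = (e->busy_avg * 7 + sample) / 8;
}

/* Defer idle balancing towards a CPU whose recent busyness is at or above 'threshold'. */
static bool should_defer_idle_balance(const struct cpu_busy_est *e,
				      uint32_t threshold)
{
	return e->busy_avg >= threshold;
}
```

With a threshold of roughly 90% of BUSY_SCALE, a CPU that is ~95% busy
over many periods would be skipped by idle balancing, while a mostly idle
CPU would still be balanced to as before.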

Tim