Message-ID: <20191121123804.GR4097@hirez.programming.kicks-ass.net>
Date: Thu, 21 Nov 2019 13:38:04 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: YT Chang <yt.chang@...iatek.com>
Cc: Matthias Brugger <matthias.bgg@...il.com>,
wsd_upstream@...iatek.com, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org,
Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [PATCH 1/1] sched: cfs_rq h_load might not update due to irq
disable
On Thu, Nov 21, 2019 at 04:30:09PM +0800, YT Chang wrote:
> Syndrome:
>
> Two CPUs might do idle balance at the same time.
> One CPU does idle balance and pulls some tasks.
> However, before it can pick the next task, ALL tasks are pulled back
> by the other CPU. That results in an infinite loop on both CPUs.
Can you easily reproduce this?
> =========================================
> code flow:
>
> in pick_next_task_fair():
>
> again:
> 	if nr_running == 0
> 		goto idle
> 	pick next task
> 	return
>
> idle:
> 	idle_balance()
> 	/* pull some tasks from other CPUs;
> 	 * however, the other CPUs are also doing idle balance
> 	 * and pull these tasks back */
>
> 	goto again
>
> =========================================
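For reference, the tail of pick_next_task_fair() in kernels of that era
reads roughly as follows (abridged sketch, not a verbatim quote); note
that the existing comment already warns that idle_balance() drops the
rq lock:

idle:
	new_tasks = idle_balance(rq, rf);

	/*
	 * Because idle_balance() releases (and re-acquires) rq->lock, it is
	 * possible for any higher priority task to appear. In that case we
	 * must re-start the pick_next_entity() loop.
	 */
	if (new_tasks < 0)
		return RETRY_TASK;

	if (new_tasks > 0)
		goto again;

	return NULL;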
> ALL tasks end up being pulled back whenever task_h_load() is
> incorrect and too low.
Clearly you're not running a PREEMPT kernel, otherwise the break in
detach_tasks() would've saved you, right?
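For reference, the pull loop in detach_tasks() looks roughly like this
in kernels of that era (abridged sketch, not a verbatim quote). It shows
both why a near-zero task_h_load() keeps the loop pulling and where a
PREEMPT kernel bails out early:

	while (!list_empty(tasks)) {
		...
		load = task_h_load(p);

		/* a near-zero load never trips this cutoff ... */
		if ((load / 2) > env->imbalance)
			goto next;

		detach_task(p, env);
		list_add(&p->se.group_node, &env->tasks);

		detached++;
		/* ... and barely shrinks the imbalance */
		env->imbalance -= load;

#ifdef CONFIG_PREEMPT
		/*
		 * NEWIDLE balancing is a source of latency, so preemptible
		 * kernels stop after detaching a single task.
		 */
		if (env->idle == CPU_NEWLY_IDLE)
			break;
#endif
		if (env->imbalance <= 0)
			break;

		continue;
next:
		list_move(&p->se.group_node, tasks);
	}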
> static unsigned long task_h_load(struct task_struct *p)
> {
> struct cfs_rq *cfs_rq = task_cfs_rq(p);
>
> update_cfs_rq_h_load(cfs_rq);
> return div64_ul(p->se.avg.load_avg_contrib * cfs_rq->h_load,
> cfs_rq->runnable_load_avg + 1);
> }
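To make the failure mode concrete: when the cached h_load is stale and
tiny, that division rounds down to zero no matter how heavy the task
really is. A standalone toy computation with made-up values (assumed
numbers, not from a real trace):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t task_load   = 512;	/* the task's own load */
		uint64_t h_load      = 1;	/* stale hierarchical load */
		uint64_t cfs_rq_load = 1024;	/* the cfs_rq's load */

		/* same shape as the division in task_h_load() above */
		uint64_t load = task_load * h_load / (cfs_rq_load + 1);

		/* prints "task_h_load = 0": the balancer sees a
		 * "weightless" task and keeps pulling */
		printf("task_h_load = %llu\n", (unsigned long long)load);
		return 0;
	}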
>
> The cfs_rq->h_load is incorrect and might be too small.
> The original idea of cfs_rq::last_h_load_update is to not update
> cfs_rq::h_load more than once per jiffy.
> When the two CPUs pull from each other in pick_next_task_fair(),
> IRQs are disabled and, as a result, jiffies does not advance.
> (The other CPUs wait for the runqueue locks held by the two CPUs,
> so ALL CPUs end up with IRQs disabled.)
This cannot be true; because the loop drops rq->lock, so other CPUs
should have an opportunity to acquire the lock and make progress.
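For context, a heavily abridged sketch of idle_balance() from kernels
of that era (the exact code differs, but the lock is dropped around the
balancing loop):

	raw_spin_unlock(&this_rq->lock);
	/* other CPUs can take this_rq->lock here, and whichever CPU
	 * runs the timer tick can still advance jiffies */

	update_blocked_averages(this_cpu);
	rcu_read_lock();
	for_each_domain(this_cpu, sd) {
		...
		pulled_task = load_balance(this_cpu, this_rq,
					   sd, CPU_NEWLY_IDLE,
					   &continue_balancing);
		...
	}
	rcu_read_unlock();

	raw_spin_lock(&this_rq->lock);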
> Solution:
> cfs_rq::h_load might not get updated while IRQs are disabled,
> so use sched_clock() instead of jiffies.
>
> Signed-off-by: YT Chang <yt.chang@...iatek.com>
> ---
> kernel/sched/fair.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 83ab35e..231c53f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7578,9 +7578,11 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
> {
> struct rq *rq = rq_of(cfs_rq);
> struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
> - unsigned long now = jiffies;
> + u64 now = sched_clock_cpu(cpu_of(rq));
> unsigned long load;
>
> + now = now * HZ >> 30;
> +
> if (cfs_rq->last_h_load_update == now)
> return;
>
This is disgusting and wrong. That is not the correct relation between
sched_clock() and jiffies.
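To spell out the rate error alone (a standalone illustration assuming
HZ=250): a shift by 30 divides by 2^30 = 1073741824 rather than by the
10^9 nanoseconds in a second, so the result runs roughly 7% slow, and
that is before considering that sched_clock() is per-CPU and has no
defined offset against jiffies:

	#include <stdio.h>
	#include <stdint.h>

	#define HZ 250				/* assumed for illustration */
	#define NSEC_PER_SEC 1000000000ULL

	int main(void)
	{
		uint64_t ns = 10 * NSEC_PER_SEC; /* 10s of sched_clock() time */

		/* exact nanoseconds-to-ticks conversion divides by 10^9 */
		uint64_t exact = ns * HZ / NSEC_PER_SEC;

		/* ">> 30" divides by 2^30 instead, so it runs ~7% slow */
		uint64_t shifted = (ns * HZ) >> 30;

		/* prints: exact=2500 shifted=2328 */
		printf("exact=%llu shifted=%llu\n",
		       (unsigned long long)exact,
		       (unsigned long long)shifted);
		return 0;
	}

The kernel already has nsecs_to_jiffies() for this conversion, and even
a correct conversion would not make a per-CPU clock agree with the
global jiffies counter.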