Message-ID: <20200330104415.GF20696@hirez.programming.kicks-ass.net>
Date: Mon, 30 Mar 2020 12:44:15 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Huaixin Chang <changhuaixin@...ux.alibaba.com>
Cc: linux-kernel@...r.kernel.org, shanpeic@...ux.alibaba.com,
yun.wang@...ux.alibaba.com, xlpang@...ux.alibaba.com,
mingo@...hat.com, bsegall@...gle.com, chiluk+linux@...eed.com,
vincent.guittot@...aro.org
Subject: Re: [PATCH v3] sched/fair: Fix race between runtime distribution and assignment
On Fri, Mar 27, 2020 at 11:26:25AM +0800, Huaixin Chang wrote:
> Currently, there is a potential race between distribute_cfs_runtime()
> and assign_cfs_rq_runtime(). The race happens when cfs_b->runtime is
> read, runtime is distributed without holding the lock, and afterwards
> there is not enough runtime left to charge the distributed amount
> against. This is because assign_cfs_rq_runtime() might be called during
> distribution and use cfs_b->runtime at the same time.
>
> Fibtest is the tool used to test this race. Assume all gcfs_rq are
> throttled and the cfs period timer runs. Slow threads might run and
> sleep, returning unused cfs_rq runtime and keeping min_cfs_rq_runtime
> in their local pool. If all this happens sufficiently quickly,
> cfs_b->runtime will drop a lot. If the runtime distributed is large
> too, over-use of runtime happens.
>
> Runtime over-use of about 70 percent of quota is seen when we run
> fibtest on a 96-core machine. We run fibtest with 1 fast thread and
> 95 slow threads in the test group, configure a 10ms quota for this
> group, and see that the CPU usage of fibtest is 17.0%, which is far
> more than the expected 10%.
>
> On a smaller machine with 32 cores, we also run fibtest with 96
> threads. CPU usage is more than 12%, which is also more than the
> expected 10%. This shows that on similar workloads, this race does
> affect CPU bandwidth control.
>
> Solve this by holding the lock inside distribute_cfs_runtime().
>
> Fixes: c06f04c70489 ("sched: Fix potential near-infinite distribute_cfs_runtime() loop")
> Signed-off-by: Huaixin Chang <changhuaixin@...ux.alibaba.com>
> Reviewed-by: Ben Segall <bsegall@...gle.com>
Thanks!
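
For readers following the thread, the race described in the quoted
changelog can be sketched as a small single-threaded model. This is not
the kernel code: struct pool, pool_assign(), distribute_snapshot() and
distribute_locked() are hypothetical names, and the "concurrent"
parameter stands in for runtime that another CPU takes via
assign_cfs_rq_runtime() while the distribution is in flight. The first
variant distributes from a stale snapshot and charges the pool
afterwards; the second takes every grant from the live pool, as it
would when holding the lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a bandwidth pool (not kernel code). */
struct pool {
	uint64_t runtime;    /* remaining global runtime, like cfs_b->runtime */
	uint64_t handed_out; /* total runtime granted to local queues */
};

/* Models a local queue taking up to `want` from the live pool. */
static uint64_t pool_assign(struct pool *p, uint64_t want)
{
	uint64_t got = want < p->runtime ? want : p->runtime;

	p->runtime -= got;
	p->handed_out += got;
	return got;
}

/*
 * Racy scheme: snapshot the pool, distribute from the snapshot with the
 * lock dropped, and charge the pool only afterwards. `concurrent` is the
 * runtime another CPU takes in between. Returns the total handed out.
 */
static uint64_t distribute_snapshot(struct pool *p, const uint64_t *deficit,
				    int n, uint64_t concurrent)
{
	assert(p && deficit);

	uint64_t snapshot = p->runtime;
	uint64_t remaining = snapshot;

	pool_assign(p, concurrent); /* happens while the lock is dropped */

	for (int i = 0; i < n && remaining > 0; i++) {
		uint64_t give = deficit[i] < remaining ? deficit[i] : remaining;

		remaining -= give;
		p->handed_out += give;
	}

	/* Charge after the fact; the pool may no longer cover it. */
	uint64_t used = snapshot - remaining;

	p->runtime -= used < p->runtime ? used : p->runtime;
	return p->handed_out;
}

/*
 * Fixed scheme: every grant comes out of the live pool, so the total
 * handed out can never exceed what the pool held.
 */
static uint64_t distribute_locked(struct pool *p, const uint64_t *deficit,
				  int n, uint64_t concurrent)
{
	assert(p && deficit);

	pool_assign(p, concurrent);

	for (int i = 0; i < n && p->runtime > 0; i++)
		pool_assign(p, deficit[i]);
	return p->handed_out;
}
```

With a pool of 10 units, three throttled queues each wanting 4, and 5
units taken concurrently, the snapshot variant hands out 15 units in
total while the locked variant stops at 10.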