Message-ID: <ac08ae7b72eb7feb39d424ac7f56ce558c808d2b.camel@linux.intel.com>
Date: Thu, 18 Jul 2024 12:42:37 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: "zhaowenhui (A)" <zhaowenhui8@...wei.com>, Ingo Molnar
<mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Juri Lelli
<juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
<mgorman@...e.de>, Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>, "open list:SCHEDULER"
<linux-kernel@...r.kernel.org>
Cc: zhangqiao22@...wei.com, tanghui20@...wei.com
Subject: Re: [BUG REPORT] sched/rt: Inaccurate numerical calculation in
rt_runtime_us constraints
On Thu, 2024-07-18 at 21:02 +0800, zhaowenhui (A) wrote:
> Hello,
> Recently, we found that the constraints on cgroup rt_runtime_us are not
> precise enough in some cases. For example:
>
> (1)
> create a father cgroup and a child cgroup, and we exec:
> echo 1048577 > /sys/fs/cgroup/cpu/father/cpu.rt_period_us
> echo 1048577 > /sys/fs/cgroup/cpu/father/child/cpu.rt_period_us
> echo 0 > /sys/fs/cgroup/cpu/father/cpu.rt_runtime_us
> echo 1 > /sys/fs/cgroup/cpu/father/child/cpu.rt_runtime_us
>
> (2)
> create a father cgroup and two child cgroups, and we exec:
> echo 20000 > /sys/fs/cgroup/cpu/father/cpu.rt_runtime_us
> echo 10000 > /sys/fs/cgroup/cpu/father/child1/cpu.rt_runtime_us
> echo 10001 > /sys/fs/cgroup/cpu/father/child2/cpu.rt_runtime_us
>
> Logically speaking, the sum of the child cgroups' rt_runtime_us should not
> exceed the father's rt_runtime_us, but both cases above are actually
> accepted, because in to_ratio(), "div64_u64(runtime << BW_SHIFT, period)"
> discards the remainder. So this can happen if rt_period_us is large or if
> the remainders of many child cgroups are discarded.
>
> That said, it does not do much harm and does not seem easy to fix, so
> I am reporting it to see what we can do about it.
The loss in precision is about 1/(1<<BW_SHIFT), roughly 1 part per million.
So unless you have tens of thousands of rt cgroups, over-allowing 1 part
per million bandwidth per cgroup is probably not a practical concern.
Is there an actual scenario you encountered where this becomes a problem?
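
For illustration, here is a minimal userspace sketch of the truncation
being discussed. It is not the kernel code: to_ratio_sketch() is a
hypothetical stand-in for to_ratio(), BW_SHIFT is assumed to be 20, and
case (2) assumes rt_period_us was left at 1000000 for all three cgroups
(the report does not state the period). Values are in microseconds as
written to the cgroup files.

```c
#include <stdint.h>

#define BW_SHIFT 20 /* assumption: same shift as the kernel uses */

/*
 * Hypothetical userspace stand-in for to_ratio(): fixed-point
 * runtime/period with the remainder dropped by integer division.
 */
static uint64_t to_ratio_sketch(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << BW_SHIFT) / period;
}

/*
 * Returns 1 if the summed child ratios do not exceed the parent's
 * ratio, i.e. the configuration would pass a sum-versus-parent check
 * done on the truncated values.
 */
static int children_fit(uint64_t parent, const uint64_t *children, int n)
{
	uint64_t sum = 0;

	for (int i = 0; i < n; i++)
		sum += children[i];
	return sum <= parent;
}
```

Under these assumptions, case (1) gives to_ratio_sketch(1048577, 1) == 0,
so the child's 1us of runtime is indistinguishable from the father's 0.
In case (2) the father's 20000 truncates to 20971, while 10000 and 10001
truncate to 10485 and 10486, which sum to 20971 and therefore fit even
though 20001 > 20000.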
Tim
>
> ---
> Regards
> Zhao Wenhui
>