Message-ID: <67eeb47c-ae23-1389-bb52-f9cfb3206741@arm.com>
Date: Thu, 30 Mar 2023 15:34:02 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Waiman Long <longman@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Qais Yousef <qyousef@...alina.io>, Tejun Heo <tj@...nel.org>,
Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Hao Luo <haoluo@...gle.com>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
Steven Rostedt <rostedt@...dmis.org>,
luca.abeni@...tannapisa.it, claudio@...dence.eu.com,
tommaso.cucinotta@...tannapisa.it, bristot@...hat.com,
mathieu.poirier@...aro.org,
Vincent Guittot <vincent.guittot@...aro.org>,
Wei Wang <wvw@...gle.com>, Rick Yiu <rickyiu@...gle.com>,
Quentin Perret <qperret@...gle.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Sudeep Holla <sudeep.holla@....com>
Subject: Re: [PATCH 6/7] cgroup/cpuset: Protect DL BW data against parallel
cpuset_attach()
On 29/03/2023 18:02, Waiman Long wrote:
> It is possible to have parallel attach operations to the same cpuset in
> progress. To avoid possible corruption of single set of DL BW data in
> the cpuset structure, we have to disallow parallel attach operations if
> DL tasks are present. Attach operations can still proceed in parallel
> as long as no DL tasks are involved.
>
> This patch also stores the CPU where DL BW is allocated and free that BW
> back to the same CPU in case cpuset_can_attach() is called.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
> kernel/cgroup/cpuset.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 05c0a1255218..555a6b1a2b76 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -199,6 +199,7 @@ struct cpuset {
> */
> int nr_deadline_tasks;
> int nr_migrate_dl_tasks;
> + int dl_bw_cpu;
As I mentioned in
https://lkml.kernel.org/r/cdede77a-5dc5-8933-a444-a2046b074b12@arm.com
IMHO any CPU of the cpuset is fine, since an exclusive cpuset and its
related root_domain (the container for the DL BW accounting data) are
congruent in terms of cpumask, so there is no need to store the CPU.
> u64 sum_migrate_dl_bw;
>
> /* Invalid partition error code, not lock protected */
> @@ -2502,6 +2503,16 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
> if (cpumask_empty(cs->effective_cpus))
> goto out_unlock;
>
> + /*
> + * If there is another parallel attach operations in progress for
> + * the same cpuset, the single set of DL data there may get
> + * incorrectly overwritten. So parallel operations are not allowed
> + * if DL tasks are present.
> + */
> + ret = -EBUSY;
> + if (cs->nr_migrate_dl_tasks)
> + goto out_unlock;
(1)
> cgroup_taskset_for_each(task, css, tset) {
> ret = task_can_attach(task);
> if (ret)
> @@ -2511,6 +2522,9 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
> goto out_unlock;
>
> if (dl_task(task)) {
> + if (cs->attach_in_progress)
> + goto out_unlock;
(2)
Just to check that I get this right, two bail-out conditions are
necessary because:
(1) prevents any new cs attach if a DL cs attach is already in progress,
and
(2) prevents a new DL cs attach if a non-DL cs attach is already in
progress.
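To make the two conditions concrete, here is a minimal user-space sketch
of how I read them. It is not the kernel code: struct cpuset_model,
model_can_attach() and model_attach_done() are made-up names, locking is
left out, and the DL count is committed only on success instead of being
rolled back on the error path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant struct cpuset fields. */
struct cpuset_model {
	int attach_in_progress;		/* attaches accepted but not yet finished */
	int nr_migrate_dl_tasks;	/* DL tasks counted by a pending DL attach */
};

/*
 * Models the two bail-outs in cpuset_can_attach() for a taskset of
 * ntasks tasks, where is_dl[i] says whether task i is a DL task.
 */
static int model_can_attach(struct cpuset_model *cs, const bool *is_dl,
			    int ntasks)
{
	int nr_dl = 0;

	/* (1) A DL attach is already pending: refuse any new attach. */
	if (cs->nr_migrate_dl_tasks)
		return -EBUSY;

	for (int i = 0; i < ntasks; i++) {
		if (!is_dl[i])
			continue;
		/* (2) Another attach is pending: refuse a new DL attach. */
		if (cs->attach_in_progress)
			return -EBUSY;
		nr_dl++;
	}

	/* Commit only once every task has been accepted. */
	cs->nr_migrate_dl_tasks = nr_dl;
	cs->attach_in_progress++;
	return 0;
}

/* Models the end of an attach (or its cancellation). */
static void model_attach_done(struct cpuset_model *cs)
{
	cs->attach_in_progress--;
	cs->nr_migrate_dl_tasks = 0;
}
```

With this model, two non-DL attaches can overlap, a DL attach is refused
while either of them is pending, and once a DL attach is pending every
further attach gets -EBUSY until it completes.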
> cs->nr_migrate_dl_tasks++;
> cs->sum_migrate_dl_bw += task->dl.dl_bw;
> }
[...]