[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <172eaa76-0ee9-8c7d-df33-77e9696bb236@amd.com>
Date: Mon, 29 Jul 2024 13:18:12 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Chuyi Zhou <zhouchuyi@...edance.com>
CC: <chengming.zhou@...ux.dev>, <linux-kernel@...r.kernel.org>,
<tj@...nel.org>, <hannes@...xchg.org>, <mkoutny@...e.com>,
<surenb@...gle.com>, <peterz@...radead.org>
Subject: Re: [PATCH] psi: Inherit parent cgroup psi enable state
Hello Chuyi,
On 7/29/2024 12:01 PM, Chuyi Zhou wrote:
> Hello,
>
> 在 2024/7/29 12:45, K Prateek Nayak 写道:
>> Hello Chuyi,
>>
>> On 7/29/2024 9:11 AM, Chuyi Zhou wrote:
>>> Currently when a parent cgroup disables psi through cgroup.pressure, newly
>>> created child cgroups do not inherit the psi state of the parent cgroup.
>>>
>>> This patch tries to solve this issue. When a child cgroup is created, it
>>> would inherit the psi enabled state of the parent in group_init().
>>> Once the enable state is found to be false in the css_populate_dir(), the
>>> {cpu, io, memory}.pressure files will be hidden using cgroup_file_show().
>>>
>>> Signed-off-by: Chuyi Zhou <zhouchuyi@...edance.com>
>>> ---
>>> kernel/cgroup/cgroup.c | 21 +++++++++++++++++++--
>>> kernel/sched/psi.c | 4 ++--
>>> 2 files changed, 21 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>> index c8e4b62b436a4..775fe528efcad 100644
>>> --- a/kernel/cgroup/cgroup.c
>>> +++ b/kernel/cgroup/cgroup.c
>>> @@ -1719,6 +1719,24 @@ static void css_clear_dir(struct cgroup_subsys_state *css)
>>> }
>>> }
>>> +static int populate_psi_files(struct cgroup_subsys_state *css)
>>> +{
>>> + struct cgroup *cgrp = css->cgroup;
>>> + int ret, i;
>>> +
>>> + ret = cgroup_addrm_files(css, cgrp, cgroup_psi_files, true);
>>> + if (ret < 0)
>>> + return ret;
>>> +
>>> + if (cgrp->psi && !cgrp->psi->enabled) {
>>> + for (i = 0; i < NR_PSI_RESOURCES; i++)
>>> + cgroup_file_show(&cgrp->psi_files[i], 0);
>>> + }
>>> +
>>> + return ret;
>>> +}
>>> +
>>> +
>>> /**
>>> * css_populate_dir - create subsys files in a cgroup directory
>>> * @css: target css
>>> @@ -1742,8 +1760,7 @@ static int css_populate_dir(struct cgroup_subsys_state *css)
>>> return ret;
>>> if (cgroup_psi_enabled()) {
>>> - ret = cgroup_addrm_files(css, cgrp,
>>> - cgroup_psi_files, true);
>>> + ret = populate_psi_files(css);
>>> if (ret < 0) {
>>> cgroup_addrm_files(css, cgrp,
>>> cgroup_base_files, false);
>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>> index 020d58967d4e8..d0aa17b368819 100644
>>> --- a/kernel/sched/psi.c
>>> +++ b/kernel/sched/psi.c
>>> @@ -180,7 +180,7 @@ static void group_init(struct psi_group *group)
>>> {
>>> int cpu;
>>> - group->enabled = true;
>>> + group->enabled = group->parent ? group->parent->enabled : true;
>>
>> Since this is only the init path, if the user later enables PSI
>> accounting for a parent, should it not re-evaluate it for the groups
>> down the hierarchy?
>>
>> Looking at "cgroup_pressure_write()", I could not spot it calling
>> "css_populate_dir()". Should it not walk the hierarchy and do a
>> "cgroup_file_show()" considering the changes in you patch?
>>
>> (P.S. I'm not too familiar with this piece of code so please do let me
>> know if I missed something obvious)
>
> Perhaps my description in the commit log was not clear enough. This patch is intended to make child cgroups inherit the state of the parent node *during initialization*.
Ah! i see. Thank you for clarifying :)
> The cgroup.pressure interface remains the same as before, only changing the enable state at the current level.
>
> In production environments, the overhead of PSI could be significant on some machines (with many deep levels of cgroups). For certain tasks (such as /sys/fs/cgroup/offline/pod_xxx), we may not need to monitor their PSI metrics. With this patch, after disabling /sys/fs/cgroup/offline/cgroup.pressure, any subsequently deployed offline pods will not enable PSI. Although users can disable PSI by traversing all cgroups under /sys/fs/cgroup/offline/, this may not be convenient enough.
>
> Thanks.
--
Thanks and Regards,
Prateek
Powered by blists - more mailing lists