linux-kernel - Re: [PATCH] psi: Inherit parent cgroup psi enable state

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <172eaa76-0ee9-8c7d-df33-77e9696bb236@amd.com>
Date: Mon, 29 Jul 2024 13:18:12 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Chuyi Zhou <zhouchuyi@...edance.com>
CC: <chengming.zhou@...ux.dev>, <linux-kernel@...r.kernel.org>,
	<tj@...nel.org>, <hannes@...xchg.org>, <mkoutny@...e.com>,
	<surenb@...gle.com>, <peterz@...radead.org>
Subject: Re: [PATCH] psi: Inherit parent cgroup psi enable state

Hello Chuyi,

On 7/29/2024 12:01 PM, Chuyi Zhou wrote:
> Hello,
> 
> 在 2024/7/29 12:45, K Prateek Nayak 写道:
>> Hello Chuyi,
>>
>> On 7/29/2024 9:11 AM, Chuyi Zhou wrote:
>>> Currently when a parent cgroup disables psi through cgroup.pressure, newly
>>> created child cgroups do not inherit the psi state of the parent cgroup.
>>>
>>> This patch tries to solve this issue. When a child cgroup is created, it
>>> would inherit the psi enabled state of the parent in group_init().
>>> Once the enable state is found to be false in the css_populate_dir(), the
>>> {cpu, io, memory}.pressure files will be hidden using cgroup_file_show().
>>>
>>> Signed-off-by: Chuyi Zhou <zhouchuyi@...edance.com>
>>> ---
>>>   kernel/cgroup/cgroup.c | 21 +++++++++++++++++++--
>>>   kernel/sched/psi.c     |  4 ++--
>>>   2 files changed, 21 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>> index c8e4b62b436a4..775fe528efcad 100644
>>> --- a/kernel/cgroup/cgroup.c
>>> +++ b/kernel/cgroup/cgroup.c
>>> @@ -1719,6 +1719,24 @@ static void css_clear_dir(struct cgroup_subsys_state *css)
>>>       }
>>>   }
>>> +static int populate_psi_files(struct cgroup_subsys_state *css)
>>> +{
>>> +    struct cgroup *cgrp = css->cgroup;
>>> +    int ret, i;
>>> +
>>> +    ret = cgroup_addrm_files(css, cgrp, cgroup_psi_files, true);
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>> +    if (cgrp->psi && !cgrp->psi->enabled) {
>>> +        for (i = 0; i < NR_PSI_RESOURCES; i++)
>>> +            cgroup_file_show(&cgrp->psi_files[i], 0);
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +
>>>   /**
>>>    * css_populate_dir - create subsys files in a cgroup directory
>>>    * @css: target css
>>> @@ -1742,8 +1760,7 @@ static int css_populate_dir(struct cgroup_subsys_state *css)
>>>                   return ret;
>>>               if (cgroup_psi_enabled()) {
>>> -                ret = cgroup_addrm_files(css, cgrp,
>>> -                             cgroup_psi_files, true);
>>> +                ret = populate_psi_files(css);
>>>                   if (ret < 0) {
>>>                       cgroup_addrm_files(css, cgrp,
>>>                                  cgroup_base_files, false);
>>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>>> index 020d58967d4e8..d0aa17b368819 100644
>>> --- a/kernel/sched/psi.c
>>> +++ b/kernel/sched/psi.c
>>> @@ -180,7 +180,7 @@ static void group_init(struct psi_group *group)
>>>   {
>>>       int cpu;
>>> -    group->enabled = true;
>>> +    group->enabled = group->parent ? group->parent->enabled : true;
>>
>> Since this is only the init path, if the user later enables PSI
>> accounting for a parent, should it not re-evaluate it for the groups
>> down the hierarchy?
>>
>> Looking at "cgroup_pressure_write()", I could not spot it calling
>> "css_populate_dir()". Should it not walk the hierarchy and do a
>> "cgroup_file_show()" considering the changes in you patch?
>>
>> (P.S. I'm not too familiar with this piece of code so please do let me
>>   know if I missed something obvious)
> 
> Perhaps my description in the commit log was not clear enough. This patch is intended to make child cgroups inherit the state of the parent node *during initialization*.

Ah! i see. Thank you for clarifying :)

> The cgroup.pressure interface remains the same as before, only changing the enable state at the current level.
> 
> In production environments, the overhead of PSI could be significant on some machines (with many deep levels of cgroups). For certain tasks (such as /sys/fs/cgroup/offline/pod_xxx), we may not need to monitor their PSI metrics. With this patch, after disabling /sys/fs/cgroup/offline/cgroup.pressure, any subsequently deployed offline pods will not enable PSI. Although users can disable PSI by traversing all cgroups under /sys/fs/cgroup/offline/, this may not be convenient enough.
> 
> Thanks.

-- 
Thanks and Regards,
Prateek