linux-kernel - Re: [RFC PATCH 2/2] sched/eevdf: Introduce a cgroup interface for slice

Message-ID: <cbc64062-f657-4163-9da2-6ed7414d20a7@linux.alibaba.com>
Date: Tue, 29 Oct 2024 14:49:51 +0800
From: Tianchen Ding <dtcccc@...ux.alibaba.com>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>
Subject: Re: [RFC PATCH 2/2] sched/eevdf: Introduce a cgroup interface for
 slice

On 2024/10/29 14:18, Tejun Heo wrote:
>> So This patch is trying to introduce a cgroup level interface.
> 
> If I'm reading the code correctly, the property can be set per task and is
> inherited when forking unless RESET_ON_FORK is set. I'm not sure the cgroup
> interface adds all that much:
> 
> - There's no inherent hierarchical or grouping behavior. I don't think it
>    makes sense for cgroup config to override per-thread configs.
> 
> - For cgroup-wide config, setting it in the seed process of the cgroup would
>    suffice in most cases. Changing it afterwards is more awkward but not
>    hugely so. If racing against forks is a concern, you can either use the
>    freezer or iterate until no new tasks are seen.
> 
> Thanks.
> 

However, we may want to set and keep different slices for processes inside the
same cgroup.

For example, in the rich container scenario (as Yongmei mentioned), the
administrator decides the CPU resources of a container: its weight (cpu.weight),
scope (cpuset.cpus), bandwidth (cpu.max), and also its **slice and preemption
priority** (cpu.fair_slice in this patch).

At the same time, the user may want to tune the processes inside the container
by setting a customized value (sched_attr::sched_runtime) for each process, and
the administrator should not overwrite the user's own config.

So cpu.fair_slice is for preemption competition across cgroups at the same
level, while sched_attr::sched_runtime can be used for processes inside the
same cgroup (a bit like cpu.weight vs. task nice).
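
To make the per-task half of that split concrete, here is a minimal sketch of
how a user inside the container might request a custom slice for one process.
It assumes a kernel whose EEVDF implementation honors
sched_attr::sched_runtime for SCHED_OTHER tasks. The names struct slice_attr,
make_slice_attr() and set_task_slice() are hypothetical, and the struct is a
local mirror of the original 48-byte sched_attr layout, not the kernel header
itself. The same call also applies sched_nice, so the caller passes the nice
value it wants the task to have.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sched.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>

/*
 * Local mirror of the first-version layout of the kernel's struct sched_attr.
 * The name is hypothetical and chosen to avoid clashing with libc headers
 * that may already declare struct sched_attr.
 */
struct slice_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;   /* requested slice, in nanoseconds */
	uint64_t sched_deadline;  /* unused for SCHED_OTHER */
	uint64_t sched_period;    /* unused for SCHED_OTHER */
};

static_assert(sizeof(struct slice_attr) == 48,
	      "must match the original sched_attr size");

/* Build the attribute block for a fair task with a custom slice. */
struct slice_attr make_slice_attr(uint64_t slice_ns, int nice)
{
	struct slice_attr a;

	memset(&a, 0, sizeof(a));
	a.size = sizeof(a);
	a.sched_policy = SCHED_OTHER;
	a.sched_nice = nice;          /* the syscall sets nice as well */
	a.sched_runtime = slice_ns;   /* assumption: honored as the slice */
	return a;
}

/*
 * Ask the kernel to apply the slice to one task (pid 0 = calling thread).
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_task_slice(pid_t pid, uint64_t slice_ns, int nice)
{
	struct slice_attr a = make_slice_attr(slice_ns, nice);

	return (int)syscall(SYS_sched_setattr, pid, &a, 0);
}
```

For example, set_task_slice(0, 3000000ULL, 0) would ask for a 3 ms slice for
the calling thread at nice 0. The kernel may clamp the requested value, and
nothing here touches cpu.fair_slice, which stays with the administrator.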

Thanks.