Message-ID: <YMJqd1JJcFTThH8j@hirez.programming.kicks-ass.net>
Date: Thu, 10 Jun 2021 21:39:35 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Waiman Long <llong@...hat.com>
Cc: Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Shuah Khan <shuah@...nel.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kselftest@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <guro@...com>, Phil Auld <pauld@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>
Subject: Re: [PATCH 2/5] cgroup/cpuset: Add new cpus.partition type with no
load balancing

On Thu, Jun 10, 2021 at 03:16:29PM -0400, Waiman Long wrote:
> On 6/10/21 2:50 PM, Peter Zijlstra wrote:
> > On Thu, Jun 03, 2021 at 05:24:13PM -0400, Waiman Long wrote:
> > > Cpuset v1 uses the sched_load_balance control file to determine if load
> > > balancing should be enabled. Cpuset v2 gets rid of sched_load_balance
> > > as its use may require disabling load balancing at cgroup root.
> > >
> > > For workloads that require very low latency like DPDK, the latency
> > > jitters caused by periodic load balancing may exceed the desired
> > > latency limit.
> > >
> > > When cpuset v2 is in use, the only way to avoid this latency cost is to
> > > use the "isolcpus=" kernel boot option to isolate a set of CPUs. After
> > > the kernel boot, however, there is no way to add or remove CPUs from
> > > this isolated set. For workloads that are more dynamic in nature, that
> > > means users have to provision enough CPUs for the worst case situation
> > > resulting in excess idle CPUs.
> > >
> > > To address this issue for cpuset v2, a new cpuset.cpus.partition type
> > > "root-nolb" is added which allows the creation of a cpuset partition with
> > > no load balancing. This will allow system administrators to dynamically
> > > adjust the size of the no load balancing partition to the current need
> > > of the workload without rebooting the system.
> > I'm confused, why do you need this? Just create a partition for each cpu.
> >
> From a management point of view, it is more cumbersome to do one cpu per
> partition. I have suggested this idea of 1 cpu per partition to the
> container developers, but they don't seem to like it.

Oh, because it then creates a cgroup tree per CPU and you get to move
tasks between cgroups?

OK I suppose.
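
For illustration only, the "one partition per cpu" layout discussed above
could be sketched roughly as below. This is not part of the patch or of the
thread. It assumes cgroup v2 is mounted at /sys/fs/cgroup, that the listed
CPUs are not used exclusively by another cpuset, and that the caller is
root. The helper names (plan_per_cpu_partitions, apply_writes) and the
"cpuN" cgroup naming are hypothetical. A partition holding a single CPU has
nothing to balance against, which is why it behaves like a no-load-balancing
set.

```python
import os

CGROUP_ROOT = "/sys/fs/cgroup"  # assumption: cgroup v2 unified mount point


def plan_per_cpu_partitions(cpus, root=CGROUP_ROOT, prefix="cpu"):
    """Return the (file, value) writes for one cpuset partition per CPU.

    Hypothetical helper: it only builds the plan and touches nothing.
    """
    # Enable the cpuset controller for children of the root cgroup.
    writes = [(os.path.join(root, "cgroup.subtree_control"), "+cpuset")]
    for cpu in cpus:
        cg = os.path.join(root, f"{prefix}{cpu}")  # e.g. /sys/fs/cgroup/cpu2
        # Give the child exactly one CPU ...
        writes.append((os.path.join(cg, "cpuset.cpus"), str(cpu)))
        # ... and turn it into a partition root (its own sched domain).
        writes.append((os.path.join(cg, "cpuset.cpus.partition"), "root"))
    return writes


def apply_writes(writes):
    """Apply a plan. Needs root; creates the per-CPU cgroups as required."""
    for path, value in writes:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    # Dry run: print the plan for CPUs 2 and 3 instead of applying it.
    for path, value in plan_per_cpu_partitions([2, 3]):
        print(f"echo {value} > {path}")
```

Growing or shrinking the isolated set then means creating or removing one
cgroup per CPU and moving tasks between those cgroups, which is the
management overhead raised in the reply above.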