Message-ID: <20240904130445.GI4723@noisy.programming.kicks-ass.net>
Date: Wed, 4 Sep 2024 15:04:45 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Waiman Long <longman@...hat.com>, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] sched/isolation: Add HK_FLAG_SCHED to nohz_full

On Wed, Sep 04, 2024 at 02:44:26PM +0200, Frederic Weisbecker wrote:
> On Tue, Sep 03, 2024 at 09:23:53PM -0400, Waiman Long wrote:
> > > After discussing with Peter lately, the rules should be:
> > > 
> > > 1) If a nohz_full CPU is part of a multi-CPU domain, then it should
> > >     be part of load balancing. Peter even says that nohz_full should be
> > >     forbidden in this case, because the tick plays a role in the
> > >     load balancing.
> > 
> > My understanding is that most users will use nohz_full together with isolcpus.
> > So nohz_full CPUs are also isolated and not in a sched domain. There may
> > still be users setting nohz_full without isolcpus though, but that should be
> > relatively rare.
> 
> Apparently there are users wanting to use isolation along with automatic
> containers deployments such as kubernetes, which doesn't seem to work
> well with isolcpus...

I've been proposing to get rid of isolcpus for at least the last 15
years or so. There just isn't a good reason to ever use it. We were
close and then the whole NOHZ_FULL thing came along.

You can create single CPU partitions using cpusets dynamically.
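
As a rough illustration of the dynamic route (a sketch under stated
assumptions, not a tested recipe): the snippet below assumes cgroup v2 is
mounted at /sys/fs/cgroup, that the kernel supports the "isolated" value of
cpuset.cpus.partition, and that the chosen CPU is not claimed by a sibling
cgroup. The helper names and the cgroup name "rt" are hypothetical; only the
control file names come from the cgroup v2 interface.

```python
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy mounted at the usual location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def plan_isolated_partition(name, cpu, root=CGROUP_ROOT):
    """Return the ordered (file, value) writes for a one-CPU isolated partition.

    Hypothetical helper: it only describes the writes, it does not do them.
    """
    root = Path(root)
    group = root / name
    return [
        # Enable the cpuset controller for children of the root cgroup.
        (root / "cgroup.subtree_control", "+cpuset"),
        # Give the new cgroup exactly one CPU.
        (group / "cpuset.cpus", str(cpu)),
        # Turn it into a partition root with load balancing disabled.
        (group / "cpuset.cpus.partition", "isolated"),
    ]


def apply_plan(name, cpu, root=CGROUP_ROOT):
    """Create the cgroup directory and perform the writes (needs root)."""
    (Path(root) / name).mkdir(exist_ok=True)
    for path, value in plan_isolated_partition(name, cpu, root):
        path.write_text(value)
```

Calling apply_plan("rt", 3) as root would attempt to carve CPU 3 out into
its own partition at runtime; writing "member" back to
cpuset.cpus.partition returns the CPU to the parent, which is the part a
boot-time isolcpus= setting cannot do.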

> > Anyway, all these nohz_full/kernel_noise settings will only apply to CPUs in
> > isolated cpuset partitions which will not be in a sched domain.
> > 
> > > 
> > > 2) Otherwise, if CPU is not part of a domain or it is the only CPU of all its
> > >     domains, then it can be out of the load balancing machinery.
> > I am aware that a single-cpu domain is the same as being isolated with no
> > load balancing.
> 
> By the way, is it possible to have a single-cpu domain (sorry I'm a noob here)
> or does such a CPU always end up on a null domain?

IIRC they always end up with the null domain; but it's been a while. It
simply doesn't make much sense to have a 1 cpu domain. The way the
topology code works is by always building the full domain tree, and then
throwing away all levels that do not contribute, and in the 1 cpu case,
that would be all of them.

Look for 'degenerate' in kernel/sched/topology.c.
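
A toy model of that pruning, to make the 1 cpu case concrete. This is a
deliberately simplified sketch, not the kernel's logic: the real degenerate
checks in kernel/sched/topology.c also look at domain flags, which are
ignored here. The function name and the set-based representation are
assumptions made for illustration.

```python
def prune_degenerate(levels):
    """Drop domain levels that do not contribute to load balancing.

    levels: the CPU span of each topology level for one CPU, innermost
    first (e.g. SMT, then MC, then package). Simplified model: a level is
    dropped if it spans a single CPU, or if it spans exactly the same CPUs
    as the last level that was kept.
    """
    kept = []
    for span in levels:
        span = frozenset(span)
        if len(span) <= 1:
            continue  # one CPU: nothing to balance against
        if kept and kept[-1] == span:
            continue  # same span as the level below: adds nothing
        kept.append(span)
    return kept


# A CPU alone in its partition: every level spans just that CPU.
single = prune_degenerate([{0}, {0}, {0}])

# A CPU with no SMT sibling in a 4-CPU package.
multi = prune_degenerate([{0}, {0, 1, 2, 3}, {0, 1, 2, 3}])
```

In this model `single` comes out empty, which corresponds to the null
domain, while `multi` keeps one level spanning CPUs 0-3.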

> > > 
> > > I'm a bit scared about rule 1) because I know there are existing users of
> > > nohz_full on multi-CPU domains... So I feel a bit trapped.
> > 
> > As stated before, this is not a common use case.
> 
> Not sure and anyway it's not a forbidden usecase. But this is anyway outside
> the scope of this patchset.

Most crucially, it is a completely broken setup. It doesn't actually
work well.

Taking it away will force people to fix their broken setups. That's a good
thing, no?

> > The isolcpus boot option is deprecated, as stated in kernel-parameters.txt.
> 
> We should undeprecate it, apparently it's still widely used. Perhaps by people
> who can't afford to use cpusets/cgroups.

What is the actual problem with using cpusets? At the very least the
whole nohz_full thing needs to be moved into cpusets so it isn't a fixed
boot time thing anymore.

> > My plan is to deprecate nohz_full as well once we are able to make dynamic
> > CPU isolation via cpuset work almost as well as isolcpus + nohz_full.
> 
> You can't really deprecate such a kernel boot option unfortunately. Believe me
> I wish we could.

Why not? As I said, the only thing that's kept it around, and worse,
made it more popular again, is this nohz_full nonsense. That never
should've used isolcpus, but that's not something we can do anything
about now.

Rigid, boot time only things are teh suck.
