Date: Thu, 16 May 2024 17:02:51 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Yun Levi <ppbuk5246@...il.com>, Joel Fernandes <joel@...lfernandes.org>,
	Vineeth Pillai <vineeth@...byteword.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	anna-maria@...utronix.de, mingo@...nel.org, tglx@...utronix.de,
	Markus.Elfring@....de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] time/tick-sched: idle load balancing when nohz_full
 cpu becomes idle.

On Thu, May 16, 2024 at 04:45:04PM +0200, Peter Zijlstra wrote:
> On Thu, May 16, 2024 at 04:23:31PM +0200, Frederic Weisbecker wrote:
> > On Thu, May 16, 2024 at 04:00:03PM +0200, Peter Zijlstra wrote:
> > > > If I make you annoyed I'm sorry in advance but let me clarify please.
> > > > 
> > > > 1. In case of none-HK-TICK-housekeeping cpu (a.k.a nohz_full cpu),
> > > >     It should be on the null_domain. right?
> > > > 
> > > > 2. If (1) is true, when none-HK-TICK is set, should it set none-HK-DOMAIN
> > > >     to prevent on any sched_domain (cpusets filter out none-HK-DOMAIN cpu)?
> > > > 
> > > > 3. If (1) is true, Is HK_SCHED still necessary? There seems to be no use case
> > > >     and the check for this can be replaced by on_null_domain().
> > > 
> > > I've no idea about all those HK knobs, it's all insane if you ask me.
> > > 
> > > Frederic, afaict all the HK_ goo in kernel/sched/fair.c is total
> > > nonsense, can you please explain?
> > 
> > Yes. Lemme unearth this patch:
> > https://lore.kernel.org/all/20230203232409.163847-2-frederic@kernel.org/
> 
> AFAICT we need more cleanups.

Well, we need to start somewhere :-)

> 
> > Because all we need now is:
> > 
> > _ HK_TYPE_KERNEL_NOISE: nohz_full= or isolcpus=nohz
> > _ HK_TYPE_DOMAIN: isolcpus=domain (or classic isolcpus= alone)
> 
> What does this do?

So housekeeping_cpumask(HK_TYPE_KERNEL_NOISE) will return all
the CPUs not in nohz_full=.

That is, all the CPUs that do all the housekeeping work on behalf
of nohz_full CPUs (unbound workqueues and timers, etc...).

Then in a similar way housekeeping_cpumask(HK_TYPE_DOMAIN) returns all
the CPUs not in isolcpus= (the isolcpus= ones being on_null_domain()).
Perhaps that one is confusing and we should just have a simple
isolcpus_cpumask instead?
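
To make the "housekeeping mask is the complement of the isolated set"
idea concrete, here is a minimal user-space sketch. It is a toy model,
not kernel code: the names toy_cpumask, struct hk_state, hk_cpumask()
and hk_test_cpu() are hypothetical, and one bit per CPU in a 64-bit word
stands in for the real cpumask API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask;

/* The two types discussed here; the real enum has more entries. */
enum hk_type {
	HK_TYPE_KERNEL_NOISE,	/* complement of nohz_full= */
	HK_TYPE_DOMAIN,		/* complement of isolcpus=domain */
	HK_TYPE_MAX
};

/* Hypothetical state: which CPUs exist and which are isolated per type. */
struct hk_state {
	toy_cpumask possible;
	toy_cpumask isolated[HK_TYPE_MAX];
};

static inline toy_cpumask cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (toy_cpumask)1 << cpu;
}

/* Housekeeping CPUs for a type: every possible CPU that is not isolated. */
static inline toy_cpumask hk_cpumask(const struct hk_state *hk,
				     enum hk_type type)
{
	assert(type < HK_TYPE_MAX);
	return hk->possible & ~hk->isolated[type];
}

/* Non-zero if @cpu does housekeeping work for @type. */
static inline int hk_test_cpu(const struct hk_state *hk, enum hk_type type,
			      unsigned int cpu)
{
	return (hk_cpumask(hk, type) & cpu_bit(cpu)) != 0;
}
```

With 8 CPUs, nohz_full=4-7 and isolcpus=domain,6-7, this model gives
0x0f for HK_TYPE_KERNEL_NOISE and 0x3f for HK_TYPE_DOMAIN.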

> 
> > _ HK_TYPE_MANAGED_IRQ: isolcpus=managed_irq
> > 
> > And that's it. Then let's remove HK_TYPE_SCHED that is unused. And then
> > lemme comment the HK_TYPE_* uses within sched/* within the same
> > patchset.
> 
> Please, I find this MISC and DOMAIN stuff confusing, wth does it do? It
> can't possibly be right.

MISC is actually part of what is going to become HK_TYPE_KERNEL_NOISE. It's
an artificial microfeature but it's actually the same as _WORKQUEUE, _TICK, _RCU,
_TIMER, etc., all of which are intended to be merged together.

> 
> > Just a question, correct me if I'm wrong, we don't want nohz_full= to ever
> > take the idle load balancer duty (this is what HK_TYPE_MISC prevents in
> > find_new_ilb) because the nohz_full CPU going back to userspace concurrently
> > doesn't want to be disturbed by a loose IPI telling it to do idle balancing. But
> > we still want nohz_full CPUs to be part of nohz.idle_cpus_mask so that the
> > idle balancing can be performed on them by a non isolated CPU. Right?
> 
> I'm confused, none of that makes sense. If you're part of a
> load-balancer, you're part of a load-balancer, no ifs buts or other
> nonsense.
> 
> idle load balancer is no different from regular load balancing.
> 
> Fundamentally, you can't disable the tick if you're part of a
> load-balance group, the load-balancer needs the tick.
> 
> The only possible way to use nohz_full is to not be part of a
> load-balancer, and the only way that is so is by having (lots of) single
> CPU partitions.

So you're suggesting that nohz_full should just be part of the whole
ilb machinery by default (that is, not fiddle with ilb internals) and
then it's up to CPU partitioning (through cpuset or isolcpus) to disable
ilb naturally. Right?
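
For reference, the two behaviours being compared can be sketched in
user space as below. This is only an illustration under assumptions:
toy_find_new_ilb() is a hypothetical stand-in that picks the lowest
idle CPU allowed by a housekeeping mask, and the 64-bit masks are not
the kernel's cpumask type. Passing an all-ones housekeeping mask models
"don't fiddle with ilb internals"; passing the mask of non-nohz_full
CPUs models the filtering done today.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of idle load balancer selection, not the kernel function.
 * @idle_cpus: CPUs currently in the idle mask (nohz_full ones included)
 * @hk_cpus:   CPUs allowed to take the ilb duty
 * @nr_cpus:   number of CPUs to scan, at most 64
 *
 * Returns the lowest eligible CPU, or -1 if there is none.
 */
static inline int toy_find_new_ilb(uint64_t idle_cpus, uint64_t hk_cpus,
				   unsigned int nr_cpus)
{
	uint64_t candidates = idle_cpus & hk_cpus;
	unsigned int cpu;

	assert(nr_cpus <= 64);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (candidates & ((uint64_t)1 << cpu))
			return (int)cpu;
	}
	return -1;
}
```

In this model, with CPUs 4-7 as nohz_full and only CPUs 4 and 5 idle,
the filtered variant finds no ilb while the unfiltered one picks CPU 4.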

Thanks.

