lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87sejew87r.ffs@tglx>
Date: Thu, 03 Jul 2025 00:15:04 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Christoph Lameter via B4 Relay <devnull+cl.gentwo.org@...nel.org>,
 Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker
 <frederic@...nel.org>, Ingo Molnar <mingo@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, sh@...two.org, Darren
 Hart <dvhart@...radead.org>, "Christoph Lameter (Ampere)" <cl@...two.org>
Subject: Re: [PATCH] Skew tick for systems with a large number of processors

Christoph!

On Wed, Jul 02 2025 at 12:42, Christoph Lameter via wrote:

Subject starts with a subsystem followed by a colon and then the short
log. That has been that way forever and is clearly documented. You're
not really new to kernel development and I pointed that out to you
before:

  https://lore.kernel.org/all/87o74m1oq7.ffs@tglx

No?

> From: "Christoph Lameter (Ampere)" <cl@...two.org>
>
> Synchronized ticks mean that all processors will simultaneously process
> ticks and enter the scheduler. So the contention increases as the number
> of cpu increases. The contention causes latency jitter that scales with
> the number of processors.
>
> Staggering the timer interrupt also helps mitigate voltage droop related
> issues that may be observed in SOCs with large core counts.
> See https://semiengineering.com/mitigating-voltage-droop/ for a more
> detailed explanation.
>
> Switch to skewed tick for systems with more than 64 processors.

This lacks a proper explanation why that does not have any negative side
effects on existing deployments and application scenarios.

> --- a/kernel/Kconfig.hz
> +++ b/kernel/Kconfig.hz

The tick related Kconfig options are in kernel/time/Kconfig

> +
> +config TICK_SKEW_LIMIT
> +	int
> +	default 64

That wants a

        range 0 NR_CPUS

or such

> +	help
> +	  If the kernel is booted on systems with a large number of cpus then the
> +	  concurrent execution of timer ticks causes long holdoffs due to
> +	  serialization. Synchrononous executions of interrupts can also cause
> +	  voltage droop in SOCs. So switch to skewed mode. This mechanism

What does 'So switch to skewed mode.' help the user to select any
useful value?

This wants to have a proper explanation for picking a value which is
understandable by mere mortals and not some useless "expert" word salad.

> +	  can be overridden by specifying "tick_skew=x" on the kernel command line.

Neither does it explain how that override affects the chosen value nor
update the documentation of the command line value to make users aware
of this behavioural change. For the casual reader this suggests, that
tick_skew=x allows to change that number on the kernel command line,
which it does not.

> -static int sched_skew_tick;
> +static int sched_skew_tick = -1;

What's this magic -1 here? Can we please have some obvious and
understandable define for this?

>  static int __init skew_tick(char *str)
>  {
> @@ -1572,6 +1572,16 @@ void tick_setup_sched_timer(bool hrtimer)
>  {
>  	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
>  
> +	/* Figure out if we should skew the tick */
> +	if (sched_skew_tick < 0) {

This is incompatible with the existing code, which is unfortunately
stupid already. Today 'tick_skew=-1' causes the tick to be skewed. Now
it gets a different meaning. Not that it matters much, but change logs
are supposed to mention user visible behavioural differences and argue
why they don't matter, no?

> +		if (num_possible_cpus() >= CONFIG_TICK_SKEW_LIMIT) {
> +			sched_skew_tick = 1;
> +			pr_info("Tick skewed mode enabled. Possible cpus %u > %u\n",
> +				num_possible_cpus(), CONFIG_TICK_SKEW_LIMIT);

I'm not convinced that this is useful, but that's the least of the issues.

> +		} else

The else clause wants curly brackets for symmetry.

> +			sched_skew_tick = 0;

The above aside. As you completely failed to provide at least the
minimal historical background in the change log, let me fill in the
blanks.

commit 3704540b4829 ("tick management: spread timer interrupt") added the
skew unconditionally in 2007 to avoid lock contention on xtime lock.

commit af5ab277ded0 ("clockevents: Remove the per cpu tick skew")
removed it in 2010 because the xtime lock contention was gone and the
skew affected the power consumption of slightly loaded _large_ servers.

commit 5307c9556bc1 ("tick: Add tick skew boot option") brought it back
with a command line option to address contention and jitter issues on
larger systems.

So while you preserved the behaviour of the command line option in the
most obscure way, you did not even make an attempt to explain why this
change does not bring back the issues which caused the removal in commit
af5ab277ded0 or why they are irrelevant today.

"Scratches my itch" does not work and you know that. This needs to be
consolidated both on the implementation side and also on the user
side.

Thanks for making me do your homework,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ