Message-ID: <87ms9lwscq.ffs@tglx>
Date: Thu, 03 Jul 2025 11:12:21 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Christoph Lameter via B4 Relay <devnull+cl.gentwo.org@...nel.org>,
 Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker
 <frederic@...nel.org>, Ingo Molnar <mingo@...nel.org>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, sh@...two.org, Darren
 Hart <dvhart@...radead.org>, Arjan van de Ven <arjan@...radead.org>
Subject: Re: [PATCH] Skew tick for systems with a large number of processors

On Wed, Jul 02 2025 at 17:25, Christoph Lameter wrote:
> On Thu, 3 Jul 2025, Thomas Gleixner wrote:
>
>> The above aside. As you completely failed to provide at least the
>> minimal historical background in the change log, let me fill in the
>> blanks.
>>
>> commit 3704540b4829 ("tick management: spread timer interrupt") added the
>> skew unconditionally in 2007 to avoid lock contention on xtime lock.
>
> Right but that was only one reason why the timer interrupts were
> staggered.

It was the main reason because all CPUs contended on xtime lock and
other global locks. The subsequent issues you describe were not
observable back then to the extent they are today for bloody obvious
reasons.

>> commit af5ab277ded0 ("clockevents: Remove the per cpu tick skew")
>> removed it in 2010 because the xtime lock contention was gone and the
>> skew affected the power consumption of slightly loaded _large_ servers.
>
> But then the tick also executes other code that can cause contention. Why
> merge such an obviously problematic patch without considering the reasons
> for the 2007 patch?

As I said above, the main reason was contention on xtime lock and some
other global locks. These contention issues had been resolved over time,
so the initial reason to have the skew was gone.

The power consumption issue was a valid reason to remove it and the
testing back then did not show any negative side effects.

The subsequently discovered issues were not observable back then, and
some of them were introduced by later code changes.

Obviously the patch is problematic in hindsight, but hindsight is always
20/20.

>> commit 5307c9556bc1 ("tick: Add tick skew boot option") brought it back
>> with a command line option to address contention and jitter issues on
>> larger systems.
>
> And then issues resulted because the scaling issues were not
> considered when merging the 2010 patch.

What are you trying to do here? Playing a blame game is not helping to
find a solution.

>> So while you preserved the behaviour of the command line option in the
>> most obscure way, you did not even make an attempt to explain why this
>> change does not bring back the issues which caused the removal in commit
>> af5ab277ded0 or why they are irrelevant today.
>
> As pointed out in the patch description: The synchronized tick (aside from
> the jitter) also causes power spikes on large core systems which can cause
> system instabilities.

That's a _NEW_ problem and has nothing to do with the power saving
concerns which led to af5ab277ded0. 

>> "Scratches my itch" does not work and you know that. This needs to be
>> consolidated both on the implementation side and also on the user
>> side.
>
> We can get to that but I at least need some direction on how to approach
> this and figure out the concerns that exist. Frankly my initial idea was
> just to remove the buggy patches since this caused a regression in
> performance and system stability but I guess there were power savings
> concerns.

Guessing is not a valid engineering approach, as you might know already.

It's not rocket science to validate whether these power saving concerns
still apply and to reach out to people who have been involved in this
and ask them to revalidate. I just Cc'ed Arjan for you.

> How can we address this issue in a better way then?

By analysing the consequences of flipping the default for skew_tick to
on. That can be evaluated upfront trivially, without a single line of
code change, by adding 'skew_tick=1' to the kernel command line, running
tests and asking others to help evaluate.
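
When collecting results from several testers, it helps to confirm that
each run really booted with the option set. A minimal sketch in Python,
assuming a Linux system that exposes the boot parameters in
/proc/cmdline. The helper name skew_tick_setting is hypothetical, the
parsing is a plain whitespace split that ignores quoted parameters, and
treating the last occurrence as the effective one is an assumption:

```python
from pathlib import Path


def skew_tick_setting(cmdline: str):
    """Return the value of the last skew_tick= token, or None if absent.

    Hypothetical helper: splits on whitespace only, so quoted
    parameters containing spaces are not handled.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "skew_tick" and sep:
            # Assumption: a later occurrence overrides an earlier one.
            value = val
    return value


if __name__ == "__main__":
    try:
        booted_with = skew_tick_setting(Path("/proc/cmdline").read_text())
    except OSError:
        booted_with = None
    if booted_with is None:
        print("skew_tick not set on the command line (kernel default applies)")
    else:
        print(f"skew_tick={booted_with}")
```

Recording this output next to each power or jitter measurement keeps the
skewed and unskewed runs from being mixed up.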

There is only a limited range of scenarios, which need to be looked at:

      - Big servers and the power saving issues on lightly loaded
        machines

      - Battery operated devices

      - Virtualization (guests)

That might not cover 100% of the use cases, but should be a good enough
coverage to base an informed decision on.

> The kernel should not come up all wobbly and causing power spikes
> every tick.

The kernel should not do a lot of things, but does them due to
historical decisions, which turn out to be suboptimal when technology
advances. The power spike problem simply did not exist 18 years ago, at
least not to the extent that it mattered or caused concerns.

If we could have predicted the future and the consequences of ad hoc
decisions, we wouldn't have had a BKL, which took only 20 years of
effort to get rid of (except for the well hidden leftovers in tty).

But what we learned from the past is to avoid hacky ad hoc workarounds,
which are guaranteed to just make the situation worse.

Thanks,

        tglx
