Message-ID: <20260205063714.2579-1-hdanton@sina.com>
Date: Thu, 5 Feb 2026 14:37:11 +0800
From: Hillf Danton <hdanton@...a.com>
To: 连子涵 <17317795071@....com>
Cc: tglx@...utronix.de,
"Christoph Lameter (Ampere)" <cl@...two.org>,
linux-kernel@...r.kernel.org
Subject: Re: [Question] Voltage droop from synchronized timer interrupts(tick) on many-core SoCs leads to system instability
On Thu, 5 Feb 2026 12:52:04 +0800 (CST) 连子涵 wrote:
> Hi all,
> We have observed a critical voltage droop issue on large-core-count SoC platforms (e.g., 64+ cores) that appears to stem directly from the synchronized periodic timer interrupts (ticks) in the Linux kernel.
>
> In our testing and power simulations, we found that:
> When all CPU cores enter the timer interrupt handler simultaneously, there is a sharp, instantaneous power surge and continuous power fluctuations during the interrupt handling window (which lasts several microseconds), leading to significant voltage droop. In severe cases, this droop can cause system instability or even prevent the OS from booting.
>
> We understand that enabling skew_tick=1 effectively mitigates this by staggering the per-CPU tick timers. However, in certain deployment scenarios, modifying any kernel boot parameter—including skew_tick—is not permitted.
>
> Given this constraint, we would greatly appreciate your insights on the following technical questions:
> 1. Why does the timer interrupt path consume so much power and exhibit such large instantaneous variations? Our power simulation shows that the average power during timer interrupt handling is comparable to that of the Dhrystone benchmark.
> 2. What is the typical duration of a single timer interrupt handler (tick_nohz_handler, etc.) on a modern x86 or ARM core? Is it generally on the order of a few microseconds?
> 3. Beyond skew_tick=1, are there other kernel mechanisms or runtime strategies that could reduce the power impact of synchronized timer events? Are there plans in future kernel versions to address this issue more fundamentally—especially for many-core platforms?
>
>
> Thank you very much for your time and expertise.
>
This sounds like a known issue; see the comments from the 2025 thread [1].
[1] Subject: Re: [PATCH] Skew tick for systems with a large number of processors
https://lore.kernel.org/lkml/87sejew87r.ffs@tglx/
>
> Best regards,
> Zihan Lian <17317795071@....com>
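
For readers unfamiliar with the staggering that skew_tick=1 performs, the arithmetic can be modelled roughly as below. This is only a sketch under stated assumptions: it assumes each CPU's tick is delayed by an equal share of half a tick period, which is how the mainline skew is commonly described, and the exact formula may differ between kernel versions. The function name skew_offsets_ns and the HZ value of 250 are hypothetical and chosen for illustration; nothing here is kernel code.

```python
def skew_offsets_ns(num_cpus: int, hz: int = 250) -> list[int]:
    """Model per-CPU tick offsets in nanoseconds.

    Assumption: CPUs are spread evenly across half of one tick period,
    so CPU n fires n * step_ns after CPU 0.
    """
    if num_cpus <= 0 or hz <= 0:
        raise ValueError("num_cpus and hz must be positive")
    tick_ns = 1_000_000_000 // hz          # length of one tick period
    step_ns = (tick_ns // 2) // num_cpus   # spacing between adjacent CPUs
    return [cpu * step_ns for cpu in range(num_cpus)]


if __name__ == "__main__":
    offsets = skew_offsets_ns(64, hz=250)
    # Without skew every CPU would have offset 0 and fire together.
    print("first offsets (ns):", offsets[:4])
    print("last offset (ns):", offsets[-1])
```

Under these assumptions no two of the 64 CPUs share a tick instant, which is the property that flattens the simultaneous power surge described in the question.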