Message-ID: <20250210010925.2eyn42dj7mbft7em@airbuntu>
Date: Mon, 10 Feb 2025 01:09:25 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: John Stultz <jstultz@...gle.com>, LKML <linux-kernel@...r.kernel.org>,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Frederic Weisbecker <frederic@...nel.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Stephen Boyd <sboyd@...nel.org>, Yury Norov <yury.norov@...il.com>,
	Bitao Hu <yaoma@...ux.alibaba.com>,
	Andrew Morton <akpm@...ux-foundation.org>, kernel-team@...roid.com
Subject: Re: [RFC][PATCH 0/3] DynamicHZ: Configuring the timer tick rate at
 boot time

On 01/28/25 17:46, Thomas Gleixner wrote:

> > However, having to select the system HZ value at build time is
> > somewhat limiting. Distros have to make choices for their users
> > as to what the best HZ value would be balancing latency and
> > power usage.
> >
> > With Android, this is a major issue, as we have one GKI binary
> > that runs across a wide array of devices from top of the line
> > flagship phones to watches. Balancing the choice for HZ is
> > difficult, we currently have HZ=250, but some devices would love
> > to have HZ=1000, while other devices aren’t willing to pay the
> > power cost of 4x the timer slots, resulting in shorter idle
> > times.
> 
> The shorter idle times are because timer wheel timers wake up more
> accurately with HZ=1000 and not because the scheduler is more aggressive?

You'll all find another patch [1] in your inbox which changes the default tick
period to 1ms. It (hopefully) explains in detail why higher values are bad for
modern systems/workloads, and why power could be expectedly worse in a number
of use cases.

TL;DR: besides the reason above, the system is generally responsive. It was, by
accident, ending up stuck at lower frequencies and, in the case of HMP systems,
on smaller cores. With a faster tick we respond faster to demand, shifting
frequencies higher and biasing task placement towards bigger cores, fixing
performance issues along the way.

> Aside of that, using random HZ values is a pretty academic exercise and
> HZ=300 had been introduced for multimedia to cater for 30FPS. But that
> was long ago when high resolution timers, NOHZ and modern graphic
> devices did not exist.
> 
> I seriously doubt that HZ=300 has any actual advantage on modern
> systems. Sure, I know that SteamOS uses HZ=300, but AFAICT from public
> discussions this just caters to the HZ=300 myth and is not backed by any
> factual evidence that HZ=300 is so superior. Quite the contrary there
> are enough people who actually want HZ=1000 for better responsiveness.

These lower HZ values don't make sense to me either, and I think keeping them
just gives people more means to shoot themselves in the foot. HZ=100 irks me
in particular: to my humble understanding it is there to help throughput, but
I truly doubt this is a sensible configuration anymore. I believe those users
are better off setting base_slice to 10ms by default now while still keeping
HZ=1000. But there could be sensible reasons, beyond my knowledge, why it is
useful.
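
To make the arithmetic behind that suggestion concrete, here is a rough sketch
(plain userspace C, not kernel code; the helper names tick_period_us() and
ticks_per_slice() are made up for illustration). It only shows how many ticks
fall inside one slice for a given HZ. It assumes a fixed periodic tick and
ignores NOHZ and hrtimers.

```c
#include <stdint.h>

/* Tick period in microseconds for a given HZ (ticks per second). */
static inline uint32_t tick_period_us(uint32_t hz)
{
	return 1000000u / hz;
}

/*
 * Number of ticks that elapse during one slice of slice_us microseconds,
 * rounded up. Hypothetical helper, for illustration only.
 */
static inline uint32_t ticks_per_slice(uint32_t hz, uint32_t slice_us)
{
	uint32_t tick_us = tick_period_us(hz);

	return (slice_us + tick_us - 1) / tick_us;
}
```

With a 10ms slice, HZ=100 gives a single tick per slice, while HZ=1000 gives
ten. So the slice length can stay at 10ms while the tick still fires every
1ms within it.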

My biggest worry is that this could be a common source of 'sched latency'. The
ancient trade-offs don't hold anymore IMHO.

FWIW I had a series that converted HZ into a variable and did a good chunk of
work to convert a large number of users and conversion code. Sadly I lost it
:''( Though it seems you had another idea on how it should be done. So maybe
I shouldn't cry too hard over losing it.

Not trying to sidetrack the discussion, but I just wanted to point out that
maybe we should do some cleanup of the currently supported HZ values and think
harder about whether anything but HZ=1000 still makes sense. Having the dynamic
option is great, but my gut feeling is that we want to have a single HZ=1000
and fix the potential remaining issues that could make TICK disturb idle (if
any). I am not buying the throughput argument, but have little experience with
the monstrous machines. If folks with big machines see throughput issues, it'd
be great to learn about those. As I tried to argue in my patch [1], I believe
latency is dominant on modern systems. It has important consequences for
scheduler decisions. Peter and Vincent can correct me if I went astray.

[1] https://lore.kernel.org/lkml/20250210001915.123424-1-qyousef@layalina.io/
