Message-Id: <20250210061855.299323-1-15645113830zzh@gmail.com>
Date: Mon, 10 Feb 2025 14:18:56 +0800
From: zihan zhou <15645113830zzh@...il.com>
To: qyousef@...alina.io
Cc: 15645113830zzh@...il.com,
	bsegall@...gle.com,
	dietmar.eggemann@....com,
	juri.lelli@...hat.com,
	linux-kernel@...r.kernel.org,
	mgorman@...e.de,
	mingo@...hat.com,
	peterz@...radead.org,
	rostedt@...dmis.org,
	vincent.guittot@...aro.org,
	vschneid@...hat.com
Subject: Re: [PATCH V3 1/2] sched: Reduce the default slice to avoid tasks getting an extra tick

Thank you for your comments!

> I brought the topic up of these magic values with Peter and Vincent in LPC as
> I think this logic is confusing. I have nothing against your patch, but if the
> maintainers agree I am in favour of removing it completely in favour of setting
> it to a single value that is the same across all systems.

Here is my rough understanding:
When the number of cpus is small, the machine is usually a desktop. If
the slice is relatively large there, a task has to wait longer after
waking up, which may result in a poorer user experience. When there are
a large number of cpus, the machine is likely a server, and its tasks
are often batch workloads, so a slight increase in slice is acceptable.
A server also often has idle or lightly loaded cpus, so even with a
larger slice the interactive experience is not bad.
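
As a side note, here is a minimal standalone sketch of the log2 cpu
scaling under discussion, modeled on the SCHED_TUNABLESCALING_LOG path
in kernel/sched/fair.c. The 0.70 msec base is the reduced value this
series proposes (mainline defaulted to 0.75 msec); the helpers are my
simplifications, not kernel code:

#include <stdio.h>

static unsigned int ilog2_u(unsigned int x)
{
	unsigned int r = 0;

	while (x >>= 1)
		r++;
	return r;
}

/* Default slice in nanoseconds for a machine with @ncpus online cpus. */
static unsigned long long default_slice_ns(unsigned int ncpus)
{
	unsigned int cpus = ncpus < 8 ? ncpus : 8;	/* kernel caps at 8 */

	return 700000ULL * (1 + ilog2_u(cpus));
}

int main(void)
{
	unsigned int n;

	/* 1 cpu -> 0.70 ms (desktop), 8+ cpus -> 2.80 ms (server), so
	 * even the largest default stays below 3 ticks at HZ=1000. */
	for (n = 1; n <= 16; n *= 2)
		printf("%2u cpus: %.2f ms\n", n, default_slice_ns(n) / 1e6);
	return 0;
}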

> I do think 1ms makes more sense as a default value given how modern workloads
> need faster responsiveness across the board. But keeping it 3ms to avoid much
> disturbance would be fine. We could also make it equal to TICK_MSEC (this
> define doesn't exist) if it is higher than 3ms.

I don't quite understand this. What is TICK_MSEC? If HZ=1000, is
TICK_MSEC 1ms? And why would a slice higher than 3ms be set equal to a
1ms tick?

It seems that this value was originally designed for
sysctl_sched_wakeup_granularity. CFS did not force a task to switch
after running for that long, but EEVDF does, so a slice as small as
1ms looks bad for cache locality and is not good for batch workloads.
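
To illustrate the difference, here is a toy model of EEVDF's deadline
update, paraphrased from update_deadline() in kernel/sched/fair.c
(types and names are simplified and are not the kernel's):

#include <stdbool.h>
#include <stdio.h>

#define NICE_0_LOAD 1024UL

struct entity {
	long long vruntime;	/* ve_i: virtual time consumed */
	long long deadline;	/* vd_i: virtual deadline */
	unsigned long weight;	/* w_i: load weight (1024 = nice 0) */
};

/* r_i / w_i: a request of @slice ns costs this much virtual time. */
static long long vslice(long long slice, unsigned long weight)
{
	return slice * NICE_0_LOAD / weight;
}

/*
 * Called from update_curr(): once the running entity has consumed its
 * request (vruntime reached its deadline), start a new request and ask
 * for a reschedule -- the forced switch that CFS did not have.
 */
static bool update_deadline(struct entity *se, long long slice)
{
	if (se->vruntime < se->deadline)
		return false;	/* request not yet exhausted */

	/* EEVDF: vd_i = ve_i + r_i / w_i */
	se->deadline = se->vruntime + vslice(slice, se->weight);
	return true;		/* caller should resched_curr() */
}

int main(void)
{
	struct entity se = { 3000000LL, 3000000LL, NICE_0_LOAD };

	/* With a 1ms slice the next deadline is only 1ms away. */
	if (update_deadline(&se, 1000000LL))
		printf("resched; next deadline at %lld ns\n", se.deadline);
	return 0;
}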

> Do you use HZ=100 by the way? If yes, are you able to share the reasons? This
> configuration is too aggressive and bad for latencies and I doubt this tweak of
> the formula will make things better to them anyway.

I don't use HZ=100; in fact, all the machines I use have HZ=1000 and
more than 8 cpus, so I'm not familiar with some of those scenarios.

I think that if the slice is smaller than the tick (10ms), there is
not much difference between a 3ms slice and a 1ms slice for tick
preemption, but the two still differ for wake-up preemption. After
all, the wake-up path also goes through update_curr->update_deadline,
and wake-up latency should be slightly lower with a 1ms slice. So I
think that even with HZ=100, different slices still have an impact on
latency.
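
To make that concrete, a toy calculation (my own illustration,
assuming nice-0 tasks so virtual time equals wall time): by the first
tick at HZ=100 both slices have long expired, so tick preemption
cannot tell them apart, but a wakeup at t=2ms can:

#include <stdio.h>

int main(void)
{
	const long long tick_ns = 10000000;	/* HZ=100: 10ms tick */
	const long long wake_ns = 2000000;	/* wakeup at 2ms */
	const long long slices[] = { 1000000LL, 3000000LL };
	int i;

	for (i = 0; i < 2; i++) {
		long long deadline = slices[i];	/* nice 0: vd = ve + r */

		printf("slice %lld ms: expired by first tick: %s, "
		       "expired at 2 ms wakeup: %s\n",
		       slices[i] / 1000000,
		       tick_ns >= deadline ? "yes" : "no",
		       wake_ns >= deadline ? "yes" : "no");
	}
	return 0;
}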

