Message-ID: <tencent_0B09A23682D4B42D38B522ECD8C0A0ACE507@qq.com>
Date: Tue, 25 Feb 2025 18:15:35 +0800
From: Chen Yu <yu.chen.surf@...mail.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: zihan zhou <15645113830zzh@...il.com>, oe-lkp@...ts.linux.dev,
kernel test robot <oliver.sang@...el.com>, lkp@...el.com,
linux-kernel@...r.kernel.org, x86@...nel.org,
Peter Zijlstra <peterz@...radead.org>, aubrey.li@...ux.intel.com,
yu.c.chen@...el.com, yu.chen.surf@...il.com
Subject: Re: [tip:sched/core] [sched] 2ae891b826: hackbench.throughput 6.2%
regression

On 2025-02-25 at 10:45:35 +0100, Vincent Guittot wrote:
> On Tue, 25 Feb 2025 at 10:31, Chen Yu <yu.chen.surf@...mail.com> wrote:
> >
> > On 2025-02-25 at 10:32:13 +0800, kernel test robot wrote:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a 6.2% regression of hackbench.throughput on:
> > >
> > >
> > > commit: 2ae891b826958b60919ea21c727f77bcd6ffcc2c ("sched: Reduce the default slice to avoid tasks getting an extra tick")
> > > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
> > >
> > > [test failed on linux-next/master d4b0fd87ff0d4338b259dc79b2b3c6f7e70e8afa]
> > >
> > > testcase: hackbench
> > > config: x86_64-rhel-9.4
> > > compiler: gcc-12
> > > test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> > > parameters:
> > >
> > > nr_threads: 100%
> > > iterations: 4
> > > mode: process
> > > ipc: socket
> > > cpufreq_governor: performance
> > >
> > >
> > > 39754543 ± 3% +56.8% 62349308 hackbench.time.involuntary_context_switches
> > >
> >
> > This patch shrinks the base_slice so the deadline is reached earlier and tick
> > preemption triggers sooner, IIUC. For the hackbench case, my assumption is that hackbench seems to
>
> For systems with 8 or more CPUs, the base slice was
> 0.75*(1+ilog2(8)) = 3ms, which is exactly 3 tick periods at 1000Hz. But
> because the tick period is almost never fully accounted to the task,
> the task was running for 4 tick periods instead of 3. The normalized
> base_slice has been reduced from 0.75ms to 0.70ms, so the base slice
> becomes 2.8ms for 8 or more CPUs, and the main result is that tasks
> will run for 3 tick periods instead of 4.
>
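To make the arithmetic above concrete, here is a minimal user-space sketch
(just the numbers from this thread, not kernel code; the min(nr_cpus, 8)
clamp and the 1+ilog2() factor are my reading of the log2 tunable scaling
described above):

/*
 * User-space sketch of the base-slice scaling discussed above.
 * Assumption: the normalized base slice is scaled by
 * 1 + ilog2(min(nr_cpus, 8)); this is not the kernel source itself,
 * only the arithmetic from the thread.
 */
#include <stdio.h>

static unsigned int ilog2_u(unsigned int x)
{
	unsigned int r = 0;

	while (x >>= 1)
		r++;
	return r;
}

static double scaled_slice_ms(double normalized_ms, unsigned int nr_cpus)
{
	unsigned int cpus = nr_cpus < 8 ? nr_cpus : 8;	/* assumed clamp */

	return normalized_ms * (1 + ilog2_u(cpus));
}

int main(void)
{
	/* 128-CPU test machine from the report, HZ=1000 => 1ms tick */
	double old_slice = scaled_slice_ms(0.75, 128);	/* 3.0 ms */
	double new_slice = scaled_slice_ms(0.70, 128);	/* 2.8 ms */

	printf("old slice %.2fms, new slice %.2fms\n", old_slice, new_slice);
	/*
	 * With a 1ms tick that is almost never fully accounted to the
	 * task, a 3.0ms slice tends to expire on the 4th tick, while a
	 * 2.8ms slice expires on the 3rd.
	 */
	return 0;
}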
Thanks for the detailed explanation; now I understand the background.
It is a correct fix for tick preemption, and it slightly affects wakeup
preemption (smaller deadline in place_entity()).
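As a rough illustration of that wakeup-side effect (a sketch only: the
weight scaling done by calc_delta_fair() is dropped, i.e. nice-0 is
assumed, and the numbers are made up), a smaller slice gives the woken
task an earlier virtual deadline at enqueue, so deadline comparisons
favour it more often:

/*
 * Sketch of the wakeup-side effect: at enqueue, the woken entity's
 * virtual deadline is set to roughly vruntime + slice (weight scaling
 * omitted).  A smaller base slice means an earlier deadline and thus
 * more frequent wakeup preemption.
 */
#include <stdio.h>

struct fake_se {
	unsigned long long vruntime;	/* ns */
	unsigned long long deadline;	/* ns */
};

static void fake_place_entity(struct fake_se *se, unsigned long long slice_ns)
{
	/* simplified: the real code scales the slice by the entity's weight */
	se->deadline = se->vruntime + slice_ns;
}

int main(void)
{
	struct fake_se curr  = { .vruntime = 10400000, .deadline = 13400000 };
	struct fake_se woken = { .vruntime = 10500000 };

	fake_place_entity(&woken, 2800000);	/* new 2.8ms slice */
	printf("woken %llu vs curr %llu -> %s\n",
	       woken.deadline, curr.deadline,
	       woken.deadline < curr.deadline ? "preempt more likely" : "no change");

	fake_place_entity(&woken, 3000000);	/* old 3.0ms slice */
	printf("woken %llu vs curr %llu -> %s\n",
	       woken.deadline, curr.deadline,
	       woken.deadline < curr.deadline ? "preempt more likely" : "no change");
	return 0;
}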
thanks,
Chenyu
> > encounter more wakeup preemption, which hurts throughput. If more frequent tick preemption
> > is needed, but more frequent wakeup preemption is not, could we do this base_slice
> > shrink for tick preemption only, rather than for wakeup preemption? A wild guess: could we
> > use the smaller base_slice of 0.7ms in update_deadline() for tick preemption, but keep the old
> > value of 0.75ms in place_entity() for wakeup preemption during enqueue?
> >
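If that idea were ever prototyped, it could look roughly like the sketch
below. The *_sketch names and the separate wakeup slice are invented for
illustration; only the 2.8ms/3ms (scaled 0.70ms/0.75ms) values come from
this thread, and the real code would of course keep the weight scaling:

/*
 * Hypothetical user-space sketch of the proposal above: use a smaller
 * slice when the tick refreshes the running task's deadline, but keep
 * the old, larger value when a task is placed at enqueue/wakeup time.
 */
#include <stdio.h>

struct se_sketch {
	unsigned long long vruntime;	/* ns */
	unsigned long long deadline;	/* ns */
	unsigned long long slice;	/* ns */
};

/* already scaled for 8 or more CPUs: 0.70ms * 4 and 0.75ms * 4 */
static const unsigned long long tick_slice_ns   = 2800000;
static const unsigned long long wakeup_slice_ns = 3000000;

/* tick path: refresh the running task's deadline with the smaller slice */
static void update_deadline_sketch(struct se_sketch *se)
{
	se->slice = tick_slice_ns;
	se->deadline = se->vruntime + se->slice;	/* weight scaling omitted */
}

/* enqueue/wakeup path: place the woken task with the old, larger slice */
static void place_entity_sketch(struct se_sketch *se)
{
	se->slice = wakeup_slice_ns;
	se->deadline = se->vruntime + se->slice;	/* weight scaling omitted */
}

int main(void)
{
	struct se_sketch running = { .vruntime = 10000000 };
	struct se_sketch woken   = { .vruntime = 10500000 };

	update_deadline_sketch(&running);
	place_entity_sketch(&woken);
	printf("running deadline %llu ns, woken deadline %llu ns\n",
	       running.deadline, woken.deadline);
	return 0;
}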
> > But considering that the 6% regression is not that high, and that users can customize
> > base_slice via debugfs on demand, we can keep an eye on this and revisit it in the
> > future (we have encountered some SPECjbb regressions due to over-preemption).
> >
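For reference, a small sketch of such a customization from user space; the
/sys/kernel/debug/sched/base_slice_ns path and the fact that it takes the
already-scaled value in nanoseconds are my assumptions, so please
double-check on the target kernel (needs root and a mounted debugfs):

/*
 * Sketch: write the effective base slice back to 3ms via debugfs.
 * Path and semantics are assumptions, not verified on every kernel.
 */
#include <stdio.h>

int main(void)
{
	const char *path = "/sys/kernel/debug/sched/base_slice_ns";
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return 1;
	}
	/* 3000000 ns = 3ms, the pre-patch effective slice on this machine */
	fprintf(f, "3000000\n");
	fclose(f);
	return 0;
}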
> > thanks,
> > Chenyu
> >