Message-ID: <CAP-5=fX=QuRw8M4Xco=A8WbBfDSPmyjtQ-JdnBEM-VRdX65_1g@mail.gmail.com>
Date: Wed, 21 Jan 2026 09:12:20 -0800
From: Ian Rogers <irogers@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Chen, Yu C" <yu.c.chen@...el.com>, Swapnil Sapkal <swapnil.sapkal@....com>, ravi.bangoria@....com, 
	mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org, 
	rostedt@...dmis.org, vincent.guittot@...aro.org, adrian.hunter@...el.com, 
	kan.liang@...ux.intel.com, gautham.shenoy@....com, kprateek.nayak@....com, 
	juri.lelli@...hat.com, yangjihong@...edance.com, void@...ifault.com, 
	tj@...nel.org, sshegde@...ux.ibm.com, ctshao@...gle.com, 
	quic_zhonhan@...cinc.com, thomas.falcon@...el.com, blakejones@...gle.com, 
	ashelat@...hat.com, leo.yan@....com, dvyukov@...gle.com, ak@...ux.intel.com, 
	yujie.liu@...el.com, graham.woodward@....com, ben.gainey@....com, 
	vineethr@...ux.ibm.com, tim.c.chen@...ux.intel.com, linux@...blig.org, 
	santosh.shukla@....com, sandipan.das@....com, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org, mingo@...hat.com, namhyung@...nel.org, 
	james.clark@....com, acme@...nel.org
Subject: Re: [PATCH v5 00/10] perf sched: Introduce stats tool

On Wed, Jan 21, 2026 at 8:33 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Jan 22, 2026 at 12:09:25AM +0800, Chen, Yu C wrote:
> > On 1/20/2026 1:58 AM, Swapnil Sapkal wrote:
> > > MOTIVATION
> > > ----------
> > >
> > > Existing `perf sched` is quite exhaustive and provides a lot of insights
> > > into scheduler behavior, but it quickly becomes impractical to use for
> > > long-running or scheduler-intensive workloads. For example, `perf sched
> > > record` has ~7.77% overhead on hackbench (with 25 groups each running 700K
> > > loops on a 2-socket 128 Cores 256 Threads 3rd Generation EPYC Server), and
> > > it generates a huge 56G perf.data file, which perf takes ~137 mins to
> > > prepare and write to disk [1].
> > >
> > > Unlike `perf sched record`, which hooks onto a set of scheduler tracepoints
> > > and generates samples on a tracepoint hit, `perf sched stats record` takes
> > > a snapshot of the /proc/schedstat file before and after the workload, i.e.
> > > there is almost zero interference with the workload run. Also, it takes
> > > very little time to parse /proc/schedstat, convert it into perf samples and
> > > save those samples into the perf.data file. The resulting perf.data file is
> > > much smaller. So, overall, `perf sched stats record` is much more
> > > lightweight compared to `perf sched record`.
> > >
> > > We, internally at AMD, have been using this (a variant of this, known as
> > > "sched-scoreboard"[2]) and found it to be very useful for analysing the
> > > impact of scheduler code changes[3][4]. Prateek used v2[5] of this patch
> > > series to report the analysis[6][7].
> > >
> > > Please note that this is not a replacement for perf sched record/report.
> > > The intended users of the new tool are scheduler developers, not regular
> > > users.
> > >
> > > USAGE
> > > -----
> > >
> > >    # perf sched stats record
> > >    # perf sched stats report
> > >    # perf sched stats diff
> > >
> > > Note: Although the `perf sched stats` tool supports workload profiling
> > > syntax (i.e. -- <workload> ), the recorded profile is still systemwide,
> > > since /proc/schedstat is a systemwide file.
> > >
> >
> > I found this useful for load-balance analysis on my
> > 384-CPU system with 6.19.0-rc1; please feel free to add
> >
> > Tested-by: Chen Yu <yu.c.chen@...el.com>
>
> Yeah, I've used a previous version for a while, was very nice.
>
> Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>

Acked-by: Ian Rogers <irogers@...gle.com>

I'm still wondering if we can make some of the /proc/schedstat data
appear as tool events similar to proposals for networking and memory
tool events in:
https://lore.kernel.org/lkml/20260104011738.475680-1-irogers@google.com/

Thanks,
Ian
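
For readers unfamiliar with the before/after snapshot approach described in
the quoted cover letter, here is a minimal Python sketch of the general idea.
It is not the code from the patch series. The function names
(parse_cpu_counters, diff_counters, measure) are hypothetical, and it only
handles the per-CPU "cpuN" lines of /proc/schedstat, skipping the version,
timestamp and domain lines for brevity. It also does not convert anything into
perf samples; it just subtracts counters.

```python
def parse_cpu_counters(text):
    """Return {"cpuN": [counter, ...]} from /proc/schedstat-style text.

    Only lines whose first field starts with "cpu" are kept; version,
    timestamp and domain lines are ignored in this simplified sketch.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        counters[fields[0]] = [int(value) for value in fields[1:]]
    return counters


def diff_counters(before, after):
    """Per-CPU, per-field difference (after - before) for CPUs in both."""
    return {
        cpu: [a - b for b, a in zip(before[cpu], after[cpu])]
        for cpu in after
        if cpu in before
    }


def measure(workload, path="/proc/schedstat"):
    """Snapshot the file, run workload() (any callable), snapshot again."""
    with open(path) as f:
        before = parse_cpu_counters(f.read())
    workload()
    with open(path) as f:
        after = parse_cpu_counters(f.read())
    return diff_counters(before, after)
```

Because the file is read only twice, the workload itself runs without any
tracing attached, which is the property the cover letter highlights. As noted
there, the counters are systemwide, so the deltas include activity from
everything else running during the measurement window.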
