[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250520103647.GH16434@noisy.programming.kicks-ass.net>
Date: Tue, 20 May 2025 12:36:47 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Swapnil Sapkal <swapnil.sapkal@....com>
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
irogers@...gle.com, james.clark@....com, ravi.bangoria@....com,
yu.c.chen@...el.com, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
rostedt@...dmis.org, vincent.guittot@...aro.org,
adrian.hunter@...el.com, kan.liang@...ux.intel.com,
gautham.shenoy@....com, kprateek.nayak@....com,
juri.lelli@...hat.com, yangjihong@...edance.com, void@...ifault.com,
tj@...nel.org, sshegde@...ux.ibm.com, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, santosh.shukla@....com,
ananth.narayan@....com, sandipan.das@....com,
James Clark <james.clark@...aro.org>
Subject: Re: [PATCH v3 4/8] perf sched stats: Add support for report
subcommand
On Tue, Mar 11, 2025 at 12:02:26PM +0000, Swapnil Sapkal wrote:
> `perf sched stats record` captures two sets of samples. For workload
> profile, first set right before workload starts and second set after
> workload finishes. For the systemwide profile, first set at the
> beginning of profile and second set on receiving SIGINT signal.
>
> Add `perf sched stats report` subcommand that will read both the set
> of samples, get the diff and render a final report. Final report prints
> scheduler stat at cpu granularity as well as sched domain granularity.
>
> Example usage:
>
> # perf sched stats record
> # perf sched stats report
>
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category idle> ");
> #endif
> +DOMAIN_FIELD(__u32, idle_lb_count,
> + "load_balance() count on cpu idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, idle_lb_balanced,
> + "load_balance() found balanced on cpu idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, idle_lb_failed,
> + "load_balance() move task failed on cpu idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, idle_lb_imbalance,
> + "imbalance sum on cpu idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, idle_lb_gained,
> + "pull_task() count on cpu idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, idle_lb_hot_gained,
> + "pull_task() when target task was cache-hot on cpu idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, idle_lb_nobusyq,
> + "load_balance() failed to find busier queue on cpu idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, idle_lb_nobusyg,
> + "load_balance() failed to find busier group on cpu idle", "%11u", true, v15);
> +#ifdef DERIVED_CNT_FIELD
> +DERIVED_CNT_FIELD("load_balance() success count on cpu idle", "%11u",
> + idle_lb_count, idle_lb_balanced, idle_lb_failed, v15);
> +#endif
> +#ifdef DERIVED_AVG_FIELD
> +DERIVED_AVG_FIELD("avg task pulled per successful lb attempt (cpu idle)", "%11.2Lf",
> + idle_lb_count, idle_lb_balanced, idle_lb_failed, idle_lb_gained, v15);
> +#endif
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category busy> ");
> +#endif
> +DOMAIN_FIELD(__u32, busy_lb_count,
> + "load_balance() count on cpu busy", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, busy_lb_balanced,
> + "load_balance() found balanced on cpu busy", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, busy_lb_failed,
> + "load_balance() move task failed on cpu busy", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, busy_lb_imbalance,
> + "imbalance sum on cpu busy", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, busy_lb_gained,
> + "pull_task() count on cpu busy", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, busy_lb_hot_gained,
> + "pull_task() when target task was cache-hot on cpu busy", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, busy_lb_nobusyq,
> + "load_balance() failed to find busier queue on cpu busy", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, busy_lb_nobusyg,
> + "load_balance() failed to find busier group on cpu busy", "%11u", true, v15);
> +#ifdef DERIVED_CNT_FIELD
> +DERIVED_CNT_FIELD("load_balance() success count on cpu busy", "%11u",
> + busy_lb_count, busy_lb_balanced, busy_lb_failed, v15);
> +#endif
> +#ifdef DERIVED_AVG_FIELD
> +DERIVED_AVG_FIELD("avg task pulled per successful lb attempt (cpu busy)", "%11.2Lf",
> + busy_lb_count, busy_lb_balanced, busy_lb_failed, busy_lb_gained, v15);
> +#endif
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category newidle> ");
> +#endif
> +DOMAIN_FIELD(__u32, newidle_lb_count,
> + "load_balance() count on cpu newly idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_balanced,
> + "load_balance() found balanced on cpu newly idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_failed,
> + "load_balance() move task failed on cpu newly idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_imbalance,
> + "imbalance sum on cpu newly idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_gained,
> + "pull_task() count on cpu newly idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_hot_gained,
> + "pull_task() when target task was cache-hot on cpu newly idle", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_nobusyq,
> + "load_balance() failed to find busier queue on cpu newly idle", "%11u", true, v15);
> +DOMAIN_FIELD(__u32, newidle_lb_nobusyg,
> + "load_balance() failed to find busier group on cpu newly idle", "%11u", true, v15);
> +#ifdef DERIVED_CNT_FIELD
> +DERIVED_CNT_FIELD("load_balance() success count on cpu newly idle", "%11u",
> + newidle_lb_count, newidle_lb_balanced, newidle_lb_failed, v15);
> +#endif
> +#ifdef DERIVED_AVG_FIELD
> +DERIVED_AVG_FIELD("avg task pulled per successful lb attempt (cpu newly idle)", "%11.2Lf",
> + newidle_lb_count, newidle_lb_balanced, newidle_lb_failed, newidle_lb_gained, v15);
> +#endif
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category active_load_balance()> ");
> +#endif
> +DOMAIN_FIELD(__u32, alb_count,
> + "active_load_balance() count", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, alb_failed,
> + "active_load_balance() move task failed", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, alb_pushed,
> + "active_load_balance() successfully moved a task", "%11u", false, v15);
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category sched_balance_exec()> ");
> +#endif
> +DOMAIN_FIELD(__u32, sbe_count,
> + "sbe_count is not used", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, sbe_balanced,
> + "sbe_balanced is not used", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, sbe_pushed,
> + "sbe_pushed is not used", "%11u", false, v15);
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Category sched_balance_fork()> ");
> +#endif
> +DOMAIN_FIELD(__u32, sbf_count,
> + "sbf_count is not used", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, sbf_balanced,
> + "sbf_balanced is not used", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, sbf_pushed,
> + "sbf_pushed is not used", "%11u", false, v15);
> +#ifdef DOMAIN_CATEGORY
> +DOMAIN_CATEGORY(" <Wakeup Info> ");
> +#endif
> +DOMAIN_FIELD(__u32, ttwu_wake_remote,
> + "try_to_wake_up() awoke a task that last ran on a diff cpu", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, ttwu_move_affine,
> + "try_to_wake_up() moved task because cache-cold on own cpu", "%11u", false, v15);
> +DOMAIN_FIELD(__u32, ttwu_move_balance,
> + "try_to_wake_up() started passive balancing", "%11u", false, v15);
> +#endif /* DOMAIN_FIELD */
So I have one request for a future version of this. Could we please add
a knob to print the output using the field name instead of the fancy
pants description?
It is *MUCH* easier to grep the field name in the code than to try and
figure out wth this description is on about :-)
That is, ttwu_wake_remote is infinitely better than "try_to_wake_up()
awake a task that last ran on a diff cpu" and so on.
I realize I might be weird, but it should be simple enough to add and it
makes my life easier :-)
Powered by blists - more mailing lists