Message-ID: <tencent_0D90CA46263E204153ED47C700DCDE189E07@qq.com>
Date: Fri, 19 Dec 2025 13:03:51 +0800
From: Yangyu Chen <cyy@...self.name>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>,
 K Prateek Nayak <kprateek.nayak@....com>,
 "Gautham R . Shenoy" <gautham.shenoy@....com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Chen Yu <yu.c.chen@...el.com>,
 Juri Lelli <juri.lelli@...hat.com>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>,
 Valentin Schneider <vschneid@...hat.com>,
 Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
 Hillf Danton <hdanton@...a.com>,
 Shrikanth Hegde <sshegde@...ux.ibm.com>,
 Jianyong Wu <jianyong.wu@...look.com>,
 Tingyin Duan <tingyin.duan@...il.com>,
 Vern Hao <vernhao@...cent.com>,
 Vern Hao <haoxing990@...il.com>,
 Len Brown <len.brown@...el.com>,
 Aubrey Li <aubrey.li@...el.com>,
 Zhao Liu <zhao1.liu@...el.com>,
 Chen Yu <yu.chen.surf@...il.com>,
 Adam Li <adamli@...amperecomputing.com>,
 Aaron Lu <ziqianlu@...edance.com>,
 Tim Chen <tim.c.chen@...el.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 21/23] -- DO NOT APPLY!!! -- sched/cache/stats: Add
 schedstat for cache aware load balancing


> On 4 Dec 2025, at 07:07, Tim Chen <tim.c.chen@...ux.intel.com> wrote:
> 
> From: Chen Yu <yu.c.chen@...el.com>
> 
> Debug patch only.
> 
> With cache-aware load balancing enabled, statistics related to its activity
> are exposed via /proc/schedstat and debugfs. For instance, if users want to
> verify metrics such as how often the RSS and nr_running limits were exceeded,
> they can filter the output of /sys/kernel/debug/sched/debug and compute the
> required statistics manually:
> 
> llc_exceed_cap SUM: 6
> llc_exceed_nr SUM: 4531
> 
> Furthermore, the statistics exposed in /proc/schedstat can be queried manually
> or via perf sched stats[1] with minor modifications.
> 

Hi Tim,

This patch looks great, especially for multithreaded Verilator workloads
on clustered-LLC systems (such as AMD EPYC). I'm discussing with Verilator
upstream whether to disable Verilator's automatic userspace affinity
assignment when cache-aware load balancing is available [1]. That
discussion raised the need for a way for userspace software to detect
whether the feature exists. Could we expose it in `/proc/schedstat` for
that purpose? We could simply use this patch and remove the
"DO NOT APPLY" tag.

[1] https://github.com/verilator/verilator/issues/6826#issuecomment-3671287551
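
To make the detection idea concrete, here is a rough userspace sketch. It
assumes the quoted patch were applied as-is, so that SCHEDSTAT_VERSION 18 is
the first version carrying the lb_imbalance_llc column. The helper names and
the SCHEDSTAT_LLC_MIN_VERSION macro are hypothetical, not an existing API.
Note that the version only says the column is present (and /proc/schedstat
itself requires CONFIG_SCHEDSTATS); it does not prove that cache-aware
balancing is actually enabled at runtime.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: the 17 -> 18 bump in the quoted patch is what adds the
 * lb_imbalance_llc column. Hypothetical macro, not a kernel UAPI constant. */
#define SCHEDSTAT_LLC_MIN_VERSION 18

/* Parse the leading "version N" line of schedstat text.
 * Returns N, or -1 if the text does not start with a version line. */
static int schedstat_version_from_text(const char *text)
{
	int version = -1;

	assert(text != NULL);
	if (sscanf(text, "version %d", &version) != 1)
		return -1;
	return version;
}

/* Read the first line of a schedstat-style file (normally /proc/schedstat).
 * Returns the version, or -1 if the file is missing or malformed. */
static int schedstat_version_from_file(const char *path)
{
	char line[64];
	int version = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		version = schedstat_version_from_text(line);
	fclose(f);
	return version;
}

/* Nonzero if the given schedstat version would include the LLC column,
 * under the assumption stated above. */
static int schedstat_has_llc_column(int version)
{
	return version >= SCHEDSTAT_LLC_MIN_VERSION;
}
```

A caller such as Verilator could then evaluate
schedstat_has_llc_column(schedstat_version_from_file("/proc/schedstat"))
and keep its own affinity assignment whenever the result is zero.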

Thanks,
Yangyu Chen

> Link: https://lore.kernel.org/all/20250909114227.58802-1-swapnil.sapkal@amd.com #1
> 
> Signed-off-by: Chen Yu <yu.c.chen@...el.com>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
> include/linux/sched/topology.h | 1 +
> kernel/sched/fair.c            | 1 +
> kernel/sched/stats.c           | 5 +++--
> 3 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 0ba4697d74ba..8702c1e731a0 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -108,6 +108,7 @@ struct sched_domain {
> unsigned int lb_imbalance_util[CPU_MAX_IDLE_TYPES];
> unsigned int lb_imbalance_task[CPU_MAX_IDLE_TYPES];
> unsigned int lb_imbalance_misfit[CPU_MAX_IDLE_TYPES];
> + unsigned int lb_imbalance_llc[CPU_MAX_IDLE_TYPES];
> unsigned int lb_gained[CPU_MAX_IDLE_TYPES];
> unsigned int lb_hot_gained[CPU_MAX_IDLE_TYPES];
> unsigned int lb_nobusyg[CPU_MAX_IDLE_TYPES];
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a2e2d6742481..742e455b093e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -12684,6 +12684,7 @@ static void update_lb_imbalance_stat(struct lb_env *env, struct sched_domain *sd
> __schedstat_add(sd->lb_imbalance_misfit[idle], env->imbalance);
> break;
> case migrate_llc_task:
> + __schedstat_add(sd->lb_imbalance_llc[idle], env->imbalance);
> break;
> }
> }
> diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
> index d1c9429a4ac5..3736f6102261 100644
> --- a/kernel/sched/stats.c
> +++ b/kernel/sched/stats.c
> @@ -104,7 +104,7 @@ void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
>  * Bump this up when changing the output format or the meaning of an existing
>  * format, so that tools can adapt (or abort)
>  */
> -#define SCHEDSTAT_VERSION 17
> +#define SCHEDSTAT_VERSION 18
> 
> static int show_schedstat(struct seq_file *seq, void *v)
> {
> @@ -139,7 +139,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
> seq_printf(seq, "domain%d %s %*pb", dcount++, sd->name,
>   cpumask_pr_args(sched_domain_span(sd)));
> for (itype = 0; itype < CPU_MAX_IDLE_TYPES; itype++) {
> - seq_printf(seq, " %u %u %u %u %u %u %u %u %u %u %u",
> + seq_printf(seq, " %u %u %u %u %u %u %u %u %u %u %u %u",
>    sd->lb_count[itype],
>    sd->lb_balanced[itype],
>    sd->lb_failed[itype],
> @@ -147,6 +147,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
>    sd->lb_imbalance_util[itype],
>    sd->lb_imbalance_task[itype],
>    sd->lb_imbalance_misfit[itype],
> +    sd->lb_imbalance_llc[itype],
>    sd->lb_gained[itype],
>    sd->lb_hot_gained[itype],
>    sd->lb_nobusyq[itype],
> -- 
> 2.32.0

