lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24d61637-cfd7-414b-899f-856c9863dc1f@intel.com>
Date: Wed, 9 Jul 2025 15:22:29 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
 Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
	<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
	<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
	<Dave.Martin@....com>, Anil Keshavamurthy <anil.s.keshavamurthy@...el.com>,
	Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v6 29/30] x86/resctrl: Add debug info/PERF_PKG_MON/status
 files

Hi Tony,

Subject needs an update?

On 6/26/25 9:49 AM, Tony Luck wrote:
> Each telemetry aggregator provides three status registers at the top
> end of MMIO space after all the per-RMID per-event counters:
> 
>   agg_data_loss_count: This counts the number of times that this aggregator
>   failed to accumulate a counter value supplied by a CPU core.
> 
>   agg_data_loss_timestamp: This is a "timestamp" from a free running
>   25MHz uncore timer indicating when the most recent data loss occurred.
> 
>   last_update_timestamp: Another 25MHz timestamp indicating when the
>   most recent counter update was successfully applied.
> 
> Create files in /sys/kernel/debug/resctrl/info/PERF_PKG_MON/arch/

"/sys/kernel/debug/resctrl/info/PERF_PKG_MON/{arch}/" or
"/sys/kernel/debug/resctrl/info/PERF_PKG_MON/x86_64/"?

> to display the value of each of these status registers for each aggregator
> in each enabled event group.
> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  arch/x86/kernel/cpu/resctrl/intel_aet.c | 56 +++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> index 090e7b35c3e2..422e3e126255 100644
> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -13,6 +13,7 @@
>  
>  #include <linux/cleanup.h>
>  #include <linux/cpu.h>
> +#include <linux/debugfs.h>
>  #include <linux/intel_vsec.h>
>  #include <linux/io.h>
>  #include <linux/minmax.h>
> @@ -275,6 +276,58 @@ static bool get_pmt_feature(enum pmt_feature_id feature)
>  	return false;
>  }
>  
> +static ssize_t status_read(struct file *f, char __user *buf, size_t count, loff_t *off)
> +{
> +	void __iomem *info = (void __iomem *)f->f_inode->i_private;
> +	char status[32];
> +	int len;
> +
> +	len = sprintf(status, "%llu\n", readq(info));
> +
> +	return simple_read_from_buffer(buf, count, off, status, len);
> +}
> +
> +static const struct file_operations status_fops = {
> +	.read = status_read
> +};

The custom seems to be to use DEFINE_SIMPLE_ATTRIBUTE() that can handle concurrent
reads on an open file descriptor.

> +
> +static void make_status_files(struct dentry *dir, struct event_group *e, int pkg, int instance)
> +{
> +	void *info = (void __force *)e->pkginfo[pkg]->addrs[instance] + e->mmio_size;
> +	char name[64];
> +
> +	sprintf(name, "%s_pkg%d_agg%d_data_loss_count", e->name, pkg, instance);
> +	debugfs_create_file(name, 0400, dir, info - 24, &status_fops);
> +
> +	sprintf(name, "%s_pkg%d_agg%d_data_loss_timestamp", e->name, pkg, instance);
> +	debugfs_create_file(name, 0400, dir, info - 16, &status_fops);
> +
> +	sprintf(name, "%s_pkg%d_agg%d_last_update_timestamp", e->name, pkg, instance);
> +	debugfs_create_file(name, 0400, dir, info - 8, &status_fops);
> +}
> +
> +static void create_debug_event_status_files(struct dentry *dir, struct event_group *e)
> +{
> +	int num_pkgs = topology_max_packages();
> +
> +	for (int i = 0; i < num_pkgs; i++)
> +		for (int j = 0; j < e->pkginfo[i]->num_regions; j++)
> +			make_status_files(dir, e, i, j);
> +}
> +
> +static void create_debugfs_status_file(struct rdt_resource *r)
> +{
> +	struct event_group **eg;
> +	struct dentry *infodir;
> +
> +	infodir = resctrl_debugfs_mon_info_arch_mkdir(r);

I understand that debugfs guidance is that callers should ignore errors returned by
debugfs_create_dir(). Even so, I think it is unnecessary to do so when so much unnecessary
work can be avoided when, for example, debugfs is disabled. I see no harm in bailing
out here if "infodir" cannot be created. If it can be created the other debugfs calls
are likely to succeed so further checking should not be necessary. 
No strong objection from my side if you prefer to keep it like this ... considering that
it is indeed following debugfs guidance.

> +	for (eg = &known_event_groups[0]; eg < &known_event_groups[NUM_KNOWN_GROUPS]; eg++) {
> +		if (!(*eg)->pfg)
> +			continue;
> +		create_debug_event_status_files(infodir, *eg);
> +	}
> +}
> +
>  /*
>   * Ask OOBMSM discovery driver for all the RMID based telemetry groups
>   * that it supports.
> @@ -300,6 +353,9 @@ bool intel_aet_get_events(void)
>  		r->mon_capable = true;
>  	}
>  
> +	if (ret1 || ret2)
> +		create_debugfs_status_file(r);
> +
>  	return ret1 || ret2;
>  }
>  

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ