linux-kernel - Re: [PATCH] platform/x86/intel-uncore-freq: Present unique domain ID per package

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f762f6a9-74cc-0299-0bea-6d1ab6c88e41@linux.intel.com>
Date: Wed, 6 Aug 2025 12:17:07 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
cc: Hans de Goede <hdegoede@...hat.com>, platform-driver-x86@...r.kernel.org, 
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] platform/x86/intel-uncore-freq: Present unique domain
 ID per package

On Sun, 27 Jul 2025, Srinivas Pandruvada wrote:

> In partitioned systems, the domain ID is unique in the partition and a
> package can have multiple partitions.
> 
> Some user-space tools, such as turbostat, assume the domain ID is unique
> per package. These tools map CPU power domains, which are unique to a
> package. However, this approach does not work in partitioned systems.
> 
> There is no architectural definition of "partition" to present to user
> space.
> 
> To support these tools, set the domain_id to be unique per package. For
> compute die IDs, uniqueness can be achieved using the platform info
> cdie_mask, mirroring the behavior observed in non-partitioned systems.
> 
> For IO dies, which lack a direct CPU relationship, any unique logical
> ID can be assigned. Here domain IDs for IO dies are configured after all
> compute domain IDs. During the probe, keep the index of the next IO
> domain ID after the last IO domain ID of the current partition. Since
> CPU packages are symmetric, partition information is same for all
> packages.
> 
> The Intel Speed Select driver has already implemented a similar change
> to make the domain ID unique, with compute dies listed first, followed
> by I/O dies.
> 
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> ---
> Rebased on
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
> for-next
> 
>  .../uncore-frequency/uncore-frequency-tpmi.c  | 42 +++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> index bfcf92aa4d69..563e215b4076 100644
> --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> @@ -411,6 +411,47 @@ static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncor
>  	return -EOPNOTSUPP;
>  }
>  
> +#define MAX_PARTITIONS	2
> +
> +/* IO domain ID start index for a partition */
> +static u8 io_die_start[MAX_PARTITIONS];
> +
> +/* Next IO domain ID index after the current partition IO die IDs */
> +static u8 io_die_index_next;
> +
> +/* Lock to protect io_die_start, io_die_index_next */
> +static DEFINE_MUTEX(domain_lock);
> +
> +static void update_domain_id(struct tpmi_uncore_cluster_info *cluster_info,
> +			     struct oobmsm_plat_info *plat_info, int num_resource)
> +{
> +	u8 max_dies = topology_max_dies_per_package();
> +	u8 cdie_cnt;
> +
> +	/* For non partitioned system or invalid partition number, return */

non-partitioned

> +	if (!plat_info->cdie_mask || max_dies <= 1 || plat_info->partition >= MAX_PARTITIONS)
> +		return;
> +
> +	if (cluster_info->uncore_data.agent_type_mask & AGENT_TYPE_CORE) {
> +		cluster_info->uncore_data.domain_id = cluster_info->cdie_id;
> +		return;
> +	}
> +
> +	cdie_cnt = fls(plat_info->cdie_mask) - ffs(plat_info->cdie_mask) + 1;

Is it intentional that you didn't use hweight here? (unfortunately,
I don't recall details of the cdie_mask).

> +	guard(mutex)(&domain_lock);
> +
> +	if (!io_die_index_next)
> +		io_die_index_next = max_dies;
> +
> +	if (!io_die_start[plat_info->partition]) {
> +		io_die_start[plat_info->partition] = io_die_index_next;
> +		io_die_index_next += (num_resource - cdie_cnt);
> +	}
> +
> +	cluster_info->uncore_data.domain_id += (io_die_start[plat_info->partition] - cdie_cnt);

I failed to wrap my head around what this math aims to do (mainly what 
cdie_cnt has to do with this). Can you explain (might be useful to have a 
comment if it's something particularly tricky / non-obvious)?

It could be that to make this simpler, one shouldn't assign value in 
uncore_probe() to .domain_id at all but pass the index here (and rename 
this function to set_domain_id()).

> +}
> +
>  /* Callback for sysfs write for TPMI uncore data. Called under mutex locks. */
>  static int uncore_write(struct uncore_data *data, unsigned int value, enum uncore_index index)
>  {
> @@ -614,6 +655,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
>  			cluster_info->uncore_data.cluster_id = j;
>  
>  			set_cdie_id(i, cluster_info, plat_info);
> +			update_domain_id(cluster_info, plat_info, num_resources);
>  
>  			cluster_info->uncore_root = tpmi_uncore;
>  
> 

-- 
 i.