[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f762f6a9-74cc-0299-0bea-6d1ab6c88e41@linux.intel.com>
Date: Wed, 6 Aug 2025 12:17:07 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
cc: Hans de Goede <hdegoede@...hat.com>, platform-driver-x86@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] platform/x86/intel-uncore-freq: Present unique domain
ID per package
On Sun, 27 Jul 2025, Srinivas Pandruvada wrote:
> In partitioned systems, the domain ID is unique in the partition and a
> package can have multiple partitions.
>
> Some user-space tools, such as turbostat, assume the domain ID is unique
> per package. These tools map CPU power domains, which are unique to a
> package. However, this approach does not work in partitioned systems.
>
> There is no architectural definition of "partition" to present to user
> space.
>
> To support these tools, set the domain_id to be unique per package. For
> compute die IDs, uniqueness can be achieved using the platform info
> cdie_mask, mirroring the behavior observed in non-partitioned systems.
>
> For IO dies, which lack a direct CPU relationship, any unique logical
> ID can be assigned. Here domain IDs for IO dies are configured after all
> compute domain IDs. During the probe, keep the index of the next IO
> domain ID after the last IO domain ID of the current partition. Since
> CPU packages are symmetric, partition information is same for all
> packages.
>
> The Intel Speed Select driver has already implemented a similar change
> to make the domain ID unique, with compute dies listed first, followed
> by I/O dies.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> ---
> Rebased on
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
> for-next
>
> .../uncore-frequency/uncore-frequency-tpmi.c | 42 +++++++++++++++++++
> 1 file changed, 42 insertions(+)
>
> diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> index bfcf92aa4d69..563e215b4076 100644
> --- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> +++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
> @@ -411,6 +411,47 @@ static int uncore_read(struct uncore_data *data, unsigned int *value, enum uncor
> return -EOPNOTSUPP;
> }
>
> +#define MAX_PARTITIONS 2
> +
> +/* IO domain ID start index for a partition */
> +static u8 io_die_start[MAX_PARTITIONS];
> +
> +/* Next IO domain ID index after the current partition IO die IDs */
> +static u8 io_die_index_next;
> +
> +/* Lock to protect io_die_start, io_die_index_next */
> +static DEFINE_MUTEX(domain_lock);
> +
> +static void update_domain_id(struct tpmi_uncore_cluster_info *cluster_info,
> + struct oobmsm_plat_info *plat_info, int num_resource)
> +{
> + u8 max_dies = topology_max_dies_per_package();
> + u8 cdie_cnt;
> +
> + /* For non partitioned system or invalid partition number, return */
non-partitioned
> + if (!plat_info->cdie_mask || max_dies <= 1 || plat_info->partition >= MAX_PARTITIONS)
> + return;
> +
> + if (cluster_info->uncore_data.agent_type_mask & AGENT_TYPE_CORE) {
> + cluster_info->uncore_data.domain_id = cluster_info->cdie_id;
> + return;
> + }
> +
> + cdie_cnt = fls(plat_info->cdie_mask) - ffs(plat_info->cdie_mask) + 1;
Is it intentional that you didn't use hweight here? (unfortunately,
I don't recall details of the cdie_mask).
> + guard(mutex)(&domain_lock);
> +
> + if (!io_die_index_next)
> + io_die_index_next = max_dies;
> +
> + if (!io_die_start[plat_info->partition]) {
> + io_die_start[plat_info->partition] = io_die_index_next;
> + io_die_index_next += (num_resource - cdie_cnt);
> + }
> +
> + cluster_info->uncore_data.domain_id += (io_die_start[plat_info->partition] - cdie_cnt);
I failed to wrap my head around what this math aims to do (mainly what
cdie_cnt has to do with this). Can you explain (might be useful to have a
comment if it's something particularly tricky / non-obvious)?
It could be that to make this simpler, one shouldn't assign value in
uncore_probe() to .domain_id at all but pass the index here (and rename
this function to set_domain_id()).
> +}
> +
> /* Callback for sysfs write for TPMI uncore data. Called under mutex locks. */
> static int uncore_write(struct uncore_data *data, unsigned int value, enum uncore_index index)
> {
> @@ -614,6 +655,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
> cluster_info->uncore_data.cluster_id = j;
>
> set_cdie_id(i, cluster_info, plat_info);
> + update_domain_id(cluster_info, plat_info, num_resources);
>
> cluster_info->uncore_root = tpmi_uncore;
>
>
--
i.
Powered by blists - more mailing lists