

Message-ID: <eb5d91d1-2898-45e0-a2d3-aa5c66155911@linux.intel.com>
Date: Wed, 12 Jun 2024 10:49:50 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>, peterz@...radead.org,
 mingo@...nel.org, linux-kernel@...r.kernel.org
Cc: acme@...nel.org, namhyung@...nel.org, irogers@...gle.com,
 eranian@...gle.com, ak@...ux.intel.com, yunying.sun@...el.com
Subject: Re: [PATCH 1/8] perf/x86/uncore: Save the unit control address of all
 units



On 2024-06-10 6:40 p.m., Tim Chen wrote:
> On Mon, 2024-06-10 at 13:16 -0700, kan.liang@...ux.intel.com wrote:
>> From: Kan Liang <kan.liang@...ux.intel.com>
>>
>> The unit control address of some CXL units may be wrongly calculated
>> under some configuration on a EMR machine.
>>
>> The current implementation only saves the unit control address of the
>> units from the first die, and the first unit of the rest of dies. Perf
>> assumed that the units from the other dies have the same offset as the
>> first die. So the unit control address of the rest of the units can be
>> calculated. However, the assumption is wrong, especially for the CXL
>> units.
>>
>> Introduce an RB tree for each uncore type to save the unit control
>> address and ID information for all the units.
>>
>> Compared with the current implementation, more space is required to save
>> the information of all units. The extra size should be acceptable.
>> For example, on EMR, there are 221 units at most. For a 2-socket machine,
>> the extra space is ~6KB at most.
>>
>> Tested-by: Yunying Sun <yunying.sun@...el.com>
>> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
>> ---
>>  arch/x86/events/intel/uncore_discovery.c | 79 +++++++++++++++++++++++-
>>  arch/x86/events/intel/uncore_discovery.h | 10 +++
>>  2 files changed, 87 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
>> index 9a698a92962a..ce520e69a3c1 100644
>> --- a/arch/x86/events/intel/uncore_discovery.c
>> +++ b/arch/x86/events/intel/uncore_discovery.c
>> @@ -93,6 +93,8 @@ add_uncore_discovery_type(struct uncore_unit_discovery *unit)
>>  	if (!type->box_ctrl_die)
>>  		goto free_type;
>>  
>> +	type->units = RB_ROOT;
>> +
>>  	type->access_type = unit->access_type;
>>  	num_discovered_types[type->access_type]++;
>>  	type->type = unit->box_type;
>> @@ -120,10 +122,59 @@ get_uncore_discovery_type(struct uncore_unit_discovery *unit)
>>  	return add_uncore_discovery_type(unit);
>>  }
>>  
>> +static inline bool unit_less(struct rb_node *a, const struct rb_node *b)
>> +{
>> +	struct intel_uncore_discovery_unit *a_node, *b_node;
>> +
>> +	a_node = rb_entry(a, struct intel_uncore_discovery_unit, node);
>> +	b_node = rb_entry(b, struct intel_uncore_discovery_unit, node);
>> +
>> +	if (a_node->pmu_idx < b_node->pmu_idx)
>> +		return true;
>> +	if (a_node->pmu_idx > b_node->pmu_idx)
>> +		return false;
>> +
>> +	if (a_node->die < b_node->die)
>> +		return true;
>> +	if (a_node->die > b_node->die)
>> +		return false;
>> +
>> +	return 0;
> 
> Will it be better if the rb_node is sorted by id instead
> of pmu_idx+die?

Both the id and pmu_idx+die are used as keys to search the RB tree, in
different places.

The id is the physical ID of a unit. The search by id is invoked when
adding a new unit: perf needs to make sure that the same PMU idx
(logical id) is assigned to units with the same physical ID, because
units with the same physical ID in different dies share the same PMU.

The pmu_idx+die key is used when setting the cpumask. Please see
intel_uncore_find_discovery_unit_id() in patch 2. Perf needs to know
on which dies a given PMU is available.

Since two different keys are used to search the RB tree, I think a
search by one of them has to traverse the whole tree. When a new unit is
being added, the tree is not complete yet, so doing the O(N) search at
that stage minimizes its impact. So I chose pmu_idx+die as the sort key
rather than id.

Also, the driver builds the tree and sets the cpumask only once, at
driver load time. I think the O(N) search should be acceptable here.
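
To illustrate the trade-off, here is a minimal user-space sketch. It is
not the patch code: it assumes a sorted array with qsort()/bsearch() as
a stand-in for the kernel RB tree, and struct unit, sort_units(),
find_by_pmu_die() and find_by_id() are hypothetical names. Only the
ordering (pmu_idx first, then die) mirrors unit_less() in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct intel_uncore_discovery_unit. */
struct unit {
	unsigned int pmu_idx;	/* logical PMU index (major sort key) */
	unsigned int die;	/* die number (minor sort key) */
	unsigned int id;	/* physical unit ID (not part of the sort key) */
};

/* Order by pmu_idx, then die: the same ordering unit_less() uses. */
static int unit_cmp(const void *a, const void *b)
{
	const struct unit *ua = a, *ub = b;

	if (ua->pmu_idx != ub->pmu_idx)
		return ua->pmu_idx < ub->pmu_idx ? -1 : 1;
	if (ua->die != ub->die)
		return ua->die < ub->die ? -1 : 1;
	return 0;
}

/* Sort once, standing in for building the ordered tree. */
static void sort_units(struct unit *units, size_t n)
{
	assert(units != NULL);
	qsort(units, n, sizeof(*units), unit_cmp);
}

/* Lookup by the sort key: O(log N) on the ordered structure. */
static struct unit *find_by_pmu_die(struct unit *units, size_t n,
				    unsigned int pmu_idx, unsigned int die)
{
	struct unit key = { .pmu_idx = pmu_idx, .die = die, .id = 0 };

	return bsearch(&key, units, n, sizeof(*units), unit_cmp);
}

/* Lookup by id: the ordering does not help, so every entry is visited. */
static struct unit *find_by_id(struct unit *units, size_t n, unsigned int id)
{
	for (size_t i = 0; i < n; i++) {
		if (units[i].id == id)
			return &units[i];
	}
	return NULL;
}
```

In this sketch, find_by_id() returns the first entry in sort order that
carries the given id, which is enough to reuse its pmu_idx for a unit
with the same physical ID on another die.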

Thanks,
Kan

> 
> Then it will be faster for uncore_find_unit() to run in
> O(log(N)) instead of O(N).  Right now it looks like we
> are traversing the whole tree to find the entry with the
> id.
> 
> Tim
> 
>> +}
>> +
>> +static inline struct intel_uncore_discovery_unit *
>> +uncore_find_unit(struct rb_root *root, unsigned int id)
>> +{
>> +	struct intel_uncore_discovery_unit *unit;
>> +	struct rb_node *node;
>> +
>> +	for (node = rb_first(root); node; node = rb_next(node)) {
>> +		unit = rb_entry(node, struct intel_uncore_discovery_unit, node);
>> +		if (unit->id == id)
>> +			return unit;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
> 
>