Date: Wed, 13 Mar 2024 11:23:02 +0100
From: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
CC: Fenghua Yu <fenghua.yu@...el.com>, Reinette Chatre
	<reinette.chatre@...el.com>, Shuah Khan <shuah@...nel.org>,
	<tony.luck@...el.com>, "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>,
	LKML <linux-kernel@...r.kernel.org>, <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH 2/4] selftests/resctrl: SNC support for CMT

On 2024-03-08 at 15:59:02 +0200, Ilpo Järvinen wrote:
>On Fri, 8 Mar 2024, Ilpo Järvinen wrote:
>
>> On Wed, 6 Mar 2024, Maciej Wieczor-Retman wrote:
>> 
>> > Cache Monitoring Technology (CMT) works by measuring how much data in L3
>> > cache is occupied by a given process identified by its Resource
>> > Monitoring ID (RMID).
>> > 
>> > On systems with Sub-Numa Clusters (SNC) enabled, a process can occupy
>> > not only the cache that belongs to its own NUMA node but also pieces of
>> > other NUMA nodes' caches that lie on the same socket.
>> > 
>> > A simple correction to make the CMT selftest NUMA-aware is to sum values
>> > reported by all nodes on the same socket for a given RMID.
>> > 
>> > Reported-by: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>
>> > Closes: https://lore.kernel.org/all/TYAPR01MB6330B9B17686EF426D2C3F308B25A@TYAPR01MB6330.jpnprd01.prod.outlook.com/
>> > Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
>> > ---
>
>> > @@ -828,6 +828,8 @@ int resctrl_val(const struct resctrl_test *test,
>> >  	sleep(1);
>> >  
>> >  	/* Test runs until the callback setup() tells the test to stop. */
>> > +	get_domain_id("L3", uparams->cpu, &res_id);
>> 
>> Hardcoding L3 here limits the genericness of this function. You don't even 
>> need to do it, get_domain_id() does "MB" -> "L3" transformation implicitly 
>> for you so you can just pass test->resource instead.
>> 
>> Also, I don't understand why you now again make the naming inconsistent 
>> with "res_id".
>> 
>> If you based this on top of the patches I just posted, resctrl_val() 
>> already has the domain_id variable.
>
>Ah, I retract what I said. I see you actually want it only from L3.
>
>> > +     res_id *= snc_ways();
>
>I don't understand what this is trying to achieve and how.

We exchanged some private messages on this, but I'll post the explanation here
too in case anyone else is looking for it.

get_domain_id("L3"...) essentially gives us the number of the socket (on
platforms that have one L3 cache per socket; I still have to look into the
other ones). The problem is that, to get an accurate reading with SNC enabled,
we need to collect values from all nodes on the socket that contains the CPU
our test is running on. So we need to find the first node on that socket and
then loop through all the nodes on it. To do that, we multiply res_id by the
number of SNC nodes per socket (res_id *= snc_ways()) and that's it.
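
As a rough illustration of the indexing described above (a sketch only, not
the code from the patch): the helper names first_node_on_socket() and
sum_socket_occupancy() and the per_node[] array are hypothetical, and the
sketch assumes one L3 cache per socket with SNC nodes numbered contiguously,
socket by socket.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in the patch): index of the first SNC node on
 * the socket identified by l3_id. Assumes one L3 cache per socket and
 * nodes numbered contiguously socket by socket.
 */
static int first_node_on_socket(int l3_id, int snc_ways)
{
	assert(l3_id >= 0 && snc_ways > 0);
	return l3_id * snc_ways;
}

/*
 * Sum the occupancy of every SNC node on one socket. per_node[] is an
 * illustrative array indexed by node number; the real selftest would read
 * these values from the resctrl monitoring files instead.
 */
static unsigned long long sum_socket_occupancy(const unsigned long long *per_node,
					       int l3_id, int snc_ways)
{
	int first = first_node_on_socket(l3_id, snc_ways);
	unsigned long long total = 0;

	/* Walk the snc_ways nodes that share this socket's L3 cache. */
	for (int node = first; node < first + snc_ways; node++)
		total += per_node[node];

	return total;
}
```

For example, on a two-socket machine with two SNC nodes per socket, a CPU on
socket 1 maps to first node 1 * 2 = 2, so the values from nodes 2 and 3 are
summed.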

I'll add a helper with a comment explaining what it does in the next
version.

>
>-- 
> i.


-- 
Kind regards
Maciej Wieczór-Retman
