lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c06dbac5-f849-4259-b395-83ba1c98c557@shopee.com>
Date: Thu, 29 Feb 2024 11:16:04 +0800
From: Haifeng Xu <haifeng.xu@...pee.com>
To: Reinette Chatre <reinette.chatre@...el.com>,
 James Morse <james.morse@....com>
Cc: fenghua.yu@...el.com, babu.moger@....com, peternewman@...gle.com,
 x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] x86/resctrl: Add tracepoint for llc_occupancy
 tracking



On 2024/2/24 03:41, Reinette Chatre wrote:
> (+James)
> 
> Hi Haifeng and James,
> 
> On 2/21/2024 1:21 AM, Haifeng Xu wrote:
>> In our production environment, after removing monitor groups, those unused
>> RMIDs get stuck in the limbo list forever because their llc_occupancy are
>> always larger than the threshold. But the unused RMIDs can be successfully
>> freed by turning up the threshold.
>>
>> In order to know how much the threshold should be, the following steps can
>> be taken to acquire the llc_occupancy of RMIDs in each rdt domain:
>>
>> 1) perf probe -a '__rmid_read eventid rmid'
>>    perf probe -a '__rmid_read%return $retval'
>> 2) perf record -e probe:__rmid_read -e probe:__rmid_read__return -aR sleep 10
>> 3) perf script > __rmid_read.txt
>> 4) cat __rmid_read.txt | grep "eventid=0x1 " -A 1 | grep "kworker" > llc_occupnacy.txt
>>
> 
> The details on how perf can be used was useful during the discussion of this
> work but can be omitted from this changelog.
 
Got it.

> 
>> Instead of using perf tool to track llc_occupancy and filter the log manually,
>> it is more convenient for users to use tracepoint to do this work. So add a new
>> tracepoint that shows the llc_occupancy of busy RMIDs when scanning the limbo
>> list.
>>
>> Signed-off-by: Haifeng Xu <haifeng.xu@...pee.com>
>> Suggested-by: Reinette Chatre <reinette.chatre@...el.com>
>> ---
>>  arch/x86/kernel/cpu/resctrl/monitor.c |  2 ++
>>  arch/x86/kernel/cpu/resctrl/trace.h   | 13 +++++++++++++
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index f136ac046851..1533b1932b49 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -23,6 +23,7 @@
>>  #include <asm/resctrl.h>
>>  
>>  #include "internal.h"
>> +#include "trace.h"
>>  
>>  struct rmid_entry {
>>  	u32				rmid;
>> @@ -302,6 +303,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
>>  			}
>>  		}
>>  		crmid = nrmid + 1;
>> +		trace_mon_llc_occupancy_limbo(nrmid, d->id, val);
> 
> This area recently received some changes (you can find the latest on the
> x86/cache branch of the tip repo). Please see [1] for a good
> description of the new "index". For this tracing to be useful to MPAM
> I thus expect that the tracepoint will need to print the MPAM equivalent
> to CLOSID, the PARTID. We can refer to this CLOSID/PARTID value as
> "ctrl_hw_id".
> 
> This snippet can then change to use the new resctrl_arch_rmid_idx_decode()
> to learn the "ctrl_hw_id" and "mon_hw_id" and print it as part of
> tracepoint:
> "ctrl_hw_id=%u mon_hw_id=%u domain=%d llc_occupancy=%llu"

OK, I'll post a new patch based on tip repo.

> 
> This will be filesystem code so it cannot know how an architecture
> treats these numbers. Consequently, this may look strange to x86 users
> when ctrl_hw_id will always be X86_RESCTRL_EMPTY_CLOSID ... but it should
> be clear that it is invalid? 
> 
> James, what do you think? Any thoughts on how MPAM will use the limbo handler
> to understand what information can be useful to the user here?
> 
> Reinette
> 
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_lkml_20240213184438.16675-2D7-2Djames.morse-40arm.com_&d=DwICaQ&c=R1GFtfTqKXCFH-lgEPXWwic6stQkW4U7uVq33mt-crw&r=3uoFsejk1jN2oga47MZfph01lLGODc93n4Zqe7b0NRk&m=Grl-QGKKyzz601g4WQFhPFVML6pju3g8CUGyD2VF8r8BUlO_caHlZMafoTxW9iYc&s=ToJ7E8_Afpnn5zh-c-CVReg4WqM-T0pEgB9hN6ntj1A&e= 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ