Message-ID: <a148b764-0609-433e-b6e9-932493f6c1b1@intel.com>
Date: Tue, 29 Jul 2025 09:49:43 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Hc Zheng <zhenghc00@...il.com>, Fenghua Yu <fenghua.yu@...el.com>, "Thomas
 Gleixner" <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, "Borislav
 Petkov" <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, Babu Moger
	<babu.moger@....com>
CC: <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [Bug] x86/resctrl: unexpect mbm_local_bytes/mbm_total_bytes delta
 on AMD with multiple RMIDs in the same domain

+Babu

Hi Huaicheng Zheng,

On 7/29/25 12:53 AM, Hc Zheng wrote:
> Hi All,
> 
> We have enabled resctrl on our container platform. We notice some unexpected
> behavior when multiple containers run in the same L3 domain:
> mbm_local_bytes/mbm_total_bytes for such mon_groups return
> Unavailable, or the delta between two consecutive reads is out of the normal
> range (e.g. 1000+ GB/s).
> 
> After reading the AMD PQoS manual, I found that it says:
> """
> Potential causes of the “U” bit being set include
> (but are not limited to):
> 
> • RMID is not currently tracked by the hardware.
> • RMID was not tracked by the hardware at some time since it was last read.
> • RMID has not been read since it started being tracked by the hardware.
> """
> 
> but it gives no explanation for the unexpectedly large delta between two reads
> of the counters. After examining the kernel code, I suspect this is more likely
> to be a hardware bug.
> 
> Here are the steps to reproduce it:
> 
> 1. create mon_groups
> 
> $ for i in `seq 0 99`;do mkdir -p /sys/fs/resctrl/amdtest/mon_groups/test$i;done
> 
> 2. run the stress command and assign each PID to one of the mon_groups (I
> ran this test on AMD Genoa; CPUs 16-23,208-215 are on CCD 8)
> 
> $ cat stress.sh
> nohup numactl -C 16-23,208-215 stress -m  1 --vm-hang 1 > /dev/null &
> lastPid=$!
> echo $lastPid > /sys/fs/resctrl/amdtest/tasks
> echo $lastPid > /sys/fs/resctrl/amdtest/mon_groups/test$1/tasks
> $ for i in `seq 0 99`;do bash stress.sh $i ;done
> 
> 3. watch the resctrl counter every 10 seconds
> 
> $ while true; do cat /sys/fs/resctrl/amdtest/mon_groups/test9/mon_data/mon_L3_08/mbm_local_bytes; sleep 10; done
> 
> ...
> Unavailable
> Unavailable
> Unavailable
> 61924495182825856
> 64176294690029568
> Unavailable
> Unavailable
> Unavailable
> ...
> 
> At some point the delta between two consecutive reads is out of the normal
> range: (64176294690029568 - 61924495182825856) / 1024 / 1024 / 1024 /
> 10 = 209715 GiB/s
> 
> If I lower the concurrency to 59 or fewer, the delta stays in the normal
> range and reads never return Unavailable. I have also tested on an AMD Rome
> CPU, and the problem exists there too.
> I have tried this on an Intel platform; it does not have this problem, even
> with over 200 RMIDs being monitored concurrently.
> 
> I cannot find any documentation on the maximum number of RMIDs that AMD
> hardware can track concurrently, or an explanation for this problem.
> I believe this could become even more severe on AMD with more threads in
> the future, as we will run more workloads on a single server.
> 
> Can someone help me solve this problem? Thanks.
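
For reference, the rate arithmetic in the report can be re-checked with plain
shell integer arithmetic. This is only a sketch: rate_gib_per_s is a
hypothetical helper name, it assumes a shell with 64-bit arithmetic, and it
truncates to whole GiB/s.

```shell
#!/bin/sh
# Hypothetical helper (not part of resctrl): average rate in whole GiB/s
# between two reads of a byte counter.
#   $1 = earlier read, $2 = later read, $3 = seconds between the reads
rate_gib_per_s() {
    echo $(( ($2 - $1) / 1024 / 1024 / 1024 / $3 ))
}

# The two numeric reads quoted in the report, taken 10 seconds apart.
rate_gib_per_s 61924495182825856 64176294690029568 10
```

The raw delta between the two reads is 2251799507203712 bytes, which is a
little under 2^51 (2251799813685248). That is an observation from the numbers
only, not a diagnosis.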

It looks to me as though you are encountering the issue that is addressed with AMD's
Assignable Bandwidth Monitoring Counters (ABMC) feature that Babu is currently enabling
in resctrl [1]. The feature itself is well documented in that series and includes links to
the AMD spec where you can learn more.
You show that "Unavailable" is encountered when reading these counters from user
space. From that I deduce that resctrl's internal MBM overflow handler (which runs once
per second) likely encounters the same error, with the consequence that overflows of the
counter are not handled correctly.
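
To illustrate from user space why an "Unavailable" read matters for deltas,
here is a minimal sketch of a filter that only trusts a delta when both the
previous and the current read are numeric, and that flags deltas outside a
caller-chosen bound. The name check_deltas and the bound are assumptions made
for illustration; this is not how resctrl's overflow handler is implemented.

```shell
#!/bin/sh
# Hypothetical user-space filter (not part of resctrl).
#   stdin: one counter read per line, taken at a fixed interval
#   $1:    largest delta in bytes considered plausible for one interval
# The body runs in a subshell so its variables do not leak to the caller.
check_deltas() (
    max_delta=$1
    prev=""
    while IFS= read -r line; do
        case $line in
            ''|*[!0-9]*)
                # "Unavailable" or any other non-numeric read:
                # drop the baseline so no delta spans the gap.
                prev=""
                continue
                ;;
        esac
        if [ -n "$prev" ]; then
            delta=$((line - prev))
            if [ "$delta" -lt 0 ] || [ "$delta" -gt "$max_delta" ]; then
                echo "suspect $delta"
            else
                echo "ok $delta"
            fi
        fi
        prev=$line
    done
)

# Reads from the report; the 10 TiB bound per interval is an arbitrary assumption.
printf '%s\n' Unavailable 61924495182825856 64176294690029568 Unavailable |
    check_deltas 10995116277760
```

With the reads from the report, the one numeric pair is still reported as
suspect, so skipping "Unavailable" alone does not make the surrounding values
trustworthy.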

If you do have access to the AMD hardware with this feature, please do take a look at
the resctrl support for it and try it out. We would all appreciate your feedback to ensure
resctrl supports it well.

Reinette 

[1] https://lore.kernel.org/lkml/cover.1753467772.git.babu.moger@amd.com/

