Message-ID: <CAHCEFEyd0Y+wTrLWNMUNvwgJrCxAi66D17w3Zg-ikH5005k1-w@mail.gmail.com>
Date: Tue, 29 Jul 2025 15:53:27 +0800
From: Hc Zheng <zhenghc00@...il.com>
To: Fenghua Yu <fenghua.yu@...el.com>, Reinette Chatre <reinette.chatre@...el.com>, 
	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>
Cc: x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: [Bug] x86/resctrl: unexpect mbm_local_bytes/mbm_total_bytes delta on
 AMD with multiple RMIDs in the same domain

Hi All,

We have enabled resctrl on our container platform and have noticed some
unexpected behavior when multiple containers run in the same L3 domain:
mbm_local_bytes/mbm_total_bytes for such mon_groups either return
Unavailable, or the delta between two consecutive reads is far outside
the normal range (e.g. 1000+ GB/s).

The AMD PQoS manual says:
"""
Potential causes of the “U” bit being set include
(but are not limited to):

• RMID is not currently tracked by the hardware.
• RMID was not tracked by the hardware at some time since it was last read.
• RMID has not been read since it started being tracked by the hardware.
"""

However, it gives no explanation for the unexpectedly large delta between
two reads of the counters. After examining the kernel code, I suspect this
is more likely a hardware bug.

Here are the steps to reproduce it:

1. Create the mon_groups

$ for i in `seq 0 99`;do mkdir -p /sys/fs/resctrl/amdtest/mon_groups/test$i;done

2. Run the stress command and assign its PID to each mon_group (I ran
this test on AMD Genoa; CPUs 16-23,208-215 are on CCD 8)

$ cat stress.sh
nohup numactl -C 16-23,208-215 stress -m  1 --vm-hang 1 > /dev/null &
lastPid=$!
echo $lastPid > /sys/fs/resctrl/amdtest/tasks
echo $lastPid > /sys/fs/resctrl/amdtest/mon_groups/test$1/tasks
$ for i in `seq 0 99`;do bash stress.sh $i ;done

3. Watch the resctrl counter every 10 seconds

$ while true; do cat /sys/fs/resctrl/amdtest/mon_groups/test9/mon_data/mon_L3_08/mbm_local_bytes; sleep 10; done

...
Unavailable
Unavailable
Unavailable
61924495182825856
64176294690029568
Unavailable
Unavailable
Unavailable
...

At some point the delta between two consecutive reads is far outside the
normal range: (64176294690029568 - 61924495182825856) / 1024 / 1024 / 1024 /
10 = 209715 GB/s
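
For anyone who wants to repeat the arithmetic above on their own readings, here is a minimal bash sketch. The function name mbm_rate_gib is hypothetical (it is not part of resctrl or any tool); it assumes the counter is read as a plain decimal byte count and treats anything else, such as "Unavailable", as an invalid sample.

```shell
#!/bin/bash
# Hypothetical helper: average rate in whole GiB/s between two counter reads.
# Usage: mbm_rate_gib PREV_BYTES CUR_BYTES INTERVAL_SECONDS
# Returns 1 (and prints nothing) if any argument is not a plain decimal
# number, e.g. when a read returned "Unavailable".
mbm_rate_gib() {
    local prev=$1 cur=$2 interval=$3
    local v
    for v in "$prev" "$cur" "$interval"; do
        case $v in
            ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
        esac
    done
    [ "$interval" -gt 0 ] || return 1
    # Integer division; 1073741824 = 1024 * 1024 * 1024.
    echo $(( (cur - prev) / 1073741824 / interval ))
}
```

With the two readings above, `mbm_rate_gib 61924495182825856 64176294690029568 10` prints 209715. The raw delta is 2251799507203712 bytes, which is just under 2^51 (2251799813685248).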

If I lower the concurrency to about 59 or fewer, the delta stays in the
normal range and reads never return Unavailable. I have also tested on an
AMD Rome CPU, and the problem exists there as well.
I have tried this on an Intel platform; it does not have this problem,
even with more than 200 RMIDs being monitored concurrently.
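
To see how many mon_groups are affected at a given concurrency level, something like the following bash sketch could be used. The function name count_unavailable is hypothetical; the directory layout is the one from the reproduction steps, and the sketch assumes an affected read yields the literal string "Unavailable".

```shell
#!/bin/bash
# Hypothetical helper: count mon_groups whose counter currently reads
# "Unavailable". Prints "<unavailable> <total>".
# Usage: count_unavailable MON_GROUPS_DIR DOMAIN EVENT
count_unavailable() {
    local groups_dir=$1 domain=$2 event=$3
    local total=0 unavailable=0 f value
    for f in "$groups_dir"/*/mon_data/"$domain"/"$event"; do
        [ -r "$f" ] || continue          # skip unmatched glob or unreadable file
        value=$(cat "$f" 2>/dev/null)
        total=$((total + 1))
        if [ "$value" = "Unavailable" ]; then
            unavailable=$((unavailable + 1))
        fi
    done
    echo "$unavailable $total"
}
```

For the setup above this would be called as `count_unavailable /sys/fs/resctrl/amdtest/mon_groups mon_L3_08 mbm_local_bytes`.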

I cannot find any documentation of the maximum number of RMIDs that AMD
hardware can track concurrently, or any explanation for this problem.
I believe this could become even more severe on AMD as thread counts grow,
since we will run more workloads on a single server.

Can someone help me solve this problem? Thanks.


Best Regards
Huaicheng Zheng
