Message-ID: <a148b764-0609-433e-b6e9-932493f6c1b1@intel.com>
Date: Tue, 29 Jul 2025 09:49:43 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Hc Zheng <zhenghc00@...il.com>, Fenghua Yu <fenghua.yu@...el.com>, "Thomas
Gleixner" <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, "Borislav
Petkov" <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, Babu Moger
<babu.moger@....com>
CC: <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [Bug] x86/resctrl: unexpect mbm_local_bytes/mbm_total_bytes delta
on AMD with multiple RMIDs in the same domain
+Babu
Hi Huaicheng Zheng,
On 7/29/25 12:53 AM, Hc Zheng wrote:
> Hi All,
>
> We have enabled resctrl on our container platform. We notice some
> unexpected behaviors when multiple containers run in the same L3 domain:
> the mbm_local_bytes/mbm_total_bytes counters for such mon_groups return
> Unavailable, or the delta between two consecutive reads is out of the
> normal range (e.g. 1000+ GB/s).
>
> After reading the AMD PQoS manual, it says:
> """
> Potential causes of the “U” bit being set include
> (but are not limited to):
>
> • RMID is not currently tracked by the hardware.
> • RMID was not tracked by the hardware at some time since it was last read.
> • RMID has not been read since it started being tracked by the hardware.
> """
>
> but there is no explanation for the unexpectedly large delta between two
> reads of the counters. After examining the kernel code, I suspect this is
> more likely to be a hardware bug.
>
> Here are the steps to reproduce it:
>
> 1. Create the mon_groups
>
> $ for i in `seq 0 99`;do mkdir -p /sys/fs/resctrl/amdtest/mon_groups/test$i;done
>
> 2. Run the stress command below and assign each PID to its own mon_group
> (I ran this test on AMD Genoa; CPUs 16-23,208-215 are on CCD 8)
>
> $ cat stress.sh
> nohup numactl -C 16-23,208-215 stress -m 1 --vm-hang 1 > /dev/null &
> lastPid=$!
> echo $lastPid > /sys/fs/resctrl/amdtest/tasks
> echo $lastPid > /sys/fs/resctrl/amdtest/mon_groups/test$1/tasks
> $ for i in `seq 0 99`;do bash stress.sh $i ;done
>
> 3. Watch the resctrl counter every 10 seconds
>
> $ while true; do
>       cat /sys/fs/resctrl/amdtest/mon_groups/test9/mon_data/mon_L3_08/mbm_local_bytes
>       sleep 10
>   done
>
> ...
> Unavailable
> Unavailable
> Unavailable
> 61924495182825856
> 64176294690029568
> Unavailable
> Unavailable
> Unavailable
> ...
>
> At some point the delta between two consecutive reads is out of the
> normal range: (64176294690029568 - 61924495182825856) / 1024 / 1024 /
> 1024 / 10 = 209715 GB/s
>
> If I lower the concurrency to 59 groups or fewer, the delta stays in the
> normal range and Unavailable is never returned. I have also tested on an
> AMD Rome CPU and the problem still exists.
> I have tried this on an Intel platform; it does not have this problem,
> even with over 200 RMIDs concurrently being monitored.
>
> I cannot find any documentation about the maximum number of RMIDs that
> AMD hardware can concurrently track, or an explanation for this problem.
> I believe this could become even more severe on future AMD CPUs with more
> threads, as we will run more workloads on a single server.
>
> Can someone help me solve this problem? Thanks.
It looks to me as though you are encountering the issue that is addressed with AMD's
Assignable Bandwidth Monitoring Counters (ABMC) feature that Babu is currently enabling
in resctrl [1]. The feature itself is well documented in that series and includes links to
the AMD spec where you can learn more.
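
In the meantime, a quick way to see what you are asking the hardware to do is to
compare the number of RMIDs resctrl can hand out per L3 domain with the number of
monitoring groups you created; without ABMC the hardware gives no guarantee that
all of those RMIDs are tracked at the same time. A rough sketch (untested, assuming
the default /sys/fs/resctrl mount and your "amdtest" group layout):

  # RMIDs resctrl can allocate per L3 domain
  $ cat /sys/fs/resctrl/info/L3_MON/num_rmids
  # monitoring groups created under the "amdtest" control group
  # (each one consumes an RMID)
  $ ls -d /sys/fs/resctrl/amdtest/mon_groups/*/ | wc -l
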
You show that "Unavailable" is encountered when reading these counters from user
space, and from that I deduce that resctrl's internal MBM overflow handler (it runs
once per second) likely encounters the same error, with the consequence that
overflows of the counter are not handled correctly.
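
If you want a rough confirmation of that, you could read the counter at the same
one-second cadence the overflow handler uses and see how often "Unavailable" shows
up at that rate (an untested sketch reusing your test9 group; note that each
successful read from user space also updates resctrl's saved counter value):

  $ while true; do
        cat /sys/fs/resctrl/amdtest/mon_groups/test9/mon_data/mon_L3_08/mbm_local_bytes
        sleep 1
    done
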
If you do have access to the AMD hardware with this feature, please do take a look at
the resctrl support for it and try it out. We would all appreciate your feedback to ensure
resctrl supports it well.
Reinette
[1] https://lore.kernel.org/lkml/cover.1753467772.git.babu.moger@amd.com/