Message-Id: <20230525180209.19497-1-james.morse@arm.com>
Date: Thu, 25 May 2023 18:01:45 +0000
From: James Morse <james.morse@....com>
To: x86@...nel.org, linux-kernel@...r.kernel.org
Cc: Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Babu Moger <Babu.Moger@....com>,
James Morse <james.morse@....com>,
shameerali.kolothum.thodi@...wei.com,
D Scott Phillips OS <scott@...amperecomputing.com>,
carl@...amperecomputing.com, lcherian@...vell.com,
bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
Jamie Iles <quic_jiles@...cinc.com>,
Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com,
dfustini@...libre.com
Subject: [PATCH v4 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking
Hello!
Changes since v3: the patches at the end of the series are split up a little
more, and resctrl_arch_rmid_read() is now re-entrant. This is needed before a
nohz_full CPU can call it in process context and then be IPI'd from another
CPU. All other changes since v3 are noted in each patch.
~
This series does two things: it changes resctrl to call resctrl_arch_rmid_read()
in a way that works for MPAM, and it separates the locking so that the arch code
and filesystem code don't have to share a mutex. I tried to split this into two
series, but they touch similar call sites, so it would create more work.
(What's MPAM? See the cover letter of the first series. [1])
On x86 the RMID is an independent number. MPAM's equivalent is PMG, but this
isn't an independent number - it extends the PARTID (same as CLOSID) space
with bits that aren't used to select the configuration. The monitors can
then be told to match specific PMG values, allowing monitor-groups to be
created.
But, MPAM expects the monitors to always monitor by PARTID. The
Cache-storage-utilisation counters can only work this way.
(In the MPAM spec not setting the MATCH_PARTID bit is made CONSTRAINED
UNPREDICTABLE - which is Arm's term to mean portable software can't rely on
this)
It gets worse, as some SoCs may have very few PMG bits. I've seen the
datasheet for one that has a single bit of PMG space.
To be usable, MPAM's counters always need the PARTID and the PMG.
For resctrl, this means always making the CLOSID available when the RMID
is used.
To ensure RMIDs are always unique, this series combines the CLOSID and RMID
into an index, and manages RMIDs based on that. For x86, the index and RMID
would always be the same.
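As a rough illustration of the index idea above, here is a minimal userspace
sketch. The function names and EXAMPLE_NUM_PMG are hypothetical and chosen for
this example only; they are not the helpers added by the patches, and the MPAM
encoding shown is just one plausible way to combine the two values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value for illustration: a single bit of PMG space. */
#define EXAMPLE_NUM_PMG 2u

/* x86 flavour: the RMID is already unique, so the CLOSID is ignored. */
static inline uint32_t x86_style_idx_encode(uint32_t closid, uint32_t rmid)
{
	(void)closid;
	return rmid;
}

/*
 * MPAM flavour (assumed encoding): the PMG only qualifies a PARTID, so
 * both values have to be folded into the index to keep it unique.
 */
static inline uint32_t mpam_style_idx_encode(uint32_t closid, uint32_t rmid)
{
	assert(rmid < EXAMPLE_NUM_PMG);
	return closid * EXAMPLE_NUM_PMG + rmid;
}

/* Recover the CLOSID/RMID pair from an MPAM-style index. */
static inline void mpam_style_idx_decode(uint32_t idx, uint32_t *closid,
					 uint32_t *rmid)
{
	*closid = idx / EXAMPLE_NUM_PMG;
	*rmid = idx % EXAMPLE_NUM_PMG;
}
```

With one PMG bit, the same RMID value under two different CLOSIDs maps to two
different indexes, whereas the x86-style encoding returns the RMID unchanged.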
Currently the architecture specific code in the cpuhp callbacks takes the
rdtgroup_mutex. This means the filesystem code would have to export this
lock, resulting in an ill-defined interface between the two, and the possibility
of cross-architecture lock-ordering headaches.
The second part of this series adds a domain_list_lock to protect writes to the
domain list, and protects the domain list with RCU - or cpus_read_lock().
RCU is used to allow lockless readers of the domain list. Today resctrl only has
one, rdt_bit_usage_show(). But to get MPAM's monitors working, it's very likely
they'll need to be plumbed up to perf. The uncore PMU driver would be a second
lockless reader of the domain list.
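The writer-lock-plus-lockless-reader split can be sketched in userspace roughly
as below. This is not RCU and not the code in the patches: it only shows writers
serialised by a lock and readers walking the list without it, using C11 atomics.
All names are hypothetical, and removal (which in the kernel would need the RCU
list helpers and a grace period before freeing) is deliberately left out.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a resctrl domain. */
struct example_domain {
	int id;
	struct example_domain *_Atomic next;
};

static struct example_domain *_Atomic example_domain_list;
static pthread_mutex_t example_domain_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: serialised by the lock. Link the entry first, then publish it. */
static inline void example_domain_add(struct example_domain *d)
{
	pthread_mutex_lock(&example_domain_list_lock);
	atomic_store_explicit(&d->next,
			      atomic_load_explicit(&example_domain_list,
						   memory_order_relaxed),
			      memory_order_relaxed);
	/* Release ordering makes the initialised entry visible to readers. */
	atomic_store_explicit(&example_domain_list, d, memory_order_release);
	pthread_mutex_unlock(&example_domain_list_lock);
}

/* Reader: takes no lock, only acquire loads of the published pointers. */
static inline struct example_domain *example_domain_find(int id)
{
	struct example_domain *d;

	d = atomic_load_explicit(&example_domain_list, memory_order_acquire);
	while (d && d->id != id)
		d = atomic_load_explicit(&d->next, memory_order_acquire);
	return d;
}
```

A reader such as a perf driver would call the lookup without ever touching the
writers' lock; what this sketch cannot show is how a removed entry stays valid
until all such readers have finished, which is the part RCU provides.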
This series is based on v6.4-rc1, and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/monitors_and_locking/v4
Bugs welcome,
Thanks,
James
[1] https://lore.kernel.org/lkml/20210728170637.25610-1-james.morse@arm.com/
[v1] https://lore.kernel.org/all/20221021131204.5581-1-james.morse@arm.com/
[v2] https://lore.kernel.org/lkml/20230113175459.14825-1-james.morse@arm.com/
[v3] https://lore.kernel.org/r/20230320172620.18254-1-james.morse@arm.com
James Morse (24):
  x86/resctrl: Track the closid with the rmid
  x86/resctrl: Access per-rmid structures by index
  x86/resctrl: Create helper for RMID allocation and mondata dir
    creation
  x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare()
  x86/resctrl: Allow RMID allocation to be scoped by CLOSID
  x86/resctrl: Track the number of dirty RMID a CLOSID has
  x86/resctrl: Use set_bit()/clear_bit() instead of open coding
  x86/resctrl: Allocate the cleanest CLOSID by searching
    closid_num_dirty_rmid
  x86/resctrl: Move CLOSID/RMID matching and setting to use helpers
  tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef
  x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow
  x86/resctrl: Make resctrl_arch_rmid_read() retry when it is
    interrupted
  x86/resctrl: Queue mon_event_read() instead of sending an IPI
  x86/resctrl: Allow resctrl_arch_rmid_read() to sleep
  x86/resctrl: Allow arch to allocate memory needed in
    resctrl_arch_rmid_read()
  x86/resctrl: Make resctrl_mounted checks explicit
  x86/resctrl: Move alloc/mon static keys into helpers
  x86/resctrl: Make rdt_enable_key the arch's decision to switch
  x86/resctrl: Add helpers for system wide mon/alloc capable
  x86/resctrl: Add cpu online callback for resctrl work
  x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but
    cpu
  x86/resctrl: Add cpu offline callback for resctrl work
  x86/resctrl: Move domain helper migration into resctrl_offline_cpu()
  x86/resctrl: Separate arch and fs resctrl locks
arch/x86/include/asm/resctrl.h | 90 +++++
arch/x86/kernel/cpu/resctrl/core.c | 78 ++--
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 45 ++-
arch/x86/kernel/cpu/resctrl/internal.h | 81 ++++-
arch/x86/kernel/cpu/resctrl/monitor.c | 414 ++++++++++++++++------
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 15 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 335 ++++++++++++-----
include/linux/resctrl.h | 42 ++-
include/linux/tick.h | 9 +-
9 files changed, 846 insertions(+), 263 deletions(-)
--
2.39.2