Message-ID: <4085f4f3-7a3e-8af8-1ae3-1040ca78f59f@intel.com>
Date: Wed, 9 Aug 2023 15:41:01 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: James Morse <james.morse@....com>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>
CC: Fenghua Yu <fenghua.yu@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Babu Moger <Babu.Moger@....com>,
<shameerali.kolothum.thodi@...wei.com>,
D Scott Phillips OS <scott@...amperecomputing.com>,
<carl@...amperecomputing.com>, <lcherian@...vell.com>,
<bobo.shaobowang@...wei.com>, <tan.shaopeng@...itsu.com>,
<xingxin.hx@...nanolis.org>, <baolin.wang@...ux.alibaba.com>,
Jamie Iles <quic_jiles@...cinc.com>,
Xin Hao <xhao@...ux.alibaba.com>, <peternewman@...gle.com>,
<dfustini@...libre.com>
Subject: Re: [PATCH v5 24/24] x86/resctrl: Separate arch and fs resctrl locks

Hi James,

On 7/28/2023 9:42 AM, James Morse wrote:
> resctrl has one mutex that is taken by the architecture specific code,
> and the filesystem parts. The two interact via cpuhp, where the
> architecture code updates the domain list. Filesystem handlers that
> walk the domains list should not run concurrently with the cpuhp
> callback modifying the list.
>
> Exposing a lock from the filesystem code means the interface is not
> cleanly defined, and creates the possibility of cross-architecture
> lock ordering headaches. The interaction only exists so that certain
> filesystem paths are serialised against cpu hotplug. The cpu hotplug
cpu hotplug -> CPU hotplug
> code already has a mechanism to do this using cpus_read_lock().
>
> MPAM's monitors have an overflow interrupt, so it needs to be possible
> to walk the domains list in irq context. RCU is ideal for this,
> but some paths need to be able to sleep to allocate memory.
>
> Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part
> of a cpuhp callback, cpus_read_lock() must always be taken first.
> rdtgroup_schemata_write() already does this.
>
> Most of the filesystem code's domain list walkers are currently
> protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live().
> The exceptions are rdt_bit_usage_show() and the mon_config helpers
> which take the lock directly.
>
> Make the domain list protected by RCU. An architecture-specific
> lock prevents concurrent writers. rdt_bit_usage_show() can
> walk the domain list under rcu_read_lock(). The mon_config helpers
> send multiple IPIs, take the cpus_read_lock() in these cases.
>
> The other filesystem list walkers need to be able to sleep.
> Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the
> cpuhp callbacks can't be invoked when file system operations are
> occurring.
>
> Add lockdep_assert_cpus_held() in the cases where the
> rdtgroup_kn_lock_live() call isn't obvious.
>
> Resctrl's domain online/offline calls now need to take the
> rdtgroup_mutex themselves.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@...itsu.com>
> Signed-off-by: James Morse <james.morse@....com>
...
> @@ -464,6 +467,9 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
> bool sep = false;
> u32 ctrl_val;
>
> + /* Walking r->domains, ensure it can't race with cpuhp */
> + lockdep_assert_cpus_held();
> +
> seq_printf(s, "%*s:", max_name_width, schema->name);
> list_for_each_entry(dom, &r->domains, list) {
> if (sep)
> @@ -534,8 +540,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> {
> int cpu;
>
> - /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */
> - lockdep_assert_held(&rdtgroup_mutex);
> + /* When picking a cpu from cpu_mask, ensure it can't race with cpuhp */
cpu -> CPU
> + lockdep_assert_cpus_held();
>
> /*
> * Setup the parameters to pass to mon_event_count() to read the data.
...
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index a256a96df487..47dcf2cb76ca 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -35,6 +35,10 @@
> DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
> DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key);
> DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
> +
> +/* Mutex to protect rdtgroup access. */
> +DEFINE_MUTEX(rdtgroup_mutex);
> +
> static struct kernfs_root *rdt_root;
> struct rdtgroup rdtgroup_default;
> LIST_HEAD(rdt_all_groups);
> @@ -954,7 +958,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>
> mutex_lock(&rdtgroup_mutex);
> hw_shareable = r->cache.shareable_bits;
> - list_for_each_entry(dom, &r->domains, list) {
> + rcu_read_lock();
> + list_for_each_entry_rcu(dom, &r->domains, list) {
> if (sep)
> seq_putc(seq, ';');
> sw_shareable = 0;
Does rdt_bit_usage_show() really need RCU? It is another filesystem callback and I
do not see a reason why it should access the domain list in a different way. It
can follow the same pattern as all the other resctrl filesystem ops and use
cpus_read_lock().
> @@ -1010,8 +1015,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
> }
> sep = true;
> }
> + rcu_read_unlock();
> seq_putc(seq, '\n');
> mutex_unlock(&rdtgroup_mutex);
> +
Unnecessary empty line.
> return 0;
> }
Reinette