linux-kernel - Re: [PATCH v3 18/19] x86/resctrl: Add cpu offline callback for resctrl work

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9282d346-80eb-88bd-b9d0-0c3dd16d51db@arm.com>
Date:   Thu, 27 Apr 2023 15:20:31 +0100
From:   James Morse <james.morse@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Babu Moger <Babu.Moger@....com>,
        shameerali.kolothum.thodi@...wei.com,
        D Scott Phillips OS <scott@...amperecomputing.com>,
        carl@...amperecomputing.com, lcherian@...vell.com,
        bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
        xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
        Jamie Iles <quic_jiles@...cinc.com>,
        Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com
Subject: Re: [PATCH v3 18/19] x86/resctrl: Add cpu offline callback for
 resctrl work

Hi Reinette,

On 06/04/2023 00:48, Reinette Chatre wrote:
> On 3/20/2023 10:26 AM, James Morse wrote:
> 
>> -static int resctrl_offline_cpu(unsigned int cpu)
>> -{
>> -	struct rdtgroup *rdtgrp;
>>  	struct rdt_resource *r;
>>  
>>  	mutex_lock(&rdtgroup_mutex);
>> +	resctrl_offline_cpu(cpu);
>> +
>>  	for_each_capable_rdt_resource(r)
>>  		domain_remove_cpu(cpu, r);
>> -	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
>> -		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
>> -			clear_childcpus(rdtgrp, cpu);
>> -			break;
>> -		}
>> -	}
>>  	clear_closid_rmid(cpu);
>>  	mutex_unlock(&rdtgroup_mutex);
>>  
> 
> I find this and the previous patch to be very complicated.

It consolidates the parts of this that have nothing to do with the architecture specific code.
The extra work is because the semantics are: "this CPU is going away", the callee needs to
to not pick 'this CPU' again when updating any structures.

Ensuring the structures have not yet been modified by the architecture code is the
cleanest interface as there is nothing special about what the arch code provides to the
filesystem here.

I agree it looks like a special case, but only because the existing code is being called
halfway through the tear down, and depends on what the arch code has already done.

Having a single call, where nothing has been changed yet is the most maintainable option
as it avoids extra hooks, or an incomplete list of what has been torn down, and what
hasn't - some of which may be architecture specific.

It also avoids any interaction with how the architecture code chooses to prevent multiple
writers to the domain list - I don't want any of the filesystem code to depend on a lock
held by the architecture specific code.

> It is not clear
> to me why resctrl_offline_cpu(cpu) is required to be before offline of domain.
> Previous patch would not be needed if the existing order of operations
> is maintained.

The existing order is a bit of a soup.

You'd need a resctrl_domain_rebalance_helpers() to move the limbo and mbm workers, but
this would run after the CPU had been removed from the domain. Hopefully the name conveys
that it doesn't always run when a CPU is going offline.
resctrl_offline_cpu() would potentially run after the CPUs domains have been free()d,
depending on what gets added in the future this might be a problem, leading to a
resctrl_pre_offline_cpu() hook.

I worry this strange state leads to extra special-case'd filesystem code, and extra hooks.

I can split the consolidation of the filesystem code up in this patch, the
clear_childcpus() and limbo/mbm stuff can be done in separate patches, which might make it
easier on the eye.

Thanks,

James