linux-kernel - Re: [PATCH v4 1/7] x86/resctrl: Create separate domains for control and monitoring

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <da2c0e45-56d0-e04d-774d-4292d156e1d0@intel.com>
Date:   Mon, 28 Aug 2023 10:05:31 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Tony Luck <tony.luck@...el.com>
CC:     Fenghua Yu <fenghua.yu@...el.com>,
        Peter Newman <peternewman@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <skhan@...uxfoundation.org>, <x86@...nel.org>,
        Shaopeng Tan <tan.shaopeng@...itsu.com>,
        James Morse <james.morse@....com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        Babu Moger <babu.moger@....com>,
        Randy Dunlap <rdunlap@...radead.org>,
        <linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <patches@...ts.linux.dev>
Subject: Re: [PATCH v4 1/7] x86/resctrl: Create separate domains for control
 and monitoring

Hi Tony,

On 8/25/2023 10:05 AM, Tony Luck wrote:
> On Fri, Aug 11, 2023 at 10:29:25AM -0700, Reinette Chatre wrote:
>> On 7/22/2023 12:07 PM, Tony Luck wrote:

>>> Change all places where monitoring functions walk the list of
>>> domains to use the new "mondomains" list instead of the old
>>> "domains" list.
>>
>> I would not refer to it as "the old domains list" as it creates
>> impression that this is being replaced. The changelog makes
>> no mention that domains list will remain and be dedicated to
>> control domains. I think this is important to include in description
>> of this change.
> 
> I've rewritten the entire commit message incorporating your suggestions.
> V6 will be posted soon (after I get some time on an SNC SPR to check
> that it all works!)

I seem to have missed v5.

> 
>>
>>>
>>> Signed-off-by: Tony Luck <tony.luck@...el.com>
>>> ---
>>>  include/linux/resctrl.h                   |  10 +-
>>>  arch/x86/kernel/cpu/resctrl/internal.h    |   2 +-
>>>  arch/x86/kernel/cpu/resctrl/core.c        | 195 +++++++++++++++-------
>>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |   2 +-
>>>  arch/x86/kernel/cpu/resctrl/monitor.c     |   2 +-
>>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  30 ++--
>>>  6 files changed, 167 insertions(+), 74 deletions(-)
>>>
>>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>>> index 8334eeacfec5..1267d56f9e76 100644
>>> --- a/include/linux/resctrl.h
>>> +++ b/include/linux/resctrl.h
>>> @@ -151,9 +151,11 @@ struct resctrl_schema;
>>>   * @mon_capable:	Is monitor feature available on this machine
>>>   * @num_rmid:		Number of RMIDs available
>>>   * @cache_level:	Which cache level defines scope of this resource
>>> + * @mon_scope:		Scope of this resource if different from cache_level
>>
>> I think this addition should be deferred. As it is here it the "if different
>> from cache_level" also creates many questions (when will it be different?
>> how will it be determined that the scope is different in order to know that
>> mon_scope should be used?)
> 
> I've gone in a different direction. V6 renames "cache_level" to
> "ctrl_scope". I think this makes intent clear from step #1.
>

This change is not clear to me. Previously you changed this name
but kept using it in code specific to cache levels. It is not clear
to me how this time's rename would be different.
 
...

>>> @@ -502,61 +593,19 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
>>>   */
>>>  static void domain_add_cpu(int cpu, struct rdt_resource *r)
>>>  {
>>> -	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
>>> -	struct list_head *add_pos = NULL;
>>> -	struct rdt_hw_domain *hw_dom;
>>> -	struct rdt_domain *d;
>>> -	int err;
>>> -
>>> -	d = rdt_find_domain(r, id, &add_pos);
>>> -	if (IS_ERR(d)) {
>>> -		pr_warn("Couldn't find cache id for CPU %d\n", cpu);
>>> -		return;
>>> -	}
>>> -
>>> -	if (d) {
>>> -		cpumask_set_cpu(cpu, &d->cpu_mask);
>>> -		if (r->cache.arch_has_per_cpu_cfg)
>>> -			rdt_domain_reconfigure_cdp(r);
>>> -		return;
>>> -	}
>>> -
>>> -	hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
>>> -	if (!hw_dom)
>>> -		return;
>>> -
>>> -	d = &hw_dom->d_resctrl;
>>> -	d->id = id;
>>> -	cpumask_set_cpu(cpu, &d->cpu_mask);
>>> -
>>> -	rdt_domain_reconfigure_cdp(r);
>>> -
>>> -	if (r->alloc_capable && domain_setup_ctrlval(r, d)) {
>>> -		domain_free(hw_dom);
>>> -		return;
>>> -	}
>>> -
>>> -	if (r->mon_capable && arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
>>> -		domain_free(hw_dom);
>>> -		return;
>>> -	}
>>> -
>>> -	list_add_tail(&d->list, add_pos);
>>> -
>>> -	err = resctrl_online_domain(r, d);
>>> -	if (err) {
>>> -		list_del(&d->list);
>>> -		domain_free(hw_dom);
>>> -	}
>>> +	if (r->alloc_capable)
>>> +		domain_add_cpu_ctrl(cpu, r);
>>> +	if (r->mon_capable)
>>> +		domain_add_cpu_mon(cpu, r);
>>>  }
>>
>> A resource could be both alloc and mon capable ... both
>> domain_add_cpu_ctrl() and domain_add_cpu_mon() can fail.
>> Should domain_add_cpu_mon() still be run for a CPU if
>> domain_add_cpu_ctrl() failed? 
>>
>> Looking ahead the CPU should probably also not be added
>> to the default groups mask if a failure occurred.
> 
> Existing code doesn't do anything for the case where a CPU
> can't be added to a domain (probably the only real error case
> is failure to allocate memory for the domain structure).

Is my statement about CPU being added to default group mask
incorrect? Seems like a potential issue related to domain's
CPU mask also.

Please see my earlier question. Existing code does not proceed
with monitor initialization if control initialization fails and
undoes control initialization if monitor initialization fails. 

> May be something to tackle in a future series if anyone
> thinks this is a serious problem and has suggestions on
> what to do. It seems like a catastrophic problem to not
> have some CPUs in some/all domains of some resources.
> Maybe this should disable mounting resctrl filesystem
> completely?

It is not clear to me how this is catastrophic but I
do not think resctrl should claim support for a resource
on a CPU if it was not able to complete initialization

Reinette