[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d9ec29b7-0efe-b9f1-ad2b-95156f6ccdf0@amd.com>
Date: Thu, 5 Jan 2023 13:59:17 -0600
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de
Cc: fenghua.yu@...el.com, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, paulmck@...nel.org, akpm@...ux-foundation.org,
quic_neeraju@...cinc.com, rdunlap@...radead.org,
damien.lemoal@...nsource.wdc.com, songmuchun@...edance.com,
peterz@...radead.org, jpoimboe@...nel.org, pbonzini@...hat.com,
chang.seok.bae@...el.com, pawan.kumar.gupta@...ux.intel.com,
jmattson@...gle.com, daniel.sneddon@...ux.intel.com,
sandipan.das@....com, tony.luck@...el.com, james.morse@....com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
bagasdotme@...il.com, eranian@...gle.com,
christophe.leroy@...roup.eu, jarkko@...nel.org,
adrian.hunter@...el.com, quic_jiles@...cinc.com,
peternewman@...gle.com
Subject: Re: [PATCH v10 11/13] x86/resctrl: Add interface to write
mbm_total_bytes_config
Hi Reinette,
On 1/5/23 11:49, Reinette Chatre wrote:
> Hi Babu,
>
> On 1/5/2023 8:04 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 1/4/23 18:29, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 12/22/2022 3:31 PM, Babu Moger wrote:
>>>
>>> ...
>>>
>>>> +static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of,
>>>> + char *buf, size_t nbytes,
>>>> + loff_t off)
>>>> +{
>>>> + struct rdt_resource *r = of->kn->parent->priv;
>>>> + int ret;
>>>> +
>>>> + /* Valid input requires a trailing newline */
>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>> + return -EINVAL;
>>>> +
>>>> + cpus_read_lock();
>>> Could you please elaborate why this lock is needed here as
>>> well as in the following patch?
>> Holding the cpus_read_lock() make sure that this cpu is online while doing
>> this operation. This code eventually sends an IPI to write the MSR on one
>> of the CPUs using the cpumasks. My understanding is to make sure cpumask
>> is stable while handling this write. Same thing is done in
> This flow uses smp_call_function_any() to send the IPI and update the MSR.
> smp_call_function_any() itself disables preemption to protect against
> CPUs going offline while attempting the update.
>
> The domain's cpumask itself cannot change during this flow because rdtgroup_mutex
> is held the entire time. This mutex is needed by the resctrl CPU online/offline
> callbacks that may update the mask.
Ok
>
>> rdtgroup_schemata_write.
> Yes, rdtgroup_schemata_write uses this but please take a look at _why_ it
> is using it. This was something added later as part of the pseudo-locking
> code. Please see the commit message for the details that explain the usage:
> 80b71c340f17 ("x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence")
Ok. That make sense.
>
> Could you please provide more detail if you still find that this lock is needed?
> If you prefer to refer to existing code flows there are other examples
> in resctrl where the domain's CPU mask is used to read/write registers without the
> hotplug lock that you can use for reference:
> * Even in this patch series itself, reading of the config.
> * When creating a new resource group (the mkdir flow) the MSRs are written with an
> initial config without hotplug lock.
> * When writing to the tasks file the CPU on which task may be running receives IPI
> without hotplug lock held the entire time.
> * See resctrl flow of monitoring data reads.
>
> Alternatively you may want to take a closer look at where the hotplug lock _is_ held in
> resctrl to consider if those usages match this work. Understanding
> why the hotplug lock is currently used should be clear with the commits associated
> with their introduction because there has been a few bugs surrounding this.
This is overlook from my side looking at the code in rdtgroup_schemata_write.
I was concerned about domains cpu_mask being changed while doing the IPI.
Looks like that is safe with rdtgroup_mutex.
I will remove the cpus_read_lock() from here and following patch. Will run
few tests. Will send the updated patches soon.
Thanks for the explanation.
Thanks
Babu
Powered by blists - more mailing lists