linux-kernel - Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0d9775a3-8b5d-d30b-c473-b5364a4111fa@amd.com>
Date:   Mon, 20 Mar 2023 10:15:40 -0500
From:   "Moger, Babu" <babu.moger@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>,
        "corbet@....net" <corbet@....net>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>
Cc:     "fenghua.yu@...el.com" <fenghua.yu@...el.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
        "paulmck@...nel.org" <paulmck@...nel.org>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "quic_neeraju@...cinc.com" <quic_neeraju@...cinc.com>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        "damien.lemoal@...nsource.wdc.com" <damien.lemoal@...nsource.wdc.com>,
        "songmuchun@...edance.com" <songmuchun@...edance.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "jpoimboe@...nel.org" <jpoimboe@...nel.org>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "chang.seok.bae@...el.com" <chang.seok.bae@...el.com>,
        "pawan.kumar.gupta@...ux.intel.com" 
        <pawan.kumar.gupta@...ux.intel.com>,
        "jmattson@...gle.com" <jmattson@...gle.com>,
        "daniel.sneddon@...ux.intel.com" <daniel.sneddon@...ux.intel.com>,
        "Das1, Sandipan" <Sandipan.Das@....com>,
        "tony.luck@...el.com" <tony.luck@...el.com>,
        "james.morse@....com" <james.morse@....com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "bagasdotme@...il.com" <bagasdotme@...il.com>,
        "eranian@...gle.com" <eranian@...gle.com>,
        "christophe.leroy@...roup.eu" <christophe.leroy@...roup.eu>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "adrian.hunter@...el.com" <adrian.hunter@...el.com>,
        "quic_jiles@...cinc.com" <quic_jiles@...cinc.com>,
        "peternewman@...gle.com" <peternewman@...gle.com>
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl
 group at once



On 3/20/23 10:07, Moger, Babu wrote:
> Hi Reinette,
> 
> On 3/16/23 15:33, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/16/2023 12:51 PM, Moger, Babu wrote:
>>> On 3/16/23 12:12, Reinette Chatre wrote:
>>>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>>>> -----Original Message-----
>>>>>> From: Reinette Chatre <reinette.chatre@...el.com>
>>>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>>>> To: Moger, Babu <Babu.Moger@....com>; corbet@....net;
>>>>>> tglx@...utronix.de; mingo@...hat.com; bp@...en8.de
>>>>>> Cc: fenghua.yu@...el.com; dave.hansen@...ux.intel.com; x86@...nel.org;
>>>>>> hpa@...or.com; paulmck@...nel.org; akpm@...ux-foundation.org;
>>>>>> quic_neeraju@...cinc.com; rdunlap@...radead.org;
>>>>>> damien.lemoal@...nsource.wdc.com; songmuchun@...edance.com;
>>>>>> peterz@...radead.org; jpoimboe@...nel.org; pbonzini@...hat.com;
>>>>>> chang.seok.bae@...el.com; pawan.kumar.gupta@...ux.intel.com;
>>>>>> jmattson@...gle.com; daniel.sneddon@...ux.intel.com; Das1, Sandipan
>>>>>> <Sandipan.Das@....com>; tony.luck@...el.com; james.morse@....com;
>>>>>> linux-doc@...r.kernel.org; linux-kernel@...r.kernel.org;
>>>>>> bagasdotme@...il.com; eranian@...gle.com; christophe.leroy@...roup.eu;
>>>>>> jarkko@...nel.org; adrian.hunter@...el.com; quic_jiles@...cinc.com;
>>>>>> peternewman@...gle.com
>>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>>>> at once
>>>>>>
>>>>>> Hi Babu,
>>>>>>
>>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>>>> done one at a time. For example:
>>>>>>>
>>>>>>>   $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>>>   $mkdir /sys/fs/resctrl/clos1
>>>>>>>   $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>>>>   $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>>>>   $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>>>
>>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>>>> there is a syscall overhead for each command executed from user space.
>>>>>>
>>>>>> To support this change it may also be helpful to add that moving tasks take the
>>>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>>>> performance gain.
>>>>>
>>>>> Agree. It may not be significant performance gain.  Will remove this line. 
>>>>
>>>> It does not sound as though you are actually responding to my comment.
>>>
>>> I am confused. I am already saying there is syscall overhead for each
>>> command if we move the tasks one by one. Now do you want me to add "moving
>>> tasks take the mutex so attempting to move tasks in parallel will not
>>> achieve a significant performance gain".
>>>
>>> It is contradictory, So, I wanted to remove the line about performance.
>>> Did I still miss something?
>>
>> Where is the contradiction?
>>
>> Consider your example:
>>    $echo 123 > /sys/fs/resctrl/clos1/tasks
>>    $echo 456 > /sys/fs/resctrl/clos1/tasks
>>    $echo 789 > /sys/fs/resctrl/clos1/tasks
>>
>> Yes, there is syscall overhead for each of the above lines. My statement was in
>> support of this work by stating that a user aiming to improve performance by
>> attempting the above in parallel would not be able to see achieve significant
>> performance gain since the calls would end up being serialized.
> 
> ok. Sure. Will add the text. I may modify little bit.
>>
>> You are providing two motivations (a) "user-friendly when dealing with
>> hundreds of tasks", and (b) syscall overhead. Have you measured the
>> improvement this solution provides?
> 
> No. I have not measured the performance improvement.
> 
>>
>>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>>>>>> kernfs_open_file *of,
>>>>>>>  				    char *buf, size_t nbytes, loff_t off)  {
>>>>>>>  	struct rdtgroup *rdtgrp;
>>>>>>> +	char *pid_str;
>>>>>>>  	int ret = 0;
>>>>>>>  	pid_t pid;
>>>>>>>
>>>>>>> -	if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>>>>>> +	/* Valid input requires a trailing newline */
>>>>>>> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>>>>>  		return -EINVAL;
>>>>>>
>>>>>> The resctrl files should be seen as user space API. With the above change you
>>>>>> take an interface that did not require a newline and dictate that it should have
>>>>>> a trailing newline. How convinced are you that this does not break any current
>>>>>> user space scripts or applications? Why does this feature require a trailing
>>>>>> newline?
>>>>>
>>>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. 
>>>>> We are already doing newline check for few other inputs.
>>>>
>>>> You tested this with the _one_ user space tool that you use. This is not sufficient
>>>> to be convincing that this change has no impact. I do not believe that it is a valid
>>>> argument that other inputs do a newline check. This input never required a newline
>>>> check and it is not clear why this change now requires it. It seems that this is an
>>>> unnecessary new requirement that runs the risk of breaking an existing application.
>>>>
>>>> I would like to ask again: How convinced are you that this does not break _any_ current
>>>> user space scripts or applications? Why does this feature require a trailing
>>>> newline?
> 
> I dont think this feature required trailing newline. I may have carried
> away from similar code in the area. I will remove that requirement.
> 
>>>
>>> I do not know of any other tool using resctrl fs.
>>> So, you want me to drop the newline requirement for this. I can try that.
>>> Will let you know how it goes.
>>
>> You continue to avoid my question about why this requires a newline. Until
>> I learn why this is required, yes, from what I can tell based on this patch 
>> this requirement can and should be dropped.
>>
>>>>>>> +
>>>>>>> +	buf[nbytes - 1] = '\0';
>>>>>>> +
>>>>>>>  	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>>>>  	if (!rdtgrp) {
>>>>>>>  		rdtgroup_kn_unlock(of->kn);
>>>>>>>  		return -ENOENT;
>>>>>>>  	}
>>>>>>> +
>>>>>>> +next:
>>>>>>> +	if (!buf || buf[0] == '\0')
>>>>>>> +		goto unlock;
>>>>>>> +
>>>>>>> +	pid_str = strim(strsep(&buf, ","));
>>>>>>> +
>>>>>>
>>>>>> Could lib/cmdline.c:get_option() be useful?
>>>>>
>>>>> Yes. We could that also. May not be required for the simple case like this.
>>>>
>>>> Please keep an eye out for how much of it you end up duplicating ....
>>>
>>> Using the get_options will require at least two calls(one to get the
>>> length and then read the integers). Also need to allocate the integers
>>> array dynamically. That is lot code if we are going that route.
>>>
>>
>> I did not ask about get_options(), I asked about get_option().
> 
> If you insist, will use get_option. But we still have to loop thru all the
> string till get_option returns 0. I can try that.
> 
>>
>>>>
>>>>>>> +		ret = -EINVAL;
>>>>>>> +		goto unlock;
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	rdt_last_cmd_clear();
>>>>>>>
>>>>>>>  	if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>>>>>> +721,10 @@
>>>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>>>>>>  	}
>>>>>>>
>>>>>>>  	ret = rdtgroup_move_task(pid, rdtgrp, of);
>>>>>>> +	if (ret)
>>>>>>> +		goto unlock;
>>>>>>> +	else
>>>>>>> +		goto next;
>>>>>>>
>>>>>>
>>>>>> The documentation states "The failure details will be logged in
>>>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>>>>>> From what I can tell this implementation does not do anything beyond what
>>>>>> last_cmd_status already does so any special mention in the docs is not clear to
>>>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>>>>>> sounded as though last_cmd_status would contain the error with the pid that
>>>>>> encountered the error but I do not see this happening here.
>>>>>
>>>>> You are right we are not doing anything special here. pid failures error was already there.
>>>>> I will have to change the text here.
>>>>
>>>> What do you mean with "pid failures error was already there"? From what
>>>> I understand your goal is to communicate to the user which pid
>>>> encountered the error and I do not see that done. How will user know
>>>> which pid encountered a failure?
>>>
>>> We only have couple of failures to take here. Those failures are already
>>> handled by rdtgroup_move_task. It logs the pid for failure(using
>>> rdt_last_cmd_printf).
>>
>> The pid is only logged if there is no task with that pid. How about the
>> error in __rdtgroup_move_task() - how will the user know which pid triggered
>> that error?
> 
> Yes. These cases we may be able to report the pid.

I meant "we may not"
> 
>>
>>>
>>> I can say "The failure pid will be logged in
>>> /sys/fs/resctrl/info/last_cmd_status file."
>>
>> That will not be accurate. Not all errors include the pid.
> 
> Can you please suggest?
> Thanks
> Babu Moger

-- 
Thanks
Babu Moger