[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8d9792fa-1d81-a5c3-5c45-2f1f86ee5cc9@amd.com>
Date: Mon, 21 Oct 2024 20:12:53 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
corbet@....net, fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v8 25/25] x86/resctrl: Introduce interface to modify
assignment states of the groups
Hi Reinette,
On 10/21/2024 12:20 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 10/21/24 10:04 AM, Moger, Babu wrote:
>> On 10/15/24 22:43, Reinette Chatre wrote:
>>> On 10/9/24 10:39 AM, Babu Moger wrote:
>
>
>>>> +static int rdtgroup_process_flags(struct rdt_resource *r,
>>>> + enum rdt_group_type rtype,
>>>> + char *p_grp, char *c_grp, char *tok)
>>>> +{
>>>> + int op, mon_state, assign_state, unassign_state;
>>>> + char *dom_str, *id_str, *op_str;
>>>> + struct rdt_mon_domain *d;
>>>> + struct rdtgroup *rdtgrp;
>>>> + unsigned long dom_id;
>>>> + int ret, found = 0;
>>>> +
>>>> + rdtgrp = rdtgroup_find_grp_by_name(rtype, p_grp, c_grp);
>>>> +
>>>> + if (!rdtgrp) {
>>>> + rdt_last_cmd_puts("Not a valid resctrl group\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> +next:
>>>> + if (!tok || tok[0] == '\0')
>>>> + return 0;
>>>> +
>>>> + /* Start processing the strings for each domain */
>>>> + dom_str = strim(strsep(&tok, ";"));
>>>> +
>>>> + op_str = strpbrk(dom_str, "=+-");
>>>> +
>>>> + if (op_str) {
>>>> + op = *op_str;
>>>> + } else {
>>>> + rdt_last_cmd_puts("Missing operation =, +, - character\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + id_str = strsep(&dom_str, "=+-");
>>>> +
>>>> + /* Check for domain id '*' which means all domains */
>>>> + if (id_str && *id_str == '*') {
>>>> + d = NULL;
>>>> + goto check_state;
>>>> + } else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
>>>> + rdt_last_cmd_puts("Missing domain id\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + /* Verify if the dom_id is valid */
>>>> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
>>>> + if (d->hdr.id == dom_id) {
>>>> + found = 1;
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + if (!found) {
>>>> + rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> +check_state:
>>>> + mon_state = rdtgroup_str_to_mon_state(dom_str);
>>>> +
>>>> + if (mon_state == ASSIGN_INVALID) {
>>>> + rdt_last_cmd_puts("Invalid assign flag\n");
>>>> + goto out_fail;
>>>> + }
>>>> +
>>>> + assign_state = 0;
>>>> + unassign_state = 0;
>>>> +
>>>> + switch (op) {
>>>> + case '+':
>>>> + if (mon_state == ASSIGN_NONE) {
>>>> + rdt_last_cmd_puts("Invalid assign opcode\n");
>>>> + goto out_fail;
>>>> + }
>>>> + assign_state = mon_state;
>>>> + break;
>>>> + case '-':
>>>> + if (mon_state == ASSIGN_NONE) {
>>>> + rdt_last_cmd_puts("Invalid assign opcode\n");
>>>> + goto out_fail;
>>>> + }
>>>> + unassign_state = mon_state;
>>>> + break;
>>>> + case '=':
>>>> + assign_state = mon_state;
>>>> + unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
>>>> + break;
>>>> + default:
>>>> + break;
>>>> + }
>>>> +
>>>> + if (unassign_state & ASSIGN_TOTAL) {
>>>> + ret = rdtgroup_unassign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_TOTAL_EVENT_ID);
>>>> + if (ret)
>>>> + goto out_fail;
>>>> + }
>>>> +
>>>> + if (unassign_state & ASSIGN_LOCAL) {
>>>> + ret = rdtgroup_unassign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_LOCAL_EVENT_ID);
>>>> + if (ret)
>>>> + goto out_fail;
>>>> + }
>>>> +
>>>> + if (assign_state & ASSIGN_TOTAL) {
>>>> + ret = rdtgroup_assign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_TOTAL_EVENT_ID);
>>>> + if (ret)
>>>> + goto out_fail;
>>>> + }
>>>> +
>>>> + if (assign_state & ASSIGN_LOCAL) {
>>>> + ret = rdtgroup_assign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_LOCAL_EVENT_ID);
>>>> + if (ret)
>>>> + goto out_fail;
>>>> + }
>>>> +
>>>> + goto next;
>>>> +
>>>> +out_fail:
>>>
>>> Is it possible to print a message to the command status to give some details about which
>>> request failed? I am wondering about a scenario where a user changes multiple domains of
>>> multiple groups, since the operation does not undo changes, it will fail without information
>>> to user space about which setting triggered the failure and which settings succeeded.
>>> This is similar to what is done when user attempts to move several tasks ... the error will
>>> indicate which task triggered failure so that user space knows what completed successfully.
>>
>> Will add something like this on failure.
>>
>> rdt_last_cmd_printf("Total event assign failed on domain %d\n", dom_id);
>
> The user may provide changes for several groups in a single write.
> Could the CTRL_MON and MON group names also be printed? It is not clear
> to me if it will be easier to print the flags the user provides or verbose text
> that the flags translate to, that is "t" vs "Total event".
Yes. We can print generic messages with group names and flags
"Assignment operation +-= failed on resctrl group ABC with flags = lt"
>
>>>> +
>>>> + return -EINVAL;
>>>> +}
>>>> +
>>>> +static ssize_t rdtgroup_mbm_assign_control_write(struct kernfs_open_file *of,
>>>> + char *buf, size_t nbytes, loff_t off)
>>>> +{
>>>> + struct rdt_resource *r = of->kn->parent->priv;
>>>> + char *token, *cmon_grp, *mon_grp;
>>>> + enum rdt_group_type rtype;
>>>> + int ret;
>>>> +
>>>> + /* Valid input requires a trailing newline */
>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>> + return -EINVAL;
>>>> +
>>>> + buf[nbytes - 1] = '\0';
>>>> +
>>>> + cpus_read_lock();
>>>> + mutex_lock(&rdtgroup_mutex);
>>>> +
>>>> + if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>>> + rdt_last_cmd_puts("mbm_cntr_assign mode is not enabled\n");
>>>
>>> Writing to last_cmd_status_buf here ...
>>
>> Sure.
>>
>>>
>>>> + mutex_unlock(&rdtgroup_mutex);
>>>> + cpus_read_unlock();
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + rdt_last_cmd_clear();
>>>
>>> ... but initializing buffer here.
>>> Sidenote: This was an issue before. If you receive comments about
>>> items in patches, please do check if those comments apply to other patches also.
>>
>> Missed it.
>>
>>>
>>>> +
>>>> + while ((token = strsep(&buf, "\n")) != NULL) {
>>>> + if (strstr(token, "/")) {
>
> What is the purpose of this strstr() call?
This is a carry over for v6. Not required. Will remove.
>
>>>> + /*
>>>> + * The write command follows the following format:
>>>> + * “<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>”
>>>> + * Extract the CTRL_MON group.
>>>> + */
>>>> + cmon_grp = strsep(&token, "/");
>>>> +
>>>> + /*
>>>> + * Extract the MON_GROUP.
>>>> + * strsep returns empty string for contiguous delimiters.
>>>> + * Empty mon_grp here means it is a RDTCTRL_GROUP.
>>>> + */
>>>> + mon_grp = strsep(&token, "/");
>>>> +
>>>> + if (*mon_grp == '\0')
>>>> + rtype = RDTCTRL_GROUP;
>>>> + else
>>>> + rtype = RDTMON_GROUP;
>>>> +
>>>> + ret = rdtgroup_process_flags(r, rtype, cmon_grp, mon_grp, token);
>>>> + if (ret)
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + mutex_unlock(&rdtgroup_mutex);
>>>> + cpus_read_unlock();
>>>> +
>>>> + return ret ?: nbytes;
>>>> +}
>>>> +
>>>> #ifdef CONFIG_PROC_CPU_RESCTRL
>>>>
>>>> /*
>>>> @@ -2328,9 +2558,10 @@ static struct rftype res_common_files[] = {
>>>> },
>>>> {
>>>> .name = "mbm_assign_control",
>>>> - .mode = 0444,
>>>> + .mode = 0644,
>>>> .kf_ops = &rdtgroup_kf_single_ops,
>>>> .seq_show = rdtgroup_mbm_assign_control_show,
>>>> + .write = rdtgroup_mbm_assign_control_write,
>>>> },
>>>> {
>>>> .name = "cpus_list",
>>>
>>> On a high level this looks ok but this code needs to be more robust. This will parse
>>> data from user space that may include all kinds of input ... think malicious user or
>>> a buggy script. I am not able to test this code but I tried to work through what will
>>> happen under some wrong input and found some issues. For example, if user space provides
>>> input like '//\n' then rdtgroup_process_flags() will be called with token == NULL. This will
>>> result in rdtgroup_process_flags() returning "success", but fortunately do nothing, for
>>> this invalid input. A more severe example is with input like '//0=\n', from what I can tell
>>> this will result in rdtgroup_str_to_mon_state() called with dom_str==NULL that will treat
>>> this as ASSIGN_NONE and proceed as if user provided '//0=_'.
>>> This was just some scenarios with basic input that could be typos, no real stress tests.
>>> I stopped here though since I believe it is already clear this needs to be more robust.
>>> Please do test this interface by exercising it with invalid input and corner cases.
>>
>> Agree.
>>
>> But, tested the cases you mentioned above. It seems to handle as expected.
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;
>>
>> #echo '//\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> bash: echo: write error: Invalid argument
>>
>> # cat /sys/fs/resctrl/info/last_cmd_status
>> Missing operation =, +, - character
>>
>>
>> #echo '//0=\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> bash: echo: write error: Invalid argument
>>
>> #cat /sys/fs/resctrl/info/last_cmd_status
>> Invalid assign flag
>>
>> #echo '/0=\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> bash: echo: write error: Invalid argument
>> # cat /sys/fs/resctrl/info/last_cmd_status
>> Not a valid resctrl group
>>
>>
>> The assign state did not change.
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;
>>
>> Sure. will test some more combinations to be sure.
>
> hmmm ... these are not quite the examples I shared since from what I can
> tell it adds a second \n that impacts the processing of string.
>
> Could you please try:
> # echo '//' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> and
> # echo '//0=' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
Yes. You are right. Above cases does not work as expected.
This should fix. Will test more.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6095146e3ba4..cccce991d2d0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1044,6 +1044,9 @@ static int rdtgroup_str_to_mon_state(char *flag)
{
int i, mon_state = ASSIGN_NONE;
+ if (!strlen(flag))
+ return ASSIGN_INVALID;
+
for (i = 0; i < strlen(flag); i++) {
switch (*(flag + i)) {
case 't':
Thanks
- Babu Moger
Powered by blists - more mailing lists