lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9f3ab90b-0f35-4dcc-9996-4d6e941cbf2e@intel.com>
Date: Mon, 21 Oct 2024 10:20:09 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
	<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
	<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
	<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
	<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
	<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
	<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
	<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
	<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
	<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [PATCH v8 25/25] x86/resctrl: Introduce interface to modify
 assignment states of the groups

Hi Babu,

On 10/21/24 10:04 AM, Moger, Babu wrote:
> On 10/15/24 22:43, Reinette Chatre wrote:
>> On 10/9/24 10:39 AM, Babu Moger wrote:


>>> +static int rdtgroup_process_flags(struct rdt_resource *r,
>>> +				  enum rdt_group_type rtype,
>>> +				  char *p_grp, char *c_grp, char *tok)
>>> +{
>>> +	int op, mon_state, assign_state, unassign_state;
>>> +	char *dom_str, *id_str, *op_str;
>>> +	struct rdt_mon_domain *d;
>>> +	struct rdtgroup *rdtgrp;
>>> +	unsigned long dom_id;
>>> +	int ret, found = 0;
>>> +
>>> +	rdtgrp = rdtgroup_find_grp_by_name(rtype, p_grp, c_grp);
>>> +
>>> +	if (!rdtgrp) {
>>> +		rdt_last_cmd_puts("Not a valid resctrl group\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +next:
>>> +	if (!tok || tok[0] == '\0')
>>> +		return 0;
>>> +
>>> +	/* Start processing the strings for each domain */
>>> +	dom_str = strim(strsep(&tok, ";"));
>>> +
>>> +	op_str = strpbrk(dom_str, "=+-");
>>> +
>>> +	if (op_str) {
>>> +		op = *op_str;
>>> +	} else {
>>> +		rdt_last_cmd_puts("Missing operation =, +, - character\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	id_str = strsep(&dom_str, "=+-");
>>> +
>>> +	/* Check for domain id '*' which means all domains */
>>> +	if (id_str && *id_str == '*') {
>>> +		d = NULL;
>>> +		goto check_state;
>>> +	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
>>> +		rdt_last_cmd_puts("Missing domain id\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	/* Verify if the dom_id is valid */
>>> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
>>> +		if (d->hdr.id == dom_id) {
>>> +			found = 1;
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	if (!found) {
>>> +		rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +check_state:
>>> +	mon_state = rdtgroup_str_to_mon_state(dom_str);
>>> +
>>> +	if (mon_state == ASSIGN_INVALID) {
>>> +		rdt_last_cmd_puts("Invalid assign flag\n");
>>> +		goto out_fail;
>>> +	}
>>> +
>>> +	assign_state = 0;
>>> +	unassign_state = 0;
>>> +
>>> +	switch (op) {
>>> +	case '+':
>>> +		if (mon_state == ASSIGN_NONE) {
>>> +			rdt_last_cmd_puts("Invalid assign opcode\n");
>>> +			goto out_fail;
>>> +		}
>>> +		assign_state = mon_state;
>>> +		break;
>>> +	case '-':
>>> +		if (mon_state == ASSIGN_NONE) {
>>> +			rdt_last_cmd_puts("Invalid assign opcode\n");
>>> +			goto out_fail;
>>> +		}
>>> +		unassign_state = mon_state;
>>> +		break;
>>> +	case '=':
>>> +		assign_state = mon_state;
>>> +		unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
>>> +		break;
>>> +	default:
>>> +		break;
>>> +	}
>>> +
>>> +	if (unassign_state & ASSIGN_TOTAL) {
>>> +		ret = rdtgroup_unassign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +		if (ret)
>>> +			goto out_fail;
>>> +	}
>>> +
>>> +	if (unassign_state & ASSIGN_LOCAL) {
>>> +		ret = rdtgroup_unassign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +		if (ret)
>>> +			goto out_fail;
>>> +	}
>>> +
>>> +	if (assign_state & ASSIGN_TOTAL) {
>>> +		ret = rdtgroup_assign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +		if (ret)
>>> +			goto out_fail;
>>> +	}
>>> +
>>> +	if (assign_state & ASSIGN_LOCAL) {
>>> +		ret = rdtgroup_assign_cntr_event(r, rdtgrp, d, QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +		if (ret)
>>> +			goto out_fail;
>>> +	}
>>> +
>>> +	goto next;
>>> +
>>> +out_fail:
>>
>> Is it possible to print a message to the command status to give some details about which
>> request failed? I am wondering about a scenario where a user changes multiple domains of
>> multiple groups, since the operation does not undo changes, it will fail without information
>> to user space about which setting triggered the failure and which settings succeeded.
>> This is similar to what is done when user attempts to move several tasks ... the error will
>> indicate which task triggered failure so that user space knows what completed successfully.
> 
> Will add something like this on failure.
> 
> rdt_last_cmd_printf("Total event assign failed on domain %d\n", dom_id);

The user may provide changes for several groups in a single write.
Could the CTRL_MON and MON group names also be printed? It is not clear
to me if it will be easier to print the flags the user provides or verbose text
that the flags translate to, that is "t" vs "Total event".

>>> +
>>> +	return -EINVAL;
>>> +}
>>> +
>>> +static ssize_t rdtgroup_mbm_assign_control_write(struct kernfs_open_file *of,
>>> +						 char *buf, size_t nbytes, loff_t off)
>>> +{
>>> +	struct rdt_resource *r = of->kn->parent->priv;
>>> +	char *token, *cmon_grp, *mon_grp;
>>> +	enum rdt_group_type rtype;
>>> +	int ret;
>>> +
>>> +	/* Valid input requires a trailing newline */
>>> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>> +		return -EINVAL;
>>> +
>>> +	buf[nbytes - 1] = '\0';
>>> +
>>> +	cpus_read_lock();
>>> +	mutex_lock(&rdtgroup_mutex);
>>> +
>>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>> +		rdt_last_cmd_puts("mbm_cntr_assign mode is not enabled\n");
>>
>> Writing to last_cmd_status_buf here ...
> 
> Sure.
> 
>>
>>> +		mutex_unlock(&rdtgroup_mutex);
>>> +		cpus_read_unlock();
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	rdt_last_cmd_clear();
>>
>> ... but initializing buffer here. 
>> Sidenote: This was an issue before. If you receive comments about
>> items in patches, please do check if those comments apply to other patches also.
> 
> Missed it.
> 
>>
>>> +
>>> +	while ((token = strsep(&buf, "\n")) != NULL) {
>>> +		if (strstr(token, "/")) {

What is the purpose of this strstr() call?

>>> +			/*
>>> +			 * The write command follows the following format:
>>> +			 * “<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>”
>>> +			 * Extract the CTRL_MON group.
>>> +			 */
>>> +			cmon_grp = strsep(&token, "/");
>>> +
>>> +			/*
>>> +			 * Extract the MON_GROUP.
>>> +			 * strsep returns empty string for contiguous delimiters.
>>> +			 * Empty mon_grp here means it is a RDTCTRL_GROUP.
>>> +			 */
>>> +			mon_grp = strsep(&token, "/");
>>> +
>>> +			if (*mon_grp == '\0')
>>> +				rtype = RDTCTRL_GROUP;
>>> +			else
>>> +				rtype = RDTMON_GROUP;
>>> +
>>> +			ret = rdtgroup_process_flags(r, rtype, cmon_grp, mon_grp, token);
>>> +			if (ret)
>>> +				break;
>>> +		}
>>> +	}
>>> +
>>> +	mutex_unlock(&rdtgroup_mutex);
>>> +	cpus_read_unlock();
>>> +
>>> +	return ret ?: nbytes;
>>> +}
>>> +
>>>  #ifdef CONFIG_PROC_CPU_RESCTRL
>>>  
>>>  /*
>>> @@ -2328,9 +2558,10 @@ static struct rftype res_common_files[] = {
>>>  	},
>>>  	{
>>>  		.name		= "mbm_assign_control",
>>> -		.mode		= 0444,
>>> +		.mode		= 0644,
>>>  		.kf_ops		= &rdtgroup_kf_single_ops,
>>>  		.seq_show	= rdtgroup_mbm_assign_control_show,
>>> +		.write		= rdtgroup_mbm_assign_control_write,
>>>  	},
>>>  	{
>>>  		.name		= "cpus_list",
>>
>> On a high level this looks ok but this code needs to be more robust. This will parse
>> data from user space that may include all kinds of input ... think malicious user or
>> a buggy script. I am not able to test this code but I tried to work through what will
>> happen under some wrong input and found some issues. For example, if user space provides
>> input like '//\n' then rdtgroup_process_flags() will be called with token == NULL. This will
>> result in rdtgroup_process_flags() returning "success", but fortunately do nothing, for
>> this invalid input. A more severe example is with input like '//0=\n', from what I can tell
>> this will result in rdtgroup_str_to_mon_state() called with dom_str==NULL that will treat
>> this as ASSIGN_NONE and proceed as if user provided '//0=_'.
>> This was just some scenarios with basic input that could be typos, no real stress tests.
>> I stopped here though since I believe it is already clear this needs to be more robust.
>> Please do test this interface by exercising it with invalid input and corner cases.
> 
> Agree.
> 
> But, tested the cases you mentioned above. It seems to handle as expected.
> 
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;
> 
> #echo '//\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> bash: echo: write error: Invalid argument
> 
> # cat /sys/fs/resctrl/info/last_cmd_status
> Missing operation =, +, - character
> 
> 
> #echo '//0=\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> bash: echo: write error: Invalid argument
> 
> #cat /sys/fs/resctrl/info/last_cmd_status
> Invalid assign flag
> 
> #echo '/0=\n' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> bash: echo: write error: Invalid argument
> # cat /sys/fs/resctrl/info/last_cmd_status
> Not a valid resctrl group
> 
> 
> The assign state did not change.
> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;
> 
> Sure. will test some more combinations to be sure.

hmmm ... these are not quite the examples I shared since from what I can
tell it adds a second \n that impacts the processing of string.

Could you please try:
# echo '//' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
and 
# echo '//0=' > /sys/fs/resctrl/info/L3_MON/mbm_assign_control

Reinette



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ