Message-ID: <f8073d8c-7dd0-4e8d-a196-183acef13d66@intel.com>
Date: Wed, 18 Dec 2024 14:01:13 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Moger, Babu" <bmoger@....com>, "Luck, Tony" <tony.luck@...el.com>, "Babu
Moger" <babu.moger@....com>
CC: "corbet@....net" <corbet@....net>, "tglx@...utronix.de"
<tglx@...utronix.de>, "mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de"
<bp@...en8.de>, "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"peternewman@...gle.com" <peternewman@...gle.com>, "Yu, Fenghua"
<fenghua.yu@...el.com>, "x86@...nel.org" <x86@...nel.org>, "hpa@...or.com"
<hpa@...or.com>, "paulmck@...nel.org" <paulmck@...nel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "thuth@...hat.com"
<thuth@...hat.com>, "rostedt@...dmis.org" <rostedt@...dmis.org>,
"xiongwei.song@...driver.com" <xiongwei.song@...driver.com>,
"pawan.kumar.gupta@...ux.intel.com" <pawan.kumar.gupta@...ux.intel.com>,
"daniel.sneddon@...ux.intel.com" <daniel.sneddon@...ux.intel.com>,
"jpoimboe@...nel.org" <jpoimboe@...nel.org>, "perry.yuan@....com"
<perry.yuan@....com>, "Huang, Kai" <kai.huang@...el.com>, "Li, Xiaoyao"
<xiaoyao.li@...el.com>, "seanjc@...gle.com" <seanjc@...gle.com>, "Li, Xin3"
<xin3.li@...el.com>, "andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>,
"ebiggers@...gle.com" <ebiggers@...gle.com>, "mario.limonciello@....com"
<mario.limonciello@....com>, "james.morse@....com" <james.morse@....com>,
"tan.shaopeng@...itsu.com" <tan.shaopeng@...itsu.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Wieczor-Retman, Maciej" <maciej.wieczor-retman@...el.com>, "Eranian,
Stephane" <eranian@...gle.com>
Subject: Re: [PATCH v10 16/24] x86/resctrl: Add interface to the assign
counter
On 12/13/24 8:54 AM, Moger, Babu wrote:
> On 12/13/2024 10:24 AM, Luck, Tony wrote:
>>> It is right thing to continue assignment if one of the domain is out of
>>> counters. In that case how about we save the error(say error_domain) and
>>> continue. And finally return success if both ret and error_domain are zeros.
>>>
>>> return ret ? ret : error_domain;
>>
>> If there are many domains, then you might have 3 succeed and 5 fail.
>>
>> I think the best you can do is return success if everything succeeded
>> and an error if any failed.
>
> Yes. The above check should take care of this case.
>
If I understand correctly, "error_domain" can capture the ID of only
a single failing domain. If multiple domains fail, as in Tony's
example, "error_domain" will not be accurate and thus can never be
trusted. Instead of checking a single failure, user space is then
forced to parse the more complex "mbm_assign_control" file to learn
what succeeded and what failed.

Would it not be simpler to process sequentially and fail on the first
error encountered, with a detailed error message? With that, user
space can determine exactly which portion of the request succeeded
and which portion failed.
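
To make the suggestion concrete, here is a minimal user-space sketch of
"process sequentially, fail on first error, report which domain". It is
not the resctrl implementation: assign_counter(), assign_all(),
free_counters[] and status_msg[] are hypothetical names, the -ENOSPC
return value is an assumption, and status_msg[] only stands in for a
detailed error message of the kind user space would read back.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define NUM_DOMAINS 4

/* Hypothetical pool of free counters per domain (illustration only). */
static int free_counters[NUM_DOMAINS];

/* Hypothetical buffer standing in for a detailed error message. */
static char status_msg[128];

/* Take one counter from a domain; -ENOSPC (assumed) if none is left. */
static int assign_counter(int dom_id)
{
	assert(dom_id >= 0 && dom_id < NUM_DOMAINS);

	if (free_counters[dom_id] == 0)
		return -ENOSPC;

	free_counters[dom_id]--;
	return 0;
}

/*
 * Assign a counter in each requested domain, in order. Stop at the
 * first failure and name the failing domain. Every domain before it
 * in dom_ids[] succeeded; that domain and the ones after it were not
 * changed.
 */
static int assign_all(const int *dom_ids, int n)
{
	status_msg[0] = '\0';

	for (int i = 0; i < n; i++) {
		int ret = assign_counter(dom_ids[i]);

		if (ret) {
			snprintf(status_msg, sizeof(status_msg),
				 "Out of counters in domain %d\n",
				 dom_ids[i]);
			return ret;
		}
	}

	return 0;
}
```

With the return value and the domain named in the message, the caller
knows where processing stopped without reading back the full
assignment state.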
>>
>> You have the same issue if someone tries to update multiple things
>> with a single write to mbm_assign_control:
>>
>> # cat > mbm_assign_control << EOF
>> c1/m78/0=t;1=l;
>> c1/m79/0=t;1=l
>> c1/m80/0=t;1=l;
>> c1/m81/0=t;1=l;
>> EOF
>>
>> Those get processed in order, some may succeed, but once a domain
>> is out of counters the rest for that domain will fail.
>
> Yes. I see the similar type of processing for schemata.
> It is processed sequentially. If one fails, it returns immediately.
>
> ret = rdtgroup_parse_resource(resname, tok, rdtgrp);
> if (ret)
>     goto out;
>
> I feel it is ok to keep same level of processing.
>
resctrl also processes sequentially when, for example, the user
requests a move of several tasks. resctrl fails right away with an
error message containing the failing PID. This tells the user clearly
which portion of the request succeeded, without requiring user space
to do additional queries.
Reinette