lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ab2f780c-72aa-2415-7167-e23c27f9a194@intel.com>
Date:   Mon, 19 Sep 2022 14:07:15 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <bp@...en8.de>
CC:     <fenghua.yu@...el.com>, <dave.hansen@...ux.intel.com>,
        <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
        <akpm@...ux-foundation.org>, <quic_neeraju@...cinc.com>,
        <rdunlap@...radead.org>, <damien.lemoal@...nsource.wdc.com>,
        <songmuchun@...edance.com>, <peterz@...radead.org>,
        <jpoimboe@...nel.org>, <pbonzini@...hat.com>,
        <chang.seok.bae@...el.com>, <pawan.kumar.gupta@...ux.intel.com>,
        <jmattson@...gle.com>, <daniel.sneddon@...ux.intel.com>,
        <sandipan.das@....com>, <tony.luck@...el.com>,
        <james.morse@....com>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <bagasdotme@...il.com>,
        <eranian@...gle.com>
Subject: Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to
 read/write event configuration

Hi Babu,

On 9/19/2022 1:26 PM, Moger, Babu wrote:
> On 9/19/22 11:42, Reinette Chatre wrote:
>> On 9/19/2022 8:46 AM, Moger, Babu wrote:
>>> On 9/16/22 10:58, Reinette Chatre wrote:
>>>> On 9/7/2022 11:01 AM, Babu Moger wrote:
>>>>> Add two new sysfs files to read/write the event configuration if
>>>>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
>>>>> supported. The file mbm_local_config is for the configuration
>>>>> of the event mbm_local_bytes and the file mbm_total_config is
>>>>> for the configuration of mbm_total_bytes.
>>>>>
>>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
>>>>>
>>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
>>>>>
>>>> This patch makes the mbm*config files per monitor group. Looking
>>>> ahead at later patches how the configuration is set it is not clear
>>>> to me that this is the right place for these configuration files.
>>>>
>>>> Looking ahead to patch 10 there is neither rmid nor closid within
>>>> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
>>>> the bits indicating what access types needs to be counted. Also
>>>> in patch 10 I understand that the scope of this register is per L3 cache
>>>> domain.
>>> Yes. Scope of  MSR_IA32_EVT_CFG_BASE per L3 domain.
>>>> Considering this, why is the sysfs file associated with each
>>>> monitor group?
>>> Please see the response below.
>>>> For example, consider the following scenario:
>>>> # cd /sys/fs/resctrl
>>>> # mkdir g2
>>>> # mkdir mon_groups/m1
>>>> # mkdir mon_groups/m2
>>>> # find . | grep mbm_local_config
>>>> ./mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_data/mon_L3_01/mbm_local_config
>>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>>> ./g2/mon_data/mon_L3_01/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
>>>>
>>>>
>>>> From what I understand, the following sysfs files are
>>>> associated with cache domain #0 and thus writing to any of these
>>>> files would change the same configuration:
>>>> ./mon_data/mon_L3_00/mbm_local_config
>>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>>>
>>>> Could you please correct me where I am wrong?
>>> For example, we have CPUs 0-7 in domain 0. We have two counters which are
>>> configurable.
>>>
>>> Lets consider same example as your mentioned about.
>>>
>>> g2 is a control group.
>>>
>>> m1 and m2 are monitor group.
>>>
>>> We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth (or
>>> memory bandwidth with required schemata setting).
>>>
>>> We can have mon group m1 with cpus 0-3 to monitor mbm_local_bytes.
>>>
>>> We can have mon group m2 with cpus  4-7 to monitor mbm_total_bytes.
>>>
>>> Each group is independently, monitoring two separate thing. Without having
>> Right, because monitoring, the actual counting of the events, is per monitor
>> group. When a monitor group is created a new RMID is created and when the
>> counter is read it is per-RMID. 
>>
>> The event configuration is independent from the RMID using the counter.
>>
>>> sysfs file (mbm_local_config and mbm_total_config) in each monitor group,
>>> we wont be able to configure the above configuration.
>> I do not understand this reasoning. From what I understand the
>> event configuration is independent from the monitoring group. Thus, changing
>> an event configuration for one monitoring group would impact all
>> monitoring groups using that event counter. This implementation associates
>> an event configuration with each monitoring group and by doing so it
>> implies that it is unique to the monitoring group, but that is not
>> how it works.
> 
> The event configuration is designed per L3 domain. The mon_data is also
> per domain (like mon_L3_00.. mon_L3_01 etc). So, added the event
> configuration file inside each domain. We have all the information inside
> the domain. Thought, that is right place. I am open for suggestions.

It is not clear to me if you are also seeing all the duplication that
accompanies this implementation. As you can see in the example I provided in
https://lore.kernel.org/lkml/13294a8f-e76f-a6a9-284c-67adbc80ec7c@intel.com/,
if I understand the implementation correctly, there will be several
configuration files scattered through resctrl that all configure the same
value. I asked you to correct me where I am wrong but you did not correct me.
Instead you keep repeating that placing the files in the duplicate locations
is convenient. I can see how this is convenient for you but please do consider
that having these duplicate configuration files scattered through resctrl
makes for a very confusing user interface and unexpected behavior. Users
would expect that a configuration associated with a monitor group impacts
that monitor group only - not all monitor groups associated with that
domain.

User API is hard so this does need careful thought. Perhaps the architects
can chime in here.

One option could be:
# cd /sys/fs/resctrl/info/L3_MON
# cat mbm_total_config
0=7f;1=7f
# cat mbm_local_config
0=15;1=15

It would be clear when changing mem_total_config or mbm_local_config that
it would impact all monitoring groups within all resource groups. What do
you think?

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ