linux-kernel - Re: [RFD] resctrl: reassigning a running container's CTRL

Message-ID: <384cd4c1-8b8f-93f8-1756-6e5ccf1752f5@arm.com>
Date:   Tue, 25 Oct 2022 16:56:01 +0100
From:   James Morse <james.morse@....com>
To:     Peter Newman <peternewman@...gle.com>
Cc:     Reinette Chatre <reinette.chatre@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        "Eranian, Stephane" <eranian@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Babu Moger <Babu.Moger@....com>,
        Gaurang Upasani <gupasani@...gle.com>
Subject: Re: [RFD] resctrl: reassigning a running container's CTRL_MON group

Hi Peter,

On 20/10/2022 11:39, Peter Newman wrote:
> On Wed, Oct 19, 2022 at 3:58 PM James Morse <james.morse@....com> wrote:
>> This isn't how MPAM is designed to be used. You'll hit nasty corners.
>> The big one is the Cache Storage Utilisation counters.
>>
>> See 11.5.2 of the MPAM spec, "MSMON_CFG_CSU_CTL, MPAM Memory System Monitor Configure
>> Cache Storage Usage Monitor Control Register". Not setting the MATCH_PARTID bit has this
>> warning:
>> | If MATCH_PMG is 1 and MATCH_PARTID is 0, it is CONSTRAINED UNPREDICTABLE whether the
>> | monitor instance:
>> | • Measures the storage used with matching PMG and with any PARTID.
>> | • Measures no storage usage, that is, MSMON_CSU.VALUE is zero.
>> | • Measures the storage used with matching PMG and PARTID, that is, treats
>> | MATCH_PARTID as = 1
>>
>> 'constrained unpredictable' is arm's term for "portable software can't rely on this".
>> The folk that designed MPAM don't believe "monitors would only match on PMGs" makes any
>> sense. A PMG is not an RMID. A case in point is the system with only 1 PMG bit.
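
A minimal sketch of how software might guard against the combination quoted
above. The bit positions and both function names are assumptions made for
illustration, not taken from the spec text quoted here or from any driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; check the MPAM spec for the real layout. */
#define CSU_CTL_MATCH_PARTID	(UINT32_C(1) << 16)
#define CSU_CTL_MATCH_PMG	(UINT32_C(1) << 17)

/* True if the filter bits avoid the CONSTRAINED UNPREDICTABLE combination. */
static bool csu_ctl_filter_is_portable(uint32_t ctl)
{
	bool match_partid = (ctl & CSU_CTL_MATCH_PARTID) != 0;
	bool match_pmg = (ctl & CSU_CTL_MATCH_PMG) != 0;

	/* MATCH_PMG == 1 with MATCH_PARTID == 0 is the case the spec warns about. */
	return !(match_pmg && !match_partid);
}

/* Hypothetical helper: always match on PARTID, optionally also on PMG. */
static uint32_t csu_ctl_make_filter(bool match_pmg)
{
	uint32_t ctl = CSU_CTL_MATCH_PARTID;

	if (match_pmg)
		ctl |= CSU_CTL_MATCH_PMG;

	assert(csu_ctl_filter_is_portable(ctl));
	return ctl;
}
```
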
>>
>> I'm afraid this approach would preclude support for the llc_occupancy counter, and would
>> artificially reduce the number of control groups that can be created as each control group
>> needs an 'RMID'. On the machine with 1 PMG bit - you get 2 control groups, even though it
>> has many more PARTID.
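
The arithmetic behind that last point, as a rough sketch. The function names
are hypothetical, and the PARTID count used in the note below is an assumed
figure, since the text only says "many more":

```c
#include <assert.h>

/* If a PMG were used like an RMID, every control group would need its own PMG. */
static unsigned long ctrl_groups_pmg_as_rmid(unsigned int pmg_bits)
{
	assert(pmg_bits < 8 * sizeof(unsigned long));
	return 1UL << pmg_bits;
}

/* With the PARTID-PMG hierarchy, each PARTID can back one control group... */
static unsigned long ctrl_groups_partid_hierarchy(unsigned long num_partid)
{
	return num_partid;
}

/* ...and each control group can then use this many PMG values for monitoring. */
static unsigned long pmg_per_ctrl_group(unsigned int pmg_bits)
{
	assert(pmg_bits < 8 * sizeof(unsigned long));
	return 1UL << pmg_bits;
}
```

With 1 PMG bit and, say, 64 PARTIDs, the first scheme gives 2 control groups,
while the hierarchy gives 64 control groups with 2 PMG values each.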
> 
> The first sentence of the Resource Monitoring chapter is also quite an
> obstacle to my challenging the PARTID-PMG hierarchy:
> 
> | Software environments may be labeled as belonging to a Performance
> | Monitoring Group (PMG) within a partition.
> 
> It seems like the only real issue is that the user is responsible for
> figuring out how best to make use of the available resources. But I seem
> to recall that was the expectation with resctrl, so I should probably
> stop trying to argue for expecting MPAM configurations which resemble
> RDT.
> 
> 
>> On 17/10/2022 11:15, Peter Newman wrote:
>>> Provided that there are sufficient monitor
>>> instances, there would never be any need to reprogram a monitor's
>>> PMG.
>>
>> It sounds like this moves the problem to "make everything a monitor group because only
>> monitor groups can be batch moved".
>>
>> If the tasks file could be moved between control and monitor groups, causing resctrl to
>> relabel the tasks - would that solve more of the problem? (it eliminates the need to make
>> everything a monitor group)
> 
> This was about preserving the RMID and memory bandwidth counts across a
> CLOSID change. If the user is forced to conserve CTRL_MON groups due to
> a limited number of CLOSIDs, keeping the various containers' tasks
> separate is also a concern.

Ah, of course.


> But if there's no need to conserve CTRL_MON groups, then there's no real
> issue.

Yup. I think part of this is exposing the information user-space needs to make the right
decision.

I don't think we should merge 'task group moving' and 'old monitors keep counting', they
each make sense independently.


>> The devil is in the detail, I'm not sure how it serialises with a fork()ing process, I'd
>> hope to do better than relying on the kernel walking the list of processes a lot quicker
>> than user-space can.
> 
> I wasn't planning to do it any more optimally than the rmdir
> implementation today when looking for all tasks impacted by a
> CLOSID/RMID deletion.

Aha - that is the use of for_each_process_thread() under the tasklist read-lock, instead of
relying on RCU, so it should be safe against processes fork()ing and exit()ing.
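
For reference, the shape of such a walk as a user-space model. This is only a
sketch: struct task, the list, and the lock stand-ins are all hypothetical
simplifications, not the kernel's data structures, and it models only the walk
itself, not fork() or exit():

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the per-task resctrl labels. */
struct task {
	unsigned int closid;
	unsigned int rmid;
	struct task *next;
};

static struct task *task_list;	/* stand-in for the kernel's list of tasks */
static int tasklist_readers;	/* stand-in for the read side of the lock */

static void tasklist_read_lock(void)
{
	tasklist_readers++;
}

static void tasklist_read_unlock(void)
{
	assert(tasklist_readers > 0);
	tasklist_readers--;
}

/*
 * Relabel every task currently in (from_closid, from_rmid). The whole walk
 * happens with the read side held, so the list cannot change underneath it.
 * Returns the number of tasks relabelled.
 */
static unsigned int relabel_tasks(unsigned int from_closid, unsigned int from_rmid,
				  unsigned int to_closid, unsigned int to_rmid)
{
	unsigned int moved = 0;
	struct task *t;

	tasklist_read_lock();
	for (t = task_list; t; t = t->next) {
		if (t->closid == from_closid && t->rmid == from_rmid) {
			t->closid = to_closid;
			t->rmid = to_rmid;
			moved++;
		}
	}
	tasklist_read_unlock();

	return moved;
}
```
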


Thanks,

James