Message-ID: <b725e4ca-8602-eb26-9d47-914526621f52@amd.com>
Date: Fri, 3 May 2024 15:44:18 -0500
From: "Moger, Babu" <bmoger@....com>
To: Peter Newman <peternewman@...gle.com>,
 Reinette Chatre <reinette.chatre@...el.com>
Cc: babu.moger@....com, corbet@....net, fenghua.yu@...el.com,
 tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 paulmck@...nel.org, rdunlap@...radead.org, tj@...nel.org,
 peterz@...radead.org, yanjiewtw@...il.com, kim.phillips@....com,
 lukas.bulwahn@...il.com, seanjc@...gle.com, jmattson@...gle.com,
 leitao@...ian.org, jpoimboe@...nel.org, rick.p.edgecombe@...el.com,
 kirill.shutemov@...ux.intel.com, jithu.joseph@...el.com,
 kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, maciej.wieczor-retman@...el.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, eranian@...gle.com,
 james.morse@....com
Subject: Re: [RFC PATCH v3 00/17] x86/resctrl : Support AMD Assignable
 Bandwidth Monitoring Counters (ABMC)

Hi Peter,

On 5/2/2024 7:57 PM, Peter Newman wrote:
> Hi Reinette,
> 
> On Thu, May 2, 2024 at 4:21 PM Reinette Chatre
> <reinette.chatre@...el.com> wrote:
>>
>> Hi Peter and Babu,
>>
>> On 5/2/2024 1:14 PM, Moger, Babu wrote:
>>> Are you suggesting to enable ABMC by default when available?
>>
>> I do think ABMC should be enabled by default when available and it looks
>> to be what this series aims to do [1]. The way I reason about this is
>> that legacy user space gets more reliable monitoring behavior without
>> needing to change behavior.
> 
> I don't like that for a monitor assignment-aware user, following the
> creation of new monitoring groups, there will be fewer monitors
> available for assignment. If the user wants precise control over where
> monitors are allocated, they would need to manually unassign the
> automatically-assigned monitor after creating new groups.
> 
> It's an annoyance, but I'm not sure if it would break any realistic
> usage model. Maybe if the monitoring agent operates independently of

Yes, it's an annoyance.

But if you think about it, normal users don't create too many groups.
They won't have to worry about the assign/unassign headache if we enable
monitor assignment automatically. Also, there is the pqos tool, which uses
this interface. It does not have to know about assign/unassign at all.


> whoever creates monitoring groups it could result in brief periods
> where fewer monitors than expected are available because whoever just
> created a new monitoring group hasn't given the automatically-assigned
> monitors back yet.
> 
>>
>> I thought there was discussion about communicating to user space
>> when an attempt is made to read data from an event that does not
>> have a counter assigned. Something like below but I did not notice this
>> in this series.
>>
>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> Unassigned
>>
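
For illustration only, here is a minimal sketch of how a user-space reader
might cope with the "Unassigned" string suggested above. The string is only
the proposal quoted in this thread, not behavior implemented by this series,
and the helper names parse_mbm_value() and read_mbm_total_bytes() are
hypothetical.

```python
from pathlib import Path
from typing import Optional

# Marker proposed in this thread for an event with no counter assigned.
# Assumption: this is not confirmed kernel behavior.
UNASSIGNED = "Unassigned"

# Example path taken from the quoted text above.
DEFAULT_PATH = "/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes"


def parse_mbm_value(raw: str) -> Optional[int]:
    """Return the byte count, or None if the event has no counter assigned."""
    text = raw.strip()
    if text == UNASSIGNED:
        return None
    return int(text)


def read_mbm_total_bytes(path: str = DEFAULT_PATH) -> Optional[int]:
    """Read one MBM event file and parse its contents."""
    return parse_mbm_value(Path(path).read_text())
```

A tool written this way can skip or flag unassigned events rather than
treating the text as a malformed counter value.
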
>>>
>>> Then provide a mount option to switch back to legacy mode?
>>> I am fine with that if we all agree on that.
>>
>> Why is a mount option needed? I think we should avoid requiring a remount
>> unless required and I would like to understand why it is required here.
>>
>> Peter: could you please elaborate what you mean with it makes it more
>> difficult for the FS code to generically manage monitor assignment?
>>
>> Why would user space be required to recreate all control and monitor
>> groups if wanting to change how memory bandwidth monitoring is done?
> 
> I was looking at this more from the perspective of whether it's
> necessary to support the live transition of the groups' configuration
> back and forth between programming models.  I find it very unlikely
> for the userspace controller software to change its mind about the
> programming model for monitoring in a running system, so I thought
> this would be in the same category as choosing at mount time whether
> or not to use CDP or the MBA software controller.

The good thing about the mount option is that we don't create the extra
monitor assignment files in /sys/fs/resctrl when we mount with the legacy
option.

> 
> Also, in the software implementation of monitor assignment for older
> AMD processors, which is based on allocating a subset of RMIDs, I'm
> concerned that the context switch handler would want to read the
> monitors associated with the incoming thread's current group to
> determine whether it should use one of the tracked RMIDs. I believe it
> would be cleaner if the lifetime of the generic monitor-tracking
> structures would last until the static branches gating
> __resctrl_sched_in() could be disabled.
> 
>>
>> From this implementation it has been difficult to understand the impact
>> of switching between ABMC and legacy.
> 
> I'll see if there's a good way to share my software monitor assignment
> prototype so it's clearer how the user interface would interact with
> diverse implementations. Unfortunately, it's difficult to see the
> required abstraction boundaries without the fs/resctrl refactoring
> changes[1] applied. It would also require my changes[2] for reading a
> thread's RMID from the FS structures to prevent monitor assignments
> from forcing an update of all task_structs in the system.
> 
> -Peter
> 
> [1] https://lore.kernel.org/lkml/20240426150537.8094-1-Dave.Martin@arm.com/
> [2] https://lore.kernel.org/lkml/20240325172707.73966-1-peternewman@google.com/
> 

-- 
- Babu Moger
