lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aKzhsOWp2TlFF5mK@agluck-desk3>
Date: Mon, 25 Aug 2025 15:20:32 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: "Chatre, Reinette" <reinette.chatre@...el.com>, Fenghua Yu
	<fenghuay@...dia.com>, "Wieczor-Retman, Maciej"
	<maciej.wieczor-retman@...el.com>, Peter Newman <peternewman@...gle.com>,
	James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>, "Drew
 Fustini" <dfustini@...libre.com>, Dave Martin <Dave.Martin@....com>, "Chen,
 Yu C" <yu.c.chen@...el.com>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "patches@...ts.linux.dev"
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 00/32] x86,fs/resctrl telemetry monitoring

On Fri, Aug 15, 2025 at 03:47:17PM +0000, Luck, Tony wrote:
> > I think this series is close to being ready to pass to the x86 maintainers.
> > To that end I focused a lot on the changelogs with the goal to meet the
> > tip requirements that mostly involved switching to imperative tone. I do not
> > expect that I found all the cases though (and I may also have made some mistakes
> > in my suggested text!) so please ensure the changelogs are in imperative tone
> > and uses consistent terms throughout the series.
> >
> > If you disagree with any feedback or if any of the feedback is unclear please
> > let us discuss before you spin a new version. Of course it is not required
> > that you follow all feedback but if you don't I would like to learn why.
> >
> > Please note that I will be offline next week.
> 
> Reinette,
> 
> I took one fast pass through all your comments. I think they all
> look good (and I believe I understand each one). Thanks for all
> the suggestions.
> 
> I'll try to keep (better) notes on what I change as I work through
> each patch so I'll have a summary of any areas that I'm unsure
> about for you to read when you get back before I post v9.
> 
> Enjoy your time away.

Reinette,

I've completed a longer, slower, pass through making changes to prepare
for v9.  Summary of changes per patch below. I didn't disagree with any
of your suggestions.

The bulk of the changes are small, and localized to each patch. The
exception being removal of pkg_mmio_info[] which dropped patch 18 (which
just counted regions prior to allocation of pkg_mmio_info[]) and patch
19 (which finished filling out the details.

discover_events() is now named enable_events(), since there are almost
no "steps" involved, it doesn't have a header comment. The name now
describes what it does.

Theoretically skip_this_region() might find some regions unusable, while
others in the same pmt_feature_group are OK. To handle this I zero the
telemetry_region::addr so that intel_aet_read_event() can easily skip
"bad" regions.

I've pushed a rdt-aet-v9-wip branch to:
Link: https://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
if you want to take a quick look at the end result before
I patch bomb the list.

-Tony

--- cover ---
INTEL_PMT_DISCOVERY -> INTEL_PMT_TELEMETRY

--- 1 ---
No change

--- 2 ---
No change

--- 3 ---
No change

--- 4 ---
No change

--- 5 ---
"The "type" field provides" -> "rdt_domain_hdr::type provides"
Update final paragraph of commit message as suggested.

--- 6 ---
Code: in domain_add_cpu_mon() change WARN_ON_ONCE(1); in default case to
pr_warn_once("Unknown resource rid=%d\n", r->rid);


--- 7 ---
Code: in domain_remove_cpu_mon() move domain_header_is_valid()to
immediately before container_of() and switch check from r->rid to
hard-coded RDT_RESOURCE_L3.
s/list_del_rcu(&d->hdr.list);/list_del_rcu(&hdr->list);/

"separated" -> "separate"
"can skip" -> "skips"

--- 8 ---
Code: in domain_remove_cpu_ctrl()  move domain_header_is_valid()
to immediately before container_of().
s/list_del_rcu(&d->hdr.list);/list_del_rcu(&hdr->list);/
(moved here from patch 9)

"refactor" -> "refactor domain_remove_cpu_ctrl()"


--- 9 ---
Change second paragraph of commit comment as suggested.

"so the rmid_read::d field is replaced" -> "so replace the L3 specific domain
pointer rmid_read::d with rmid_read::hdr that points to the generic domain
header"

Use imperative tone for last paragraph "Update kerneldoc for mon_data::sum ..."

--- 10 ---
Change rdt_mon_domain to rdt_l3_mon_domain in comment above logical_rmid_to_physical_rmid()

Replace entire commit comment with improved version.

--- 11 ---
"different resources" -> "a different resource"
"these functions" -> "the L3 resource specific functions"
"Two groups of functions renamed here:" -> "Rename three groups of functions:"
Added rdt_get_mon_l3_config() to list of renamed functions to put the "l3"
before the "mon" for consistency.

When changing names for resctrl_mon_resource_{init,exit} add the "l3_" before
"mon" for consistency with other *l3_mon*" naming.

--- 12 ---
"to lower levels" -> "via the mon_data and rmid_read structures to
the functions finally reading the monitoring data."

Replace 3rd paragraph of commit comment with supplied better version.

--- 13 ---
Code: In cpu_on_correct_domain() s/return -EINVAL;/return false;/

--- 14 ---
Code: Move definition of MAX_BINARY_BITS from <linux/resctrl.h> to
fs/resctrl/internal.h.

Be explicit in commit comment that the file system makes the determination
on which events can be displayed in floating point format.

--- 15 ---
"asyncronnous" -> "asynchronous"
"for these drivers" -> "of these drivers"
"are called" -> "completes"
"But expectations" -> "Expectations"
"The call is made with no locks held." -> "resctrl filesystem calls the
hook with no locks held."

--- 16 ---
s/CPU hotplug/CPU hot plug/
Add Reinette's RB tag.

--- 17 ---
Code:
"OOBMSM" -> "INTEL_PMT_TELEMETRY"
"INTEL_PMT_DISCOVERY" -> "INTEL_PMT_TELEMETRY"
re-wrap comment for get_pmt_feature() to use 80 columns.
"OOBMSM discovery" -> "INTEL_PMT_TELEMETRY"
Add the intel_pmt_put_feature_group() calls to intel_aet_exit()
to match the intel_pmt_get_regions_by_feature() calls in get_pmt_feature()
using a new macro for_each_enabled_event_group().
Rename discover_events() to enable_events()

Add period at end of help text in arch/x86/Kconfig.

"Data for telemetry events is collected by each CPU and sent" ->
"Each CPU collects data for telemetry events that it sends"
"is changed" -> "changes"
"or when two milliseconds have elapsed" -> "or when a two millisecond timer expires"
"mad" -> "made"
"Enumeration of support for telemetry events is done" ->
"The INTEL_PMT_TELEMETRY driver enumerates support for telemetry events."
Drop references to INTEL_PMT_DISCOVERY driver.

Merge last two paragraphs of commit message.

Reformat commit to use more of page width.

Add maintainer note about checkpatch complaints for DEFINE_FREE()

--- 18 ---
Dropped. See new patch 0019

--- 19 ---
Dropped. See new patch 0019

--- 20 ---
Now 0018
Rewrite opening paragraph to avoid "continuation of the subject"

Add note/link on the source of the XML files that describe events.

--- 21 ---
Now 0019
Drop "vague first sentence" of second paragraph.

--- NEW 0020 ---
Replaces most of parts 18/32 and 19/32.
Contains skip_this_region() from patch 18/32, but skips all the
code to count regions and allocat pkg_mmio_info[].

Also includes event enabling code from 22/32

--- 22 ---
Now 0021
Modify intel_aet_read_event() to dig into pmt_event::pfg to
find the MMIO base addresses that v8 patch stored in the pkg_mmio_info[]
structures.

--- 23 ---
Now 0022
Add domain_header_is_valid() check in domain_remove_cpu_mon()
before using container_of().

"There are structures" -> "There are per domain structures ..."

Replace my commit fix description with better version from Reinette.

--- 24 ---
Unparseable last sentence of commit message replaced with details
and examples.

--- 25 ---
RMIDS -> RMIDs
"Adjusted downwards ..." -> "May be adjusted downwards ..."
check_rmid_count() -> all_regions_have_sufficient_rmid()
"Potentially disable" -> "Disable"
Add comment:
/* e->num_rmids only adjusted lower if user forces an unusable region to be usable */

--- 26 ---
Add "during resctrl initialization" and "during resctrl exit" to commit
background statement.

Add "closid_num_dirty_rmid[]" to be specific about what is being allocated/freed.

--- 27 ---
"mon capable" -> "mon_capable" (three times)
"alloc capable" -> "alloc_capable"
"rdt_l3_mon_domain::states[]" -> "rdt_l3_mon_domain::mbm_states[] and
+rdt_l3_mon_domain::rmid_busy_llc"

"the number of RMIDs" -> "the system's number of RMIDs"

--- 28 ---
Add comment to setup_rmid_lru_list() to note that rmid_ptrs[]
is allocated of first mount and is reused on subsequent mounts.
It is freed in resctrl_exit().

Lock/unlock rdtgroup_mutex around body of free_rmid_lru_list()

Rewrite commit based on suggestions with some modifications to explain
why error paths in rdt_get_tree() do not call free_rmid_lru_list().

--- 29 ---
TODO: recheck for use of "CPU hot plug notifiers" may have been called "hooks", "callbacks", and
"handlers" through this series.

--- 30 ---
"a resource" -> "a monitoring resource"

--- 31 ---
"last_update_timestamp" -> "agg_last_update_timestamp" in commit comment

Move creation of debugfs files to the end of enable_events().

Rewrite to work based on event_group::pfg since event_group::pkginfo[] is gone.


--- 32 ---
Describe "num_rmids" file values independently for L3 and telemetry.
Move the note about upper bound on directory creation to its own
paragraph to say it is the smaller of reported "num_rmids" values.

"or of a processor package" -> "another for each processor package"

Change paragraph about contents of subdirectories of mon_data to
gize example file names instead of hard coding specifics.

prescence -> presence
will vary -> may vary
last_update_timestamp -> agg_last_update_timestamp

Simplify commit comment as suggested.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ