Message-ID: <CALPaoCj7FBv_vfDp+4tgqo4p8T7Eov_Ys+CQRoAX6u43a4OTDQ@mail.gmail.com>
Date: Mon, 26 May 2025 15:14:08 +0200
From: Peter Newman <peternewman@...gle.com>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: "Chatre, Reinette" <reinette.chatre@...el.com>, Babu Moger <babu.moger@....com>, 
	"corbet@....net" <corbet@....net>, "tglx@...utronix.de" <tglx@...utronix.de>, 
	"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>, 
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "james.morse@....com" <james.morse@....com>, 
	"dave.martin@....com" <dave.martin@....com>, "fenghuay@...dia.com" <fenghuay@...dia.com>, 
	"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>, 
	"paulmck@...nel.org" <paulmck@...nel.org>, 
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "thuth@...hat.com" <thuth@...hat.com>, 
	"rostedt@...dmis.org" <rostedt@...dmis.org>, "ardb@...nel.org" <ardb@...nel.org>, 
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>, 
	"daniel.sneddon@...ux.intel.com" <daniel.sneddon@...ux.intel.com>, 
	"jpoimboe@...nel.org" <jpoimboe@...nel.org>, 
	"alexandre.chartre@...cle.com" <alexandre.chartre@...cle.com>, 
	"pawan.kumar.gupta@...ux.intel.com" <pawan.kumar.gupta@...ux.intel.com>, 
	"thomas.lendacky@....com" <thomas.lendacky@....com>, "perry.yuan@....com" <perry.yuan@....com>, 
	"seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai" <kai.huang@...el.com>, 
	"Li, Xiaoyao" <xiaoyao.li@...el.com>, 
	"kan.liang@...ux.intel.com" <kan.liang@...ux.intel.com>, "Li, Xin3" <xin3.li@...el.com>, 
	"ebiggers@...gle.com" <ebiggers@...gle.com>, "xin@...or.com" <xin@...or.com>, 
	"Mehta, Sohil" <sohil.mehta@...el.com>, 
	"andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>, 
	"mario.limonciello@....com" <mario.limonciello@....com>, 
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, 
	"Wieczor-Retman, Maciej" <maciej.wieczor-retman@...el.com>, "Eranian, Stephane" <eranian@...gle.com>, 
	"Xiaojian.Du@....com" <Xiaojian.Du@....com>, "gautham.shenoy@....com" <gautham.shenoy@....com>
Subject: Re: [PATCH v13 11/27] x86/resctrl: Implement resctrl_arch_config_cntr()
 to assign a counter with ABMC

Hi Tony,

On Fri, May 23, 2025 at 11:08 PM Luck, Tony <tony.luck@...el.com> wrote:
>
> On Thu, May 22, 2025 at 10:16:16PM +0000, Luck, Tony wrote:
> > > It looks to me as though there are a couple of changes in the telemetry work
> > > that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/
> > > switches the monitor events to be maintained in an array indexed by event ID, eliminating the
> > > need for searching the evt_list that this work does in a couple of places. Also note the handy
> > > new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/).
> >
> > Yesterday I ran through the exercise of rebasing my AET patches on top of these
> > ABMC patches in order to check whether the ABMC patches painted resctrl
> > into some corner that would be hard to get back out of.
> >
> > Good news: they don't.
> >
> > There was a bunch of manual patching to make the first four patches fit on top
> > of the ABMC code, but I also noticed a few places where things were simpler
> > after combining the two series.
> >
> > Maybe a good path forward would be to take those first four patches from
> > my AET series and then build ABMC on top of those.
>
> As an encouragement to try this direction, I took my four patches
> on top of tip x86/cache and then applied Babu's ABMC series.

I did the same thing last week, except in the other order, so I
switched to your changes to test.

>
> Changes to Babu's code:
> 1) Adapt where needed for removal of evt_list. Use event array instead.
> 2) Use for_each_mbm_event() [Maybe didn't get all places?]
> 3) Bring the s/evt_val/evt_cfg/ fix into patch 20 from 21
> 4) Fix fir tree declaration for resctrl_process_assign()
>
> I don't have an AMD system to check if the ABMC parts still work. But
> it does pass the resctrl self tests, so legacy isn't broken.
>
> Patches in the "my_mbm_plus_babu_abmc" branch of my kernel.org
> repo: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git

Thanks for applying my suggestion[1] about the array entry sizes, but
you needed one more dereference:

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 1db6a61e27746..0c27e0a5a7b96 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -399,7 +399,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_ctrl_domain *
  */
 static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
 {
-       size_t tsize = sizeof(hw_dom->arch_mbm_states[0]);
+       size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]);
        enum resctrl_event_id evt;
        int idx;

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 098ff002d2232..44ec33cb165f7 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -4819,7 +4823,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain *d)
 {
        u32 idx_limit = resctrl_arch_system_num_rmid_idx();
-       size_t tsize = sizeof(d->mbm_states[0]);
+       size_t tsize = sizeof(*d->mbm_states[0]);
        enum resctrl_event_id evt;
        int idx;


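Without the extra dereference, tsize is the size of a pointer rather
than of the state struct, so each per-event array is allocated too
small. You should be able to reproduce the array overrun without ABMC,
and a page fault is likely if the system implements a lot of RMIDs. The
AMD EPYC 9B45 I tested on implements 4096 RMIDs.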
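
To make the sizeof difference concrete, here is a minimal userspace
sketch. It is not the kernel code: struct mbm_state_sketch, struct
dom_sketch, NUM_EVENTS_SKETCH and the helper functions are hypothetical
stand-ins chosen only to mirror the shape of an array of per-event
pointers. It assumes nothing about the real struct layouts beyond the
state struct being larger than a pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-RMID state entry; not the kernel layout. */
struct mbm_state_sketch {
	unsigned long long chunks;
	unsigned long long prev_bw_bytes;
	unsigned int prev_bw;
};

#define NUM_EVENTS_SKETCH 2	/* assumption: one array per MBM event */

/* One pointer per event, each pointing at an array indexed by RMID. */
struct dom_sketch {
	struct mbm_state_sketch *mbm_states[NUM_EVENTS_SKETCH];
};

/* Buggy form: mbm_states[0] is a pointer, so this is sizeof(pointer). */
size_t wrong_tsize(const struct dom_sketch *d)
{
	return sizeof(d->mbm_states[0]);
}

/* Fixed form: dereference once more to get the size of one element. */
size_t right_tsize(const struct dom_sketch *d)
{
	return sizeof(*d->mbm_states[0]);
}

/*
 * Allocate one per-event array of num_rmid entries of tsize bytes each.
 * With the buggy tsize the buffer is too small for num_rmid structs, and
 * indexing it by RMID runs past the end of the allocation.
 */
struct mbm_state_sketch *alloc_states_sketch(size_t num_rmid, size_t tsize)
{
	return calloc(num_rmid, tsize);
}
```

sizeof does not evaluate its operand, so neither helper dereferences the
(possibly NULL) pointer at run time. With the buggy size, the allocation
for 4096 RMIDs covers only 4096 pointers' worth of bytes, while the code
indexing it expects 4096 full structs.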

Thanks,
-Peter


[1] https://lore.kernel.org/lkml/CALPaoCj8yfzJ=5CkxTPQXc0-WRWpu0xKRX8v4FAWFGQKtXtMUw@mail.gmail.com/
