Message-ID: <aQOKJX13sIWhDVJ4@agluck-desk3>
Date: Thu, 30 Oct 2025 08:54:13 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: "Chen, Yu C" <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>, Reinette Chatre <reinette.chatre@...el.com>,
	"James Morse" <james.morse@....com>, Fenghua Yu <fenghuay@...dia.com>, Dave
 Martin <Dave.Martin@....com>, Peter Newman <peternewman@...gle.com>, Babu
 Moger <babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, "Maciej
 Wieczor-Retman" <maciej.wieczor-retman@...el.com>
Subject: Re: [PATCH v13 11/32] x86,fs/resctrl: Handle events that can be read
 from any CPU

On Thu, Oct 30, 2025 at 02:14:27PM +0800, Chen, Yu C wrote:
> Hi Tony,
> 
> On 10/30/2025 12:20 AM, Tony Luck wrote:
> > resctrl assumes that monitor events can only be read from a CPU in the
> > cpumask_t set of each domain.  This is true for x86 events accessed
> > with an MSR interface, but may not be true for other access methods such
> > as MMIO.
> > 
> > Introduce and use flag mon_evt::any_cpu, settable by architecture, that
> > indicates there are no restrictions on which CPU can read that event.
> > 
> > Signed-off-by: Tony Luck <tony.luck@...el.com>
> 
> [snip]
> 
> > -void resctrl_enable_mon_event(enum resctrl_event_id eventid)
> > +void resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu)
> >   {
> >   	if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS))
> >   		return;
> > @@ -984,6 +984,7 @@ void resctrl_enable_mon_event(enum resctrl_event_id eventid)
> >   		return;
> >   	}
> > +	mon_event_all[eventid].any_cpu = any_cpu;
> >   	mon_event_all[eventid].enabled = true;
> >   }
> 
> It seems that cpu_on_correct_domain() was dropped because the refactor
> of __mon_event_count() in patch 0006 means it is no longer needed. But
> we still invoke smp_processor_id() in preemptible context in
> __l3_mon_event_count() before any further checks, which causes a
> warning:
>
> [ 4266.361951] BUG: using smp_processor_id() in preemptible [00000000] code: grep/1603
> [ 4266.363231] caller is __l3_mon_event_count+0x30/0x2a0
> [ 4266.364250] Call Trace:
> [ 4266.364262]  <TASK>
> [ 4266.364273]  dump_stack_lvl+0x53/0x70
> [ 4266.364289]  check_preemption_disabled+0xca/0xe0
> [ 4266.364303]  __l3_mon_event_count+0x30/0x2a0
> [ 4266.364320]  mon_event_count+0x22/0x90
> [ 4266.364334]  rdtgroup_mondata_show+0x108/0x390
> [ 4266.364353]  seq_read_iter+0x10d/0x450
> [ 4266.364368]  vfs_read+0x215/0x330
> [ 4266.364386]  ksys_read+0x6b/0xe0
> [ 4266.364401]  do_syscall_64+0x57/0xd70

I didn't notice this in my testing. Is this in your region-aware
tree? If you are still using RDT_RESOURCE_L3 then I can see how
you got this call trace.

Maybe you need to dig cpu_on_correct_domain() back up and apply
it to __l3_mon_event_count()?

-Tony
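
For illustration only, here is a user-space sketch of the ordering being
discussed: test the any_cpu flag first, and only ask which CPU we are on
when the event is actually restricted to a domain. This is not the dropped
cpu_on_correct_domain() helper and not kernel code. Every model_* name is
hypothetical, the cpumask is reduced to a 64-bit word, and the
preemption-debug check is modelled as a simple counter. It assumes, as the
call trace above suggests, that any_cpu events may be read from preemptible
context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model only; none of these are the kernel's definitions. */
static bool model_preemptible = true;    /* stands in for the preempt state */
static unsigned int model_cpu;           /* CPU the caller "runs" on */
static unsigned int model_warnings;      /* counts would-be debug splats */

/* Models smp_processor_id() with preemption debugging enabled. */
static unsigned int model_smp_processor_id(void)
{
	if (model_preemptible)
		model_warnings++;        /* the real kernel would print a BUG line */
	return model_cpu;
}

/* Hypothetical stand-in for an event carrying mon_evt::any_cpu. */
struct model_event {
	bool any_cpu;
};

/* Simplified cpumask test: one bit per CPU in a 64-bit word. */
static bool model_cpumask_test_cpu(unsigned int cpu, uint64_t mask)
{
	assert(cpu < 64);
	return (mask >> cpu) & 1;
}

/*
 * Sketch of the check: an any_cpu event returns before the CPU id is
 * queried, so a preemptible caller never reaches the debug check.
 */
static bool model_cpu_on_correct_domain(const struct model_event *evt,
					uint64_t domain_mask)
{
	if (evt->any_cpu)
		return true;

	return model_cpumask_test_cpu(model_smp_processor_id(), domain_mask);
}
```

In this model, a preemptible caller reading an any_cpu event leaves
model_warnings at zero, while a domain-restricted event only reaches
model_smp_processor_id() from a caller that has already disabled
preemption.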