Message-ID: <IA1PR11MB6076F9F222A98125974C5CBCFC5C2@IA1PR11MB6076.namprd11.prod.outlook.com>
Date: Thu, 7 Nov 2024 20:58:20 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: "Chatre, Reinette" <reinette.chatre@...el.com>, Peter Newman
<peternewman@...gle.com>
CC: "Yu, Fenghua" <fenghua.yu@...el.com>, "babu.moger@....com"
<babu.moger@....com>, "bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "Eranian,
Stephane" <eranian@...gle.com>, "hpa@...or.com" <hpa@...or.com>,
"james.morse@....com" <james.morse@....com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "mingo@...hat.com" <mingo@...hat.com>,
"nert.pinx@...il.com" <nert.pinx@...il.com>, "tan.shaopeng@...itsu.com"
<tan.shaopeng@...itsu.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"x86@...nel.org" <x86@...nel.org>
Subject: RE: [PATCH v2 2/2] x86/resctrl: Don't workqueue local event counter
reads
> > # cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_01
> > <rdtgroup nameA> <MBM total count> <timestamp> <generation>
> > <rdtgroup nameB> <MBM total count> <timestamp> <generation>
> > ...
> >
> > Where <timestamp> tracks when this sample was captured. And
> > <generation> is an integer that is incremented when data
> > for this event is lost (e.g. due to ABMC counter re-assignment).
Maintaining separate timestamps for each group may be overkill.
The overflow function walks through them all quite rapidly. On
Intel Icelake with 100 groups there is only a 670 usec delta
between the first and last.
> It is not obvious to me how resctrl can provide a reliable
> "generation" value.
Keep a generation count for each event in each group. Increment
the count when taking the h/w counter away.
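As a rough illustration of that bookkeeping, here is a minimal sketch. The
struct and helper names (mbm_event_state, mbm_event_assign,
mbm_event_unassign) are hypothetical and are not the existing resctrl data
structures; the only idea taken from the discussion is "bump the generation
when the h/w counter is taken away".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-group, per-event state; not the real resctrl layout. */
struct mbm_event_state {
	uint64_t count;       /* last accumulated count for this event */
	uint32_t generation;  /* incremented whenever data for the event is lost */
	bool     has_counter; /* true while a h/w counter backs this event */
};

/* Hypothetical helper: give this event a h/w counter. */
static void mbm_event_assign(struct mbm_event_state *ev)
{
	ev->has_counter = true;
}

/*
 * Hypothetical helper: take the h/w counter away. Counts gathered after
 * this point cannot be compared with earlier ones, so the generation
 * changes. Calling it on an event without a counter is a no-op.
 */
static void mbm_event_unassign(struct mbm_event_state *ev)
{
	if (!ev->has_counter)
		return;
	ev->has_counter = false;
	ev->generation++;
}
```

A reader that sees the same generation in two successive samples can assume
the counter was not re-assigned between them.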
> > Then a monitor application can compute bandwidth for each
> > group by periodic sampling and for each group:
> >
> > if (thisgeneration == lastgeneration) {
> > bw = (thiscount - lastcount) / (thistimestamp - lasttimestamp);
>
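For reference, the user-space side of that check could look something like
the sketch below. The struct layout and the function name mbm_bandwidth are
made up for illustration, and the timestamp unit is assumed to be
nanoseconds, which the proposed file format does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

/* One parsed line of the proposed snapshot file (hypothetical layout). */
struct mbm_sample {
	uint64_t count;      /* MBM total count, in bytes */
	uint64_t timestamp;  /* capture time; nanoseconds is an assumption */
	uint64_t generation; /* changes when data for the event was lost */
};

/*
 * Compute bandwidth in bytes/second between two samples of one group.
 * Returns false, leaving *bw untouched, when the samples are not
 * comparable: the generation changed, or time did not advance.
 */
static bool mbm_bandwidth(const struct mbm_sample *last,
			  const struct mbm_sample *cur, double *bw)
{
	if (cur->generation != last->generation)
		return false;
	if (cur->timestamp <= last->timestamp)
		return false;

	*bw = (double)(cur->count - last->count) * 1e9 /
	      (double)(cur->timestamp - last->timestamp);
	return true;
}
```

On a generation mismatch the caller would simply skip that interval and
start again from the newer sample.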
> If user space needs visibility into these internals then we could also
> consider adding a trace event that logs the timestamped data right when it
> is queried by the overflow handler.
That would provide accurate data at low overhead, assuming that
the user wants bandwidth data every second. If they only need
data over longer time intervals all the extra trace events aren't
needed.
-Tony