Message-ID: <c3ca6d66-e58c-8ace-e88e-45ded5de836f@arm.com>
Date: Mon, 6 Mar 2023 11:33:54 +0000
From: James Morse <james.morse@....com>
To: Peter Newman <peternewman@...gle.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Babu Moger <Babu.Moger@....com>,
shameerali.kolothum.thodi@...wei.com,
D Scott Phillips OS <scott@...amperecomputing.com>,
carl@...amperecomputing.com, lcherian@...vell.com,
bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
xingxin.hx@...nanolis.org, baolin.wang@...ux.alibaba.com,
Jamie Iles <quic_jiles@...cinc.com>,
Xin Hao <xhao@...ux.alibaba.com>
Subject: Re: [PATCH v2 09/18] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep
Hi Peter,
On 23/01/2023 15:33, Peter Newman wrote:
> On Fri, Jan 13, 2023 at 6:56 PM James Morse <james.morse@....com> wrote:
>> MPAM's cache occupancy counters can take a little while to settle once
>> the monitor has been configured. The maximum settling time is described
>> to the driver via a firmware table. The value could be large enough
>> that it makes sense to sleep.
>
> Would it be easier to return an error when reading the occupancy count
> too soon after configuration? On Intel it is already normal for counter
> reads to fail on newly-allocated RMIDs.
For x86, you have as many counters as there are RMIDs, so there is no issue just accessing
the counter.
With MPAM there may be as few as 1 monitor for the CSU (cache storage utilisation)
counter, which needs to be multiplexed between different PARTIDs to find the cache
occupancy. (This works for CSU because it's a stable count; it doesn't work for the
bandwidth monitors.)
On such a platform the monitor needs to be allocated and programmed before a value can be
read for a particular PARTID/CLOSID. If you had two threads trying to read the same
counter, they could interleave perfectly and prevent either thread from ever reading a value.
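
To illustrate the interleaving problem, here is a toy simulation, not driver code. It assumes a single shared monitor, a settling time counted in abstract ticks, and two readers that alternate, each reprogramming the monitor for its own PARTID and then attempting one read after `slice` ticks. All names (`toy_monitor`, `toy_simulate`, and so on) are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of one shared CSU monitor (not the real MPAM layout). */
struct toy_monitor {
	int partid;            /* PARTID the monitor is currently programmed for */
	unsigned settle_left;  /* ticks remaining until the value is valid */
};

/* Reprogramming for a PARTID restarts the settling period. */
static void toy_program(struct toy_monitor *m, int partid, unsigned settle_ticks)
{
	m->partid = partid;
	m->settle_left = settle_ticks;
}

static void toy_tick(struct toy_monitor *m)
{
	if (m->settle_left)
		m->settle_left--;
}

/* A read only succeeds for the programmed PARTID once settling is done. */
static bool toy_try_read(const struct toy_monitor *m, int partid)
{
	return m->partid == partid && m->settle_left == 0;
}

/*
 * Two readers alternate. Each one programs the monitor for its PARTID,
 * runs for 'slice' ticks, then tries to read once before the other
 * reader takes over. Returns the number of successful reads.
 */
static unsigned toy_simulate(unsigned settle_ticks, unsigned slice, unsigned rounds)
{
	struct toy_monitor m = { .partid = -1, .settle_left = 0 };
	unsigned successes = 0;

	for (unsigned r = 0; r < rounds; r++) {
		int partid = r % 2;  /* reader 0 wants PARTID 0, reader 1 wants PARTID 1 */

		if (m.partid != partid)
			toy_program(&m, partid, settle_ticks);
		for (unsigned t = 0; t < slice; t++)
			toy_tick(&m);
		if (toy_try_read(&m, partid))
			successes++;
	}
	return successes;
}
```

In this model, whenever `slice` is shorter than `settle_ticks`, every read fails because the other reader has always just reprogrammed the monitor. Once a reader keeps the monitor for the full settling time, every read succeeds.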
The 'not ready' time is advertised in a firmware table, and the driver will wait at most
that long before giving up and returning an error.
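
A minimal sketch of that bounded wait follows. It assumes a simulated clock in place of a real sleep, and it assumes `-EBUSY` as the error value; the message does not say which error the driver returns. The structure and function names are hypothetical and are not taken from the MPAM driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a monitor with a settling time (not real driver state). */
struct sim_monitor {
	uint64_t now_us;       /* simulated clock */
	uint64_t ready_at_us;  /* time at which the programmed value becomes valid */
	uint64_t value;        /* occupancy reported once settled */
};

/* Stand-in for sleeping: just advance the simulated clock. */
static void sim_sleep(struct sim_monitor *m, uint64_t us)
{
	m->now_us += us;
}

static bool sim_monitor_ready(const struct sim_monitor *m)
{
	return m->now_us >= m->ready_at_us;
}

/*
 * Poll until the monitor is ready, waiting at most max_wait_us (the
 * 'not ready' time a firmware table would advertise). Returns 0 and
 * stores the value on success, or -EBUSY (an assumed error code) if
 * the monitor never settled in time.
 */
static int sim_read_occupancy(struct sim_monitor *m, uint64_t max_wait_us,
			      uint64_t poll_us, uint64_t *out)
{
	uint64_t deadline = m->now_us + max_wait_us;

	if (poll_us == 0)
		poll_us = 1;  /* avoid spinning forever without advancing time */

	for (;;) {
		if (sim_monitor_ready(m)) {
			*out = m->value;
			return 0;
		}
		if (m->now_us >= deadline)
			return -EBUSY;
		sim_sleep(m, poll_us);
	}
}
```

The deadline is checked only after a failed readiness test, so a monitor that becomes ready exactly at the deadline is still read successfully.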
Clearly 1 monitor is a corner case, and I hope no-one ever builds that. But if there are
fewer monitors than there are PARTID*PMG combinations, you get the same problem (you just
need more threads reading the counters).
Thanks,
James