Message-ID: <8bf2275a-dea7-1817-731a-7d47d3b01d13@os.amperecomputing.com>
Date: Tue, 6 Feb 2024 13:04:27 -0800 (PST)
From: Ilkka Koskinen <ilkka@...amperecomputing.com>
To: Robin Murphy <robin.murphy@....com>
cc: Ilkka Koskinen <ilkka@...amperecomputing.com>,
Will Deacon <will@...nel.org>, Mark Rutland <mark.rutland@....com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

On Tue, 6 Feb 2024, Robin Murphy wrote:
> On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
>> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
>> report incorrect child count. The failing crosspoints report 8 children
>> while they only have two.
>
> Ooh, fun :)
>
>> When the driver tries to access the inexistent child nodes, it believes it
>> has reached an invalid node type and probing fails. The workaround is to
>> ignore those incorrect child nodes and continue normally.
>>
>> Signed-off-by: Ilkka Koskinen <ilkka@...amperecomputing.com>
>> ---
>> drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
>> 1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
>> index c584165b13ba..97fed8ec3693 100644
>> --- a/drivers/perf/arm-cmn.c
>> +++ b/drivers/perf/arm-cmn.c
>> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
>> }
>> }
>> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
>> + struct arm_cmn_node *dn,
>> + u16 child_count, int child)
>> +{
>> + /*
>> + * The bug occurs only when a crosspoint reports 8 children
>> + * while it only has two HN-P child nodes.
>> + */
>> + dn -= 2;
>> +
>> + if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
>> + child == 2 && dn->type == CMN_TYPE_HNP)
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>> {
>> void __iomem *cfg_region;
>> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>> for (j = 0; j < child_count; j++) {
>> reg = readq_relaxed(xp_region + child_poff + j * 8);
>> + if (reg == 0)
>> + if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
>> + /*
>> + * We know there are only two real children and the rest 6
>> + * are inexistent. Thus, we can skip the rest of the loop
>> + */
>> + break;
>> +
>
> TBH I don't see much harm in taking an even simpler approach, so I'd be
> inclined to not bother being all that specific beyond documenting it,
> something like the below:

Sounds good to me.

>
> Cheers,
> Robin.
>
> ----->8-----
>
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..7e3aa7e2345f 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
> dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
> continue;
> }
> + /*
> + * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
> + * child count larger than the number of valid child pointers.
> + * A child offset of 0 can only occur on CMN-600; otherwise it
> + * would imply the root node being its own grandchild, which
> + * we can safely dismiss in general.
> + */
> + if (reg == 0 && cmn->part != PART_CMN600) {
> + dev_dbg(cmn->dev, "bogus child pointer?\n");
> + continue;
> + }
> arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
>

Tested-by: Ilkka Koskinen <ilkka@...amperecomputing.com>

Cheers, Ilkka
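
For readers following the thread, the skip-on-zero approach discussed above can be sketched outside the kernel roughly as follows. This is only a minimal user-space illustration under stated assumptions, not the driver code: the `sketch_` names, the `SKETCH_` macros, and the bit positions chosen for the "external" flag and address mask are hypothetical stand-ins, and a plain array replaces the `readq_relaxed()` reads of the child pointer registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's part identification. */
enum sketch_part { SKETCH_PART_CMN600, SKETCH_PART_OTHER };

/* Illustrative bit layout only; the real register fields may differ. */
#define SKETCH_CHILD_EXTERNAL  (UINT64_C(1) << 31)
#define SKETCH_CHILD_ADDR_MASK ((UINT64_C(1) << 30) - 1)

/*
 * Walk the child pointer values a crosspoint claims to have and collect
 * the offsets worth probing. Entries flagged as external are skipped, and
 * so are all-zero entries on anything other than CMN-600, which is how a
 * bogus, too-large child count gets tolerated.
 *
 * Returns the number of offsets written to out[] (at most out_len).
 */
size_t sketch_scan_children(enum sketch_part part,
                            const uint64_t *regs, size_t child_count,
                            uint64_t *out, size_t out_len)
{
    size_t found = 0;

    for (size_t j = 0; j < child_count; j++) {
        uint64_t reg = regs[j];

        /* External nodes are never touched. */
        if (reg & SKETCH_CHILD_EXTERNAL)
            continue;

        /* A zero child pointer is treated as bogus except on CMN-600. */
        if (reg == 0 && part != SKETCH_PART_CMN600)
            continue;

        if (found < out_len)
            out[found++] = reg & SKETCH_CHILD_ADDR_MASK;
    }

    return found;
}
```

With a crosspoint that reports eight children but only has two valid pointers followed by six zero entries, the function above yields two offsets for a non-CMN-600 part and simply ignores the rest, so no per-erratum special case is needed.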