[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0f2993ff-75a1-4d63-9384-5c949bf2e8e0@arm.com>
Date: Mon, 14 Jul 2025 11:15:44 +0100
From: James Morse <james.morse@....com>
To: Gavin Shan <gshan@...hat.com>, linux-kernel@...r.kernel.org
Cc: sdonthineni@...dia.com, rohit.mathew@....com,
carl@...amperecomputing.com, shan.gavin@...il.com
Subject: Re: [PATCH MPAM 6.16.rc1] arm_mpam: Enforce to recalculate mbw_min on
configuration
Hi Gavin,
On 26/06/2025 10:52, Gavin Shan wrote:
> mpam_feat_mbw_max is exposed to user by resctrlfs, but mpam_feat_mbw_min
> should be recalculated based on the maximal memory bandwidth in every
> configuration, which has been missed unfortunately. When a new group is
> created, the default and maximal memory bandwidth percentage (100%) is
> configured, the configuration instance (struct mpam_config) on the stack
> is updated, including the minimal and maximal memory bandwidth. It's
> copied to the domain's configuration instance. On next time when user
> tries to configure by writing 'schemata', the minimal memory bandwidth
> isn't never recalculated because mpam_feat_mbw_min has been seen in the
> configuration, which inherits from the domain's instance.
>
> For example, the value of register MPAMCFG_MBW_MIN is never changed no
> matter what configuration is set.
Thanks for catching this. I think this was introduced by the patch that picked
the component copy of the configuration, modified it - and passed it back into
the driver. That had the additional side effect of changes going missing as the
mpam_update_config() had the same config twice...
I fixed that shortly after rc1, with that fixed I don't think this can happen -
resctrl never sets the 'min', meaning mpam_extent_config() will always regenerate
the value.
This shouldn't happen with the latest version of the tree.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpam/snapshot/v6.16-rc4
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
b/drivers/platform/arm64/mpam/mpam_devices.c
> index df8730491de2..4845dcb8e601 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -3192,6 +3192,9 @@ static void mpam_extend_config(struct mpam_class *class, struct
mpam_config *cfg
> u16 min, min_hw_granule, delta;
> u16 max_hw_value, res0_bits;
>
> + if (!mpam_has_feature(mpam_feat_mbw_max, cfg))
> + return;
N.B. - The T421 quirk needs to ensure the minimum value is never set to 0 - which is what
mpam_reprogram_ris_partid() will do if there is no minimum config set.
The aim of the quirk is that on affected platforms the mbm_min will always be set, and
never be zero in any configuration. Whether it then gets applied to the hardware depends
on whether the hardware has the feature.
This makes it simpler to think about - and easier to test on platforms that aren't
affected by the errata.
I'd prefer not to to try and spot configurations where its going to matter as it will be
hard to debug the workaround going wrong!
> @@ -3211,23 +3214,17 @@ static void mpam_extend_config(struct mpam_class *class, struct
mpam_config *cfg
> *
> * Resctrl can only configure the MAX.
> */
> - if (mpam_has_feature(mpam_feat_mbw_max, cfg) &&
> - !mpam_has_feature(mpam_feat_mbw_min, cfg)) {
> - delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
> - if (cfg->mbw_max > delta)
> - min = cfg->mbw_max - delta;
> - else
> - min = 0;
> + delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
> + if (cfg->mbw_max > delta)
> + min = cfg->mbw_max - delta;
> + else
> + min = 0;
>
> - cfg->mbw_min = max(min, min_hw_granule);
> - mpam_set_feature(mpam_feat_mbw_min, cfg);
> - }
> + cfg->mbw_min = max(min, min_hw_granule);
While resctrl doesn't use it - the idea is the mpam_devices interface should support any
mpam control being set. This overwrites what the user asked for.
The devices/resctrl split might look strange - but its so the resctrl ABI implications are
contained to one file, and another user of the mpam_devices interface can be added later.
(KVM in-kernel MSC emulation was one proposal, another came from people who believe they
need to change the hardware policy more frequently than they can with a syscall ... I
don't like either of these!)
Thanks,
James
Powered by blists - more mailing lists