linux-kernel - Re: [PATCH v6 7/7] pmdomain: mediatek: Add support for MFlexGraphics

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4457422.ejJDZkT8p0@workhorse>
Date: Mon, 06 Oct 2025 14:16:16 +0200
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Chia-I Wu <olvaffe@...il.com>,
 AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
Cc: Boris Brezillon <boris.brezillon@...labora.com>,
 Jassi Brar <jassisinghbrar@...il.com>, Chen-Yu Tsai <wenst@...omium.org>,
 Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
 Conor Dooley <conor+dt@...nel.org>,
 Matthias Brugger <matthias.bgg@...il.com>, Kees Cook <kees@...nel.org>,
 "Gustavo A. R. Silva" <gustavoars@...nel.org>,
 Ulf Hansson <ulf.hansson@...aro.org>, kernel@...labora.com,
 dri-devel@...ts.freedesktop.org, devicetree@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 linux-mediatek@...ts.infradead.org, linux-hardening@...r.kernel.org,
 linux-pm@...r.kernel.org
Subject: Re: [PATCH v6 7/7] pmdomain: mediatek: Add support for MFlexGraphics

On Monday, 6 October 2025 13:37:28 Central European Summer Time AngeloGioacchino Del Regno wrote:
> Il 06/10/25 12:58, Nicolas Frattaroli ha scritto:
> > On Friday, 3 October 2025 23:41:16 Central European Summer Time Chia-I Wu wrote:
> >> On Fri, Oct 3, 2025 at 1:16 PM Nicolas Frattaroli
> >> <nicolas.frattaroli@...labora.com> wrote:
> >>>
> >>> Various MediaTek SoCs use GPU integration silicon named "MFlexGraphics"
> >>> by MediaTek. On the MT8196 and MT6991 SoCs, interacting with this
> >>> integration silicon is required to power on the GPU.
> >>>
> >>> This glue silicon is in the form of an embedded microcontroller running
> >>> special-purpose firmware, which autonomously adjusts clocks and
> >>> regulators.
> >>>
> >>> Implement a driver, modelled as a pmdomain driver with a
> >>> set_performance_state operation, to support these SoCs.
> >> I like this model a lot. Thanks!
> >>
> >> panthor might potentially need to interact with this driver beyond
> >> what pmdomain provides. I am thinking about querying
> >> GF_REG_SHADER_PRESENT. Not sure if we've heard back from the vendor.
> > 
> > We did. The vendor confirmed this value is read by the EB firmware
> > from an efuse, but considers the efuse address to be confidential.
> > Consequently, we are not allowed to know the efuse address, or any
> > of the other information required to read the efuse ourselves
> > directly, such as what clocks and power domains it depends on.
> > 
> > We therefore likely need to pass GF_REG_SHADER_PRESENT onward, but
> > I do have an idea for that: struct generic_pm_domain has a member
> > "cpumask_var_t cpus", which is there to communicate a mask of which
> > CPUs are attached to a power domain if the power domain has the flag
> > GENPD_FLAG_CPU_DOMAIN set. If the flag isn't set, the member is
> > unused.
> 
> cpumask_var_t is not going to be the right type for anything else that is
> not a cpumask, as that is limited by NR_CPUS.

Hmmm, good point, I thought that would be done by the allocation
but nope.
 
> You'd have to declare a new bitmap, suitable for generic devices, which may
> get a little complicated on deciding how many bits would be enough... and
> if we look at GPUs... AMD and nV have lots of cores, so that becomes a bit
> unfeasible to put in a bitmap.
> 
> Not sure then how generic that would be.

Yeah, at this point I'm rapidly approaching "shove stuff into pmdomain
for no obvious pmdomain reason" territory, because we're not really
communicating that this pmdomain is only tied to these cores, but
rather that only these cores are present. Subtle difference that
could come bite us in the rear once some other chip has several power
domains that tie to different GPU shader cores.

> 
> > 
> > This means we could overload its meaning, e.g. with a new flag, to
> > communicate such masks for other purposes, since it's already the
> > right type and all. This would be quite a generic way for hardware
> > other than cpus to communicate such core masks. I was planning to
> > develop and send out an RFC series for this, to gauge how much Ulf
> > Hansson hates that approach.
> > 
> > A different solution could be that mtk-mfg-pmdomain could act as an
> > nvmem provider, and then we integrate generic "shader_present is
> > stored in nvmem" support in panthor, and adjust the DT binding for
> > this.
> > 
> > This approach would again be generic across vendors from panthor's
> > perspective. It would, however, leak into DT the fact that we have
> > to implement this in the gpufreq device, rather than having the
> > efuse read directly.
> > 
> >> Have you considered moving this to drivers/soc/mediatek such that we
> >> can provide include/linux/mtk-mfg.h to panthor?
> > 
> > Having panthor read data structures from mtk-mfg-pmdomain would be a
> > last resort for me if none of the other approaches work out, as I'm
> > not super keen on adding vendor-specific code paths to panthor
> > itself. A new generic code path in panthor that is only used by one
> > vendor for now is different in that it has the potential to be used
> > by a different vendor's integration logic in the future as well.
> > 
> > So for now I'd like to keep it out of public includes and panthor as
> > much as possible, unless the two other approaches don't work out for
> > us.
> > 
> 
> I don't really like seeing more and more vendor specific APIs: MediaTek does
> suffer quite a lot from that, with cmdq being one of the examples - and the
> fact that it's not just MediaTek having those, but also others like Qualcomm,
> Rockchip, etc, is not an excuse to keep adding new ones when there are other
> alternatives.
> 
> Also another fact there is that I don't think that panthor should get any
> vendor specific "things" added (I mean, that should be avoided as much as
> possible).

The big issue to me is that vendors will always prefer to shoehorn
more vendor specific hacks into panthor, because the alternative is
to tell us how the hardware actually works. Which they all hate
doing. I definitely agree that we should work from the assumption
that panthor can support a Mali implementation without adding too
much special code for it, because in 10 years there will still be
new devices that use panthor as a driver, but few people will still
be testing MT8196 codepaths within panthor, which makes refactoring
prone to subtle breakage.

Not to mention that we don't want to rewrite all the vendor specific
code for Tyr.

> That said - what you will be trying to pass is really a value that is read
> from eFuse, with the EB firmware being a wrapper over that: if we want, we
> could see that yet-another-way of interfacing ourselves with reading nvmem
> where, instead of a direct MMIO read, we're asking a firmware to give us a
> readout.
> 
> This leads me to think that one of the possible options could be to actually
> register (perhaps as a new platform device, because I'm not sure that it could
> be feasible to register a pmdomain driver as a nvmem provider, but ultimately
> that is Ulf and Srinivas' call I guess) a nvmem driver that makes an IPI call
> to GPUEB and gives back the value to panthor through generic bindings.

Lee Jones will probably tell me to use MFD instead and that I'm silly
for not using MFD, so we might as well. Should I do that for v7 or
should v7 be less disruptive? Also, would I fillet out the clock
provider stuff into an MFD cell as well, or is that too much?

Also, nb: there is no IPI call for getting the SHADER_PRESENT value
that we know of. It's a location in the reserved shared memory
populated by the EB during the shared mem init, which ideally isn't
done multiple times by multiple drivers because that's dumb.

On the other hand, I don't really know what we get out of splitting
this up into several drivers, other than a more pleasing directory
structure and get_maintainers picking up the right subsystem people.

> >>>
> >>> The driver also exposes the actual achieved clock rate, as read back
> >>> from the MCU, as common clock framework clocks, by acting as a clock
> >>> provider as well.
> >>>
> >>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> >>> ---
> >>>   drivers/pmdomain/mediatek/Kconfig            |   16 +
> >>>   drivers/pmdomain/mediatek/Makefile           |    1 +
> >>>   drivers/pmdomain/mediatek/mtk-mfg-pmdomain.c | 1027 ++++++++++++++++++++++++++
> >>>   3 files changed, 1044 insertions(+)
> >> [...]
> >>> +static int mtk_mfg_init_shared_mem(struct mtk_mfg *mfg)
> >>> +{
> >>> +       struct device *dev = &mfg->pdev->dev;
> >>> +       struct mtk_mfg_ipi_msg msg = {};
> >>> +       int ret;
> >>> +
> >>> +       dev_dbg(dev, "clearing GPUEB shared memory, 0x%X bytes\n", mfg->shared_mem_size);
> >>> +       memset_io(mfg->shared_mem, 0, mfg->shared_mem_size);
> >>> +
> >>> +       msg.cmd = CMD_INIT_SHARED_MEM;
> >>> +       msg.u.shared_mem.base = mfg->shared_mem_phys;
> >>> +       msg.u.shared_mem.size = mfg->shared_mem_size;
> >>> +
> >>> +       ret = mtk_mfg_send_ipi(mfg, &msg);
> >>> +       if (ret)
> >>> +               return ret;
> >>> +
> >>> +       if (readl(mfg->shared_mem) != GPUEB_MEM_MAGIC) {
> >> Add the offset GF_REG_MAGIC, even though it is 0.
> > 
> > Good catch, will do!
> > 
> >>
> >>> +               dev_err(dev, "EB did not initialise shared memory correctly\n");
> >>> +               return -EIO;
> >>> +       }
> >>> +
> >>> +       return 0;
> >>> +}
> >> [...]
> >>> +static int mtk_mfg_mt8196_init(struct mtk_mfg *mfg)
> >>> +{
> >>> +       void __iomem *e2_base;
> >>> +
> >>> +       e2_base = devm_platform_ioremap_resource_byname(mfg->pdev, "hw-revision");
> >>> +       if (IS_ERR(e2_base))
> >>> +               return dev_err_probe(&mfg->pdev->dev, PTR_ERR(e2_base),
> >>> +                                    "Couldn't get hw-revision register\n");
> >>> +
> >>> +       if (readl(e2_base) == MFG_MT8196_E2_ID)
> >>> +               mfg->ghpm_en_reg = RPC_DUMMY_REG_2;
> >>> +       else
> >>> +               mfg->ghpm_en_reg = RPC_GHPM_CFG0_CON;
> >>> +
> >>> +       return 0;
> >>> +};
> >> Extraneous semicolon.
> > 
> > Good catch, will fix!
> > 
> >>
> >>> +static int mtk_mfg_init_mbox(struct mtk_mfg *mfg)
> >>> +{
> >>> +       struct device *dev = &mfg->pdev->dev;
> >>> +       struct mtk_mfg_mbox *gf;
> >>> +       struct mtk_mfg_mbox *slp;
> >>> +
> >>> +       gf = devm_kzalloc(dev, sizeof(*gf), GFP_KERNEL);
> >>> +       if (!gf)
> >>> +               return -ENOMEM;
> >>> +
> >>> +       gf->rx_data = devm_kzalloc(dev, GPUEB_MBOX_MAX_RX_SIZE, GFP_KERNEL);
> >> It looks like gfx->rx_data can simply be "struct mtk_mfg_ipi_msg rx_data;".
> > 
> > Hmmm, good point. I'll change it to that.
> > 
> 
> Honestly, I prefer the current version. No strong opinions though.

And I just realised you're sorta right in that; struct mtk_mfg_mbox is
a type used by both the gpufreq mbox and the sleep mbox. The struct
mtk_mfg_ipi_msg type is only the right type to use for the gpufreq
mbox. By making rx_data a `struct mtk_mfg_ipi_msg` type, we're
allocating it for both channels, and in the case of the sleep mailbox,
it's the wrong type to boot (though not like sleep replies).

So yeah I think I'll keep the current construct. If this driver grows
another limb in the future that talks to yet another mailbox channel,
we'll appreciate not having to untangle that.

> [...]

Kind regards,
Nicolas Frattaroli