linux-kernel - Re: [PATCH v6 7/7] pmdomain: mediatek: Add support for MFlexGraphics

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPaKu7RGPS8zoSMrNYm7-ZPivDt8UAwjJ-2YB4tdKRdYSd_amw@mail.gmail.com>
Date: Mon, 6 Oct 2025 14:04:47 -0700
From: Chia-I Wu <olvaffe@...il.com>
To: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
Cc: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>, 
	Boris Brezillon <boris.brezillon@...labora.com>, Jassi Brar <jassisinghbrar@...il.com>, 
	Chen-Yu Tsai <wenst@...omium.org>, Steven Price <steven.price@....com>, 
	Liviu Dudau <liviu.dudau@....com>, Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, 
	Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, 
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, Rob Herring <robh@...nel.org>, 
	Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley <conor+dt@...nel.org>, 
	Matthias Brugger <matthias.bgg@...il.com>, Kees Cook <kees@...nel.org>, 
	"Gustavo A. R. Silva" <gustavoars@...nel.org>, Ulf Hansson <ulf.hansson@...aro.org>, kernel@...labora.com, 
	dri-devel@...ts.freedesktop.org, devicetree@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-mediatek@...ts.infradead.org, linux-hardening@...r.kernel.org, 
	linux-pm@...r.kernel.org
Subject: Re: [PATCH v6 7/7] pmdomain: mediatek: Add support for MFlexGraphics

On Mon, Oct 6, 2025 at 7:28 AM AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com> wrote:
>
> Il 06/10/25 14:16, Nicolas Frattaroli ha scritto:
> > On Monday, 6 October 2025 13:37:28 Central European Summer Time AngeloGioacchino Del Regno wrote:
> >> Il 06/10/25 12:58, Nicolas Frattaroli ha scritto:
> >>> On Friday, 3 October 2025 23:41:16 Central European Summer Time Chia-I Wu wrote:
> >>>> On Fri, Oct 3, 2025 at 1:16 PM Nicolas Frattaroli
> >>>> <nicolas.frattaroli@...labora.com> wrote:
> >>>>>
> >>>>> Various MediaTek SoCs use GPU integration silicon named "MFlexGraphics"
> >>>>> by MediaTek. On the MT8196 and MT6991 SoCs, interacting with this
> >>>>> integration silicon is required to power on the GPU.
> >>>>>
> >>>>> This glue silicon is in the form of an embedded microcontroller running
> >>>>> special-purpose firmware, which autonomously adjusts clocks and
> >>>>> regulators.
> >>>>>
> >>>>> Implement a driver, modelled as a pmdomain driver with a
> >>>>> set_performance_state operation, to support these SoCs.
> >>>> I like this model a lot. Thanks!
> >>>>
> >>>> panthor might potentially need to interact with this driver beyond
> >>>> what pmdomain provides. I am thinking about querying
> >>>> GF_REG_SHADER_PRESENT. Not sure if we've heard back from the vendor.
> >>>
> >>> We did. The vendor confirmed this value is read by the EB firmware
> >>> from an efuse, but considers the efuse address to be confidential.
> >>> Consequently, we are not allowed to know the efuse address, or any
> >>> of the other information required to read the efuse ourselves
> >>> directly, such as what clocks and power domains it depends on.
> >>>
> >>> We therefore likely need to pass GF_REG_SHADER_PRESENT onward, but
> >>> I do have an idea for that: struct generic_pm_domain has a member
> >>> "cpumask_var_t cpus", which is there to communicate a mask of which
> >>> CPUs are attached to a power domain if the power domain has the flag
> >>> GENPD_FLAG_CPU_DOMAIN set. If the flag isn't set, the member is
> >>> unused.
> >>
> >> cpumask_var_t is not going to be the right type for anything else that is
> >> not a cpumask, as that is limited by NR_CPUS.
> >
> > Hmmm, good point, I thought that would be done by the allocation
> > but nope.
> >
> >> You'd have to declare a new bitmap, suitable for generic devices, which may
> >> get a little complicated on deciding how many bits would be enough... and
> >> if we look at GPUs... AMD and nV have lots of cores, so that becomes a bit
> >> unfeasible to put in a bitmap.
> >>
> >> Not sure then how generic that would be.
> >
> > Yeah, at this point I'm rapidly approaching "shove stuff into pmdomain
> > for no obvious pmdomain reason" territory, because we're not really
> > communicating that this pmdomain is only tied to these cores, but
> > rather that only these cores are present. Subtle difference that
> > could come bite us in the rear once some other chip has several power
> > domains that tie to different GPU shader cores.
> >
>
> I think that the only thing that we might see at some point in the future is one
> power domain per "set of shader cores", but not even sure that's really going to
> ever be a thing, as it might just not be worth implementing from a firmware
> perspective.
>
> I am guessing here - we won't ever see one power domain per core.
>
> Besides, also remember that many GPUs do have internal power management (as in,
> per-core or per-core-set shutdown) so there already is such a power saving way.
> That makes a vendor-specific implementation of that way less likely to see, even
> though.. being cautious, never say never.
>
> In any case, we can't predict the future, we can only guess - and evaluate things
> that could or could not realistically make sense.
>
> (anyway if you find a magic ball, please share, I need it for some other stuff :P)
>
> >>
> >>>
> >>> This means we could overload its meaning, e.g. with a new flag, to
> >>> communicate such masks for other purposes, since it's already the
> >>> right type and all. This would be quite a generic way for hardware
> >>> other than cpus to communicate such core masks. I was planning to
> >>> develop and send out an RFC series for this, to gauge how much Ulf
> >>> Hansson hates that approach.
> >>>
> >>> A different solution could be that mtk-mfg-pmdomain could act as an
> >>> nvmem provider, and then we integrate generic "shader_present is
> >>> stored in nvmem" support in panthor, and adjust the DT binding for
> >>> this.
> >>>
> >>> This approach would again be generic across vendors from panthor's
> >>> perspective. It would, however, leak into DT the fact that we have
> >>> to implement this in the gpufreq device, rather than having the
> >>> efuse read directly.
> >>>
> >>>> Have you considered moving this to drivers/soc/mediatek such that we
> >>>> can provide include/linux/mtk-mfg.h to panthor?
> >>>
> >>> Having panthor read data structures from mtk-mfg-pmdomain would be a
> >>> last resort for me if none of the other approaches work out, as I'm
> >>> not super keen on adding vendor-specific code paths to panthor
> >>> itself. A new generic code path in panthor that is only used by one
> >>> vendor for now is different in that it has the potential to be used
> >>> by a different vendor's integration logic in the future as well.
> >>>
> >>> So for now I'd like to keep it out of public includes and panthor as
> >>> much as possible, unless the two other approaches don't work out for
> >>> us.
> >>>
> >>
> >> I don't really like seeing more and more vendor specific APIs: MediaTek does
> >> suffer quite a lot from that, with cmdq being one of the examples - and the
> >> fact that it's not just MediaTek having those, but also others like Qualcomm,
> >> Rockchip, etc, is not an excuse to keep adding new ones when there are other
> >> alternatives.
> >>
> >> Also another fact there is that I don't think that panthor should get any
> >> vendor specific "things" added (I mean, that should be avoided as much as
> >> possible).
> >
> > The big issue to me is that vendors will always prefer to shoehorn
> > more vendor specific hacks into panthor, because the alternative is
> > to tell us how the hardware actually works. Which they all hate
> > doing.
>
> That's a bit too much pessimistic... I hope.
>
> > I definitely agree that we should work from the assumption
> > that panthor can support a Mali implementation without adding too
> > much special code for it, because in 10 years there will still be
> > new devices that use panthor as a driver, but few people will still
> > be testing MT8196 codepaths within panthor, which makes refactoring
> > prone to subtle breakage.
>
> I had no doubt that you were thinking alike, but happy to see that confirmed.
>
> >
> > Not to mention that we don't want to rewrite all the vendor specific
> > code for Tyr.
> >
> >> That said - what you will be trying to pass is really a value that is read
> >> from eFuse, with the EB firmware being a wrapper over that: if we want, we
> >> could see that yet-another-way of interfacing ourselves with reading nvmem
> >> where, instead of a direct MMIO read, we're asking a firmware to give us a
> >> readout.
> >>
> >> This leads me to think that one of the possible options could be to actually
> >> register (perhaps as a new platform device, because I'm not sure that it could
> >> be feasible to register a pmdomain driver as a nvmem provider, but ultimately
> >> that is Ulf and Srinivas' call I guess) a nvmem driver that makes an IPI call
> >> to GPUEB and gives back the value to panthor through generic bindings.
> >
> > Lee Jones will probably tell me to use MFD instead and that I'm silly
> > for not using MFD, so we might as well. Should I do that for v7 or
> > should v7 be less disruptive? Also, would I fillet out the clock
> > provider stuff into an MFD cell as well, or is that too much?
> >
> > Also, nb: there is no IPI call for getting the SHADER_PRESENT value
> > that we know of. It's a location in the reserved shared memory
> > populated by the EB during the shared mem init, which ideally isn't
> > done multiple times by multiple drivers because that's dumb.
> >
> > On the other hand, I don't really know what we get out of splitting
> > this up into several drivers, other than a more pleasing directory
> > structure and get_maintainers picking up the right subsystem people.
> >
>
> I'm not sure. A power controller being also a clock provider isn't entirely
> uncommon (look at i.MX8 MP), but then think about it: if you add a MFD, you
> are still introducing vendor APIs around... as you'd need a way to do your
> piece of communication with the EB.
>
> The benefit, then, is only what you just said.
>
> There are literally too many alternatives to do the very same as what you're
> doing here, including having a (firmware|soc)/mediatek-gpueb.c driver managing
> only the communication part, and the rest all in small different drivers, or...
>
> ...you could share the reserved-memory between the two drivers, and have the efuse
> driver getting a power domain from mtk-mfg-pmdomain (to check and call mfg power
> on), then reading the byte(s) that you need from GF_REG_SHADER_PRESENT from there.
>
> Not sure then what's the best option.
>
> One thing I'm sure about is that you're setting how everything works *now*, and
> changing that later is going to cause lots of pain and lots of suffering, so a
> decision must be taken right now.
If mtk-mfg registers a nvmem cell, I guess all panthor needs to do is
to handle something like:

  nvmem-cells = <&gpueb_shmem_shader_present>;
  nvmem-cell-names = "shader-present";

That sounds like a reasonable generalization from panthor's point of view.

>
> >>>>>
> >>>>> The driver also exposes the actual achieved clock rate, as read back
> >>>>> from the MCU, as common clock framework clocks, by acting as a clock
> >>>>> provider as well.
> >>>>>
> >>>>> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> >>>>> ---
> >>>>>    drivers/pmdomain/mediatek/Kconfig            |   16 +
> >>>>>    drivers/pmdomain/mediatek/Makefile           |    1 +
> >>>>>    drivers/pmdomain/mediatek/mtk-mfg-pmdomain.c | 1027 ++++++++++++++++++++++++++
> >>>>>    3 files changed, 1044 insertions(+)
> >>>> [...]
> >>>>> +static int mtk_mfg_init_shared_mem(struct mtk_mfg *mfg)
> >>>>> +{
> >>>>> +       struct device *dev = &mfg->pdev->dev;
> >>>>> +       struct mtk_mfg_ipi_msg msg = {};
> >>>>> +       int ret;
> >>>>> +
> >>>>> +       dev_dbg(dev, "clearing GPUEB shared memory, 0x%X bytes\n", mfg->shared_mem_size);
> >>>>> +       memset_io(mfg->shared_mem, 0, mfg->shared_mem_size);
> >>>>> +
> >>>>> +       msg.cmd = CMD_INIT_SHARED_MEM;
> >>>>> +       msg.u.shared_mem.base = mfg->shared_mem_phys;
> >>>>> +       msg.u.shared_mem.size = mfg->shared_mem_size;
> >>>>> +
> >>>>> +       ret = mtk_mfg_send_ipi(mfg, &msg);
> >>>>> +       if (ret)
> >>>>> +               return ret;
> >>>>> +
> >>>>> +       if (readl(mfg->shared_mem) != GPUEB_MEM_MAGIC) {
> >>>> Add the offset GF_REG_MAGIC, even though it is 0.
> >>>
> >>> Good catch, will do!
> >>>
> >>>>
> >>>>> +               dev_err(dev, "EB did not initialise shared memory correctly\n");
> >>>>> +               return -EIO;
> >>>>> +       }
> >>>>> +
> >>>>> +       return 0;
> >>>>> +}
> >>>> [...]
> >>>>> +static int mtk_mfg_mt8196_init(struct mtk_mfg *mfg)
> >>>>> +{
> >>>>> +       void __iomem *e2_base;
> >>>>> +
> >>>>> +       e2_base = devm_platform_ioremap_resource_byname(mfg->pdev, "hw-revision");
> >>>>> +       if (IS_ERR(e2_base))
> >>>>> +               return dev_err_probe(&mfg->pdev->dev, PTR_ERR(e2_base),
> >>>>> +                                    "Couldn't get hw-revision register\n");
> >>>>> +
> >>>>> +       if (readl(e2_base) == MFG_MT8196_E2_ID)
> >>>>> +               mfg->ghpm_en_reg = RPC_DUMMY_REG_2;
> >>>>> +       else
> >>>>> +               mfg->ghpm_en_reg = RPC_GHPM_CFG0_CON;
> >>>>> +
> >>>>> +       return 0;
> >>>>> +};
> >>>> Extraneous semicolon.
> >>>
> >>> Good catch, will fix!
> >>>
> >>>>
> >>>>> +static int mtk_mfg_init_mbox(struct mtk_mfg *mfg)
> >>>>> +{
> >>>>> +       struct device *dev = &mfg->pdev->dev;
> >>>>> +       struct mtk_mfg_mbox *gf;
> >>>>> +       struct mtk_mfg_mbox *slp;
> >>>>> +
> >>>>> +       gf = devm_kzalloc(dev, sizeof(*gf), GFP_KERNEL);
> >>>>> +       if (!gf)
> >>>>> +               return -ENOMEM;
> >>>>> +
> >>>>> +       gf->rx_data = devm_kzalloc(dev, GPUEB_MBOX_MAX_RX_SIZE, GFP_KERNEL);
> >>>> It looks like gfx->rx_data can simply be "struct mtk_mfg_ipi_msg rx_data;".
> >>>
> >>> Hmmm, good point. I'll change it to that.
> >>>
> >>
> >> Honestly, I prefer the current version. No strong opinions though.
> >
> > And I just realised you're sorta right in that; struct mtk_mfg_mbox is
> > a type used by both the gpufreq mbox and the sleep mbox. The struct
> > mtk_mfg_ipi_msg type is only the right type to use for the gpufreq
> > mbox. By making rx_data a `struct mtk_mfg_ipi_msg` type, we're
> > allocating it for both channels, and in the case of the sleep mailbox,
> > it's the wrong type to boot (though not like sleep replies).
> >
> > So yeah I think I'll keep the current construct. If this driver grows
> > another limb in the future that talks to yet another mailbox channel,
> > we'll appreciate not having to untangle that.
>
> ...that was the implicit reasoning around my statement, yes.
Sounds good.
>
> Cheers,
> Angelo
>
> >
> >> [...]
> >
> > Kind regards,
> > Nicolas Frattaroli
> >
> >
>
>