[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3595790.aeNJFYEL58@workhorse>
Date: Tue, 16 Sep 2025 11:11:07 +0200
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Chia-I Wu <olvaffe@...il.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
Boris Brezillon <boris.brezillon@...labora.com>,
Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
Conor Dooley <conor+dt@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
MyungJoo Ham <myungjoo.ham@...sung.com>,
Kyungmin Park <kyungmin.park@...sung.com>,
Chanwoo Choi <cw00.choi@...sung.com>, Jassi Brar <jassisinghbrar@...il.com>,
Kees Cook <kees@...nel.org>, "Gustavo A. R. Silva" <gustavoars@...nel.org>,
Chen-Yu Tsai <wenst@...omium.org>, kernel@...labora.com,
dri-devel@...ts.freedesktop.org, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org, linux-pm@...r.kernel.org,
linux-hardening@...r.kernel.org
Subject:
Re: [PATCH v2 09/10] drm/panthor: devfreq: add pluggable devfreq providers
On Tuesday, 16 September 2025 08:17:18 Central European Summer Time Chia-I Wu wrote:
> On Mon, Sep 15, 2025 at 6:34 AM Nicolas Frattaroli
> <nicolas.frattaroli@...labora.com> wrote:
> >
> > On Saturday, 13 September 2025 00:53:50 Central European Summer Time Chia-I Wu wrote:
> > > On Fri, Sep 12, 2025 at 11:38 AM Nicolas Frattaroli
> > > <nicolas.frattaroli@...labora.com> wrote:
> > > <snipped>
> > > > diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.h b/drivers/gpu/drm/panthor/panthor_devfreq.h
> > > > index a891cb5fdc34636444f141e10f5d45828fc35b51..94c9768d5d038c4ba8516929edb565a1f13443fb 100644
> > > > --- a/drivers/gpu/drm/panthor/panthor_devfreq.h
> > > > +++ b/drivers/gpu/drm/panthor/panthor_devfreq.h
> > > > @@ -8,6 +8,7 @@
> > > >
> > > > struct devfreq;
> > > > struct thermal_cooling_device;
> > > > +struct platform_device;
> > > >
> > > > struct panthor_device;
> > > >
> > > > @@ -43,6 +44,19 @@ struct panthor_devfreq {
> > > > spinlock_t lock;
> > > > };
> > > >
> > > > +struct panthor_devfreq_provider {
> > > > + /** @dev: device pointer to the provider device */
> > > > + struct device *dev;
> > > > + /**
> > > > + * @init: the provider's init callback that allocates a
> > > > + * &struct panthor_devfreq, adds it to panthor, and adds a devfreq
> > > > + * device to panthor. Will be called during panthor's probe.
> > > > + */
> > > > + int (*init)(struct panthor_device *ptdev, struct device *dev);
> > > > +
> > > > + struct list_head node;
> > > > +};
> > > On mt8196, we have performance-domains to replace several other
> > > properties: clocks, *-supply, power-domains, operating-points-v2.
> > > There are also quirks, such as GPU_SHADER_PRESENT should be masked by
> > > GF_REG_SHADER_PRESENT. It feels like that the scope of
> > > panthor_devfreq_provider is more broader, and at least the naming is
> > > not right.
> >
> > True, though I'm still not entirely sure whether mtk_mfg needs to do
> > the GF_REG_SHADER_PRESENT thing. It's entirely possible this is just
> > an efuse value the GPUEB reads and then puts in SRAM for us, and we
> > could simply read this efuse cell ourselves. Among a list of questions
> > about the hardware we're sending to MediaTek, whether this is an efuse
> > cell and where it is placed is one of them.
> >
> > If it turns out to be the case that we can simply read an efuse in
> > panthor in the other mt8196 integration code, then we can keep
> > mtk_mfg basically entirely focused on the devfreq-y part. I'd really
> > prefer this solution.
> >
> > However, assuming we can't go down this path either because this is
> > not how the hardware works, or because MediaTek never replies, or
> > because someone doesn't like reading efuses in panthor, I think
> > generalising "devfreq_provider" to "performance_controller" or
> > something like that would be a good move.
> Yeah, let's see what MTK has to say on shader core mask.
>
> Another thing is that panthor still requires a "core" clk. Is it also
> required on mt8196?
Nope, I'm planning on getting rid of it in v3.
> >
> > In a way, the fused off core mask is part of the vague domain of
> > "performance", and it'll also allow us to extend it with other
> > things relevant to performance control in different vendor integration
> > logic designs. I'm thinking memory bandwidth control and job scheduling
> > preferences. E.g. if the interconnect tells us one core is spending a
> > lot of time waiting on the interconnect, maybe because a different
> > piece of the SoC that's active shares the same path on the
> > interconnect, we could then communicate a scheduling preference for
> > the other cores that have bandwidth headroom even if they are busier
> > in compute. Maybe this doesn't make sense though because interconnect
> > designs are fully switched these days or panthor's scheduler will
> > already figure this out from job completion times.
> >
> > If any other SoC vendor or people working on hardware of those vendors
> > want to chime in and say whether they've got any other uses for
> > communicating more than just devfreq from glue logic to panthor, then
> > this would be a great time to do it, so that we can get this interface
> > right from the beginning.
> >
> > > Another issue is I am not sure if we need to expose panthor_device
> > > internals to the provider. mtk_mfg accesses very few fields of
> > > panthor_device. It seems we can make the two less coupled.
> > >
> > > I might change my view as mtk_mfg evolves and requires tigher
> > > integration with panthor. But as is, I might prefer for mtk_mfg to
> > > live under drivers/soc/mediatek and provide a header for panthor to
> > > use in soc-specific path.
> >
> > I'm not very confident it's possible to cleanly decouple them without
> > inventing a bunch of very panthor-and-mfg specific interfaces that
> > masquerade as general solutions in the process. It'd also mean I'd
> > have to duplicate all of `panthor_devfreq_get_dev_status` instead of
> > just being able to reuse it, unless that is also exposed in said
> > header file, which would need a better justification than "well there
> > is one other user of it and we're trying to couple it more loosely".
> >
> > I know that it's almost independent, but unfortunately, even a tiny
> > dependency between the two will mean that mediatek_mfg will need to
> > know about panthor.
> >
> > Other things needed from panthor are the pdevfreq->gov_data, and
> > the panthor struct device* itself, as well as stuff like "fast_rate"
> > in the panthor_device struct.
> >
> > In the future, we may want to expand this driver with governors
> > beyond SIMPLE_ONDEMAND, based on the job completion duration targets
> > we can communicate to the GPUEB. That may either make the driver
> > more tightly coupled or more loosely coupled, I don't really know
> > yet.
> >
> > One advantage of looking to completely decouple them (though again,
> > I doubt that's possible at the moment without questionable refactors)
> > could be that we could also support panfrost devices that need this.
> There is also tyr, although I don't follow its status.
>
> I can see the concern over "very panthor-and-mfg specific interfaces
> that masquerade as general solutions" or "questionable refactors". But
> I also don't like, for example, how mtk_mfg_init_devfreq inits
> panthor_devfreq manually. Beyond initialization, the remaining
> coupling comes from that we need panthor to provide get_dev_status
> callback for devfreq_dev_profile, and we need mtk_mfg to provide
> target and get_cur_freq callbacks. That seems like something solvable
> too.
Yeah I agree, I think the panthor_devfreq initialisation should happen
within panthor_devfreq. It's independent of registering the actual
devfreq device.
I'll note that down as something I will refactor. Once that's done,
I'll have a clearer picture of whether moving the driver out of
panthor is feasible.
>
> I really appreciate the work and I don't want to block it by vague
> concerns. If others have no preference, we should start with what we
> have.
Thanks! Don't worry, this is in no rush to be merged, since mainline
still doesn't have everything to boot on this platform, so we're not
in a hurry to get GPU enablement done I don't think. The important
part is getting this right in a way where panthor doesn't carry my
technical debt for years to come, so I'm grateful for the reviews.
Kind regards,
Nicolas Frattaroli
> >
> > >
> > >
> > > > +
> > > >
> > > > int panthor_devfreq_init(struct panthor_device *ptdev);
> > > >
> > > > @@ -57,4 +71,6 @@ int panthor_devfreq_get_dev_status(struct device *dev,
> > > >
> > > > unsigned long panthor_devfreq_get_freq(struct panthor_device *ptdev);
> > > >
> > > > +int panthor_devfreq_register_provider(struct panthor_devfreq_provider *prov);
> > > > +
> > > > #endif /* __PANTHOR_DEVFREQ_H__ */
> > > >
> > > > --
> > > > 2.51.0
> > > >
> > >
> >
> >
> >
> >
>
Powered by blists - more mailing lists