[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF6AEGtbw06-gOSvX9gAbi=SA801gmD3_8c5xkOU-G9g2qKptQ@mail.gmail.com>
Date: Thu, 4 Jul 2024 08:58:25 -0700
From: Rob Clark <robdclark@...il.com>
To: Bibek Kumar Patro <quic_bibekkum@...cinc.com>
Cc: will@...nel.org, robin.murphy@....com, joro@...tes.org, jgg@...pe.ca,
jsnitsel@...hat.com, robh@...nel.org, krzysztof.kozlowski@...aro.org,
quic_c_gdjako@...cinc.com, dmitry.baryshkov@...aro.org,
konrad.dybcio@...aro.org, iommu@...ts.linux.dev,
linux-arm-msm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, Rob Clark <robdclark@...omium.org>
Subject: Re: [PATCH v13 6/6] iommu/arm-smmu: add support for PRR bit setup
On Thu, Jul 4, 2024 at 7:46 AM Rob Clark <robdclark@...il.com> wrote:
>
> On Wed, Jul 3, 2024 at 4:38 AM Bibek Kumar Patro
> <quic_bibekkum@...cinc.com> wrote:
> >
> >
> >
> > On 7/2/2024 2:01 AM, Rob Clark wrote:
> > > On Mon, Jul 1, 2024 at 4:01 AM Bibek Kumar Patro
> > > <quic_bibekkum@...cinc.com> wrote:
> > >>
> > >>
> > >>
> > >> On 6/28/2024 9:14 PM, Rob Clark wrote:
> > >>> On Fri, Jun 28, 2024 at 8:10 AM Bibek Kumar Patro
> > >>> <quic_bibekkum@...cinc.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 6/28/2024 7:47 PM, Rob Clark wrote:
> > >>>>> On Fri, Jun 28, 2024 at 7:05 AM Bibek Kumar Patro
> > >>>>> <quic_bibekkum@...cinc.com> wrote:
> > >>>>>>
> > >>>>>> Add an adreno-smmu-priv interface for drm/msm to call
> > >>>>>> into arm-smmu-qcom and initiate the PRR bit setup or reset
> > >>>>>> sequence as per request.
> > >>>>>>
> > >>>>>> This will be used by GPU to setup the PRR bit and related
> > >>>>>> configuration registers through adreno-smmu private
> > >>>>>> interface instead of directly poking the smmu hardware.
> > >>>>>>
> > >>>>>> Suggested-by: Rob Clark <robdclark@...il.com>
> > >>>>>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@...cinc.com>
> > >>>>>> ---
> > >>>>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 23 ++++++++++++++++++++++
> > >>>>>> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++
> > >>>>>> include/linux/adreno-smmu-priv.h | 6 +++++-
> > >>>>>> 3 files changed, 30 insertions(+), 1 deletion(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >>>>>> index bd101a161d04..64571a1c47b8 100644
> > >>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> > >>>>>> @@ -28,6 +28,7 @@
> > >>>>>> #define PREFETCH_SHALLOW (1 << PREFETCH_SHIFT)
> > >>>>>> #define PREFETCH_MODERATE (2 << PREFETCH_SHIFT)
> > >>>>>> #define PREFETCH_DEEP (3 << PREFETCH_SHIFT)
> > >>>>>> +#define GFX_ACTLR_PRR (1 << 5)
> > >>>>>>
> > >>>>>> static const struct actlr_config sc7280_apps_actlr_cfg[] = {
> > >>>>>> { 0x0800, 0x04e0, PREFETCH_DEFAULT | CMTLB },
> > >>>>>> @@ -235,6 +236,27 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina
> > >>>>>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
> > >>>>>> }
> > >>>>>>
> > >>>>>> +static void qcom_adreno_smmu_set_prr(const void *cookie, phys_addr_t page_addr, bool set)
> > >>>>>> +{
> > >>>>>> + struct arm_smmu_domain *smmu_domain = (void *)cookie;
> > >>>>>> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > >>>>>> + struct arm_smmu_device *smmu = smmu_domain->smmu;
> > >>>>>> + u32 reg = 0;
> > >>>>>> +
> > >>>>>> + writel_relaxed(lower_32_bits(page_addr),
> > >>>>>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> > >>>>>> +
> > >>>>>> + writel_relaxed(upper_32_bits(page_addr),
> > >>>>>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> > >>>>>> +
> > >>>>>> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR);
> > >>>>>> + reg &= ~GFX_ACTLR_PRR;
> > >>>>>> + if (set)
> > >>>>>> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1);
> > >>>>>> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> > >>>>>> +
> > >>>>>
> > >>>>> nit, extra line
> > >>>>>
> > >>>>
> > >>>> Ack, will remove this. Thanks for pointing out.
> > >>>>
> > >>>>> Also, if you passed a `struct page *` instead, then you could drop the
> > >>>>> bool param, ie. passing NULL for the page would disable PRR. But I
> > >>>>> can go either way if others have a strong preference for phys_addr_t.
> > >>>>>
> > >>>>
> > >>>> Oh okay, this looks simple to reset the prr bit.
> > >>>> But since this page is allocated and is used inside gfx driver
> > >>>> before being utilized for prr bit operation, would it be safe for
> > >>>> drm/gfx driver to keep a reference to this page in smmu driver?
> > >>>>
> > >>>> Since we only need the page address for configuring the
> > >>>> CFG_UADDR/CFG_LADDR registers so passed the phys_addr_t.
> > >>>
> > >>> I don't think the smmu driver needs to keep a reference to the page..
> > >>> we can just say it is the responsibility of the drm driver to call
> > >>> set_prr(NULL) before freeing the page
> > >>>
> > >>
> > >> That makes sense. If we go by this NULL page method to disable the PRR,
> > >> we would have to set the address registers to reset value as well.
> > >>
> > >> The sequence would be like the following as per my understaning:
> > >> - Check if it's NULL page
> > >> - Set the PRR_CFG_UADDR/PRR_CFG_LADDR to reset values i.e - 0x0 for
> > >> these registers
> > >> - Reset the PRR bit in actlr register
> > >>
> > >> Similar to this snippet:
> > >>
> > >> #PRR_RESET_ADDR 0x0
> > >>
> > >> --------------
> > >> reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR);
> > >> reg &= ~GFX_ACTLR_PRR;
> > >> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> > >>
> > >> if (!prr_page) {
> > >> writel_relaxed(PRR_RESET_ADDR,
> > >> smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> > >> writel_relaxed(PRR_RESET_ADDR),
> > >> smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> > >> return;
> > >> }
> > >>
> > >>
> > >> writel_relaxed(lower_32_bits(page_to_phys(prr_page)),
> > >> smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR);
> > >>
> > >> writel_relaxed(upper_32_bits(page_to_phys(prr_page)),
> > >> smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR);
> > >>
> > >> reg |= FIELD_PREP(GFX_ACTLR_PRR, 1);
> > >> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg);
> > >> -----------------
> > >>
> > >> If looks good, will implement the same in next version.
> > >
> > > yeah, that looks like it could work..
> > >
> > > you probably don't need to zero out the PRR_CFG_*ADDR when disabling,
> > > and probably could avoid double writing ACTLR, but that is getting
> > > into bikeshedding
> > >
> >
> > Actually Rob, since you rightly pointed this out.
> > I crosschecked again on these registers.
> > PRR_CFG_*ADDR is a global register in SMMU space but
> > ACTLR register including PRR bit is a per-domain register.
> > There might also be a situation where PRR feature need to be
> > disabled or enabled separately for each domain.
> > So I think it would be cleaner to have two apis, set_prr_addr(),
> > set_prr_bit().
> > set_prr_addr() will be used only to set this PRR_CFG_*ADDR
> > register by passing a 'struct page *'
> > set_prr_bit() will be used as a switch for PRR feature,
> > where required smmu_domain will be passed along with
> > the bool value to set/reset the PRR bit depending on which this
> > feature will be enabled/disabled for the selected domain.
>
> on a related note, adreno has been using arm-smmu for a number of
> generations, I guess not all support PRR? The drm driver will need to
> know whether PRR is supported (and expose that to userspace to let the
> UMD know whether to expose certain extensions). How should this work?
So, I noticed in the x1e80100.dtsi that there is a gpu_prr_mem
reserved section.. maybe we should be connecting this to the smmu
driver in dt, and using that to detect presence of PRR? Ie. the smmu
driver would configure PRR_CFG_*ADDR based on the reserved mem, and
the interface to drm would just be to enable/disable PRR, returning an
error code if the reserved mem section isn't there.
This simplifies the interface, and handles the question of how to
detect if PRR is supported.
BR,
-R
> BR,
> -R
>
> > Thanks & regards,
> > Bibek
> >
> > > BR,
> > > -R
> > >
> > >>
> > >> Thanks & regards,
> > >> Bibek
> > >>
> > >>> BR,
> > >>> -R
> > >>>
> > >>>>> Otherwise, lgtm
> > >>>>>
> > >>>>> BR,
> > >>>>> -R
> > >>>>>
> > >>>>
> > >>>> Thanks & regards,
> > >>>> Bibek
> > >>>>
> > >>>>>> +}
> > >>>>>> +
> > >>>>>> #define QCOM_ADRENO_SMMU_GPU_SID 0
> > >>>>>>
> > >>>>>> static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
> > >>>>>> @@ -407,6 +429,7 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> > >>>>>> priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
> > >>>>>> priv->set_stall = qcom_adreno_smmu_set_stall;
> > >>>>>> priv->resume_translation = qcom_adreno_smmu_resume_translation;
> > >>>>>> + priv->set_prr = qcom_adreno_smmu_set_prr;
> > >>>>>>
> > >>>>>> actlrvar = qsmmu->data->actlrvar;
> > >>>>>> if (!actlrvar)
> > >>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > >>>>>> index d9c2ef8c1653..3076bef49e20 100644
> > >>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > >>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > >>>>>> @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type {
> > >>>>>> #define ARM_SMMU_SCTLR_M BIT(0)
> > >>>>>>
> > >>>>>> #define ARM_SMMU_CB_ACTLR 0x4
> > >>>>>> +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008
> > >>>>>> +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C
> > >>>>>>
> > >>>>>> #define ARM_SMMU_CB_RESUME 0x8
> > >>>>>> #define ARM_SMMU_RESUME_TERMINATE BIT(0)
> > >>>>>> diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h
> > >>>>>> index c637e0997f6d..d6e2ca9f8d8c 100644
> > >>>>>> --- a/include/linux/adreno-smmu-priv.h
> > >>>>>> +++ b/include/linux/adreno-smmu-priv.h
> > >>>>>> @@ -49,7 +49,10 @@ struct adreno_smmu_fault_info {
> > >>>>>> * before set_ttbr0_cfg(). If stalling on fault is enabled,
> > >>>>>> * the GPU driver must call resume_translation()
> > >>>>>> * @resume_translation: Resume translation after a fault
> > >>>>>> - *
> > >>>>>> + * @set_prr: Extendible interface to be used by GPU to modify the
> > >>>>>> + * ACTLR register bits, currently used to configure
> > >>>>>> + * Partially-Resident-Region (PRR) feature's
> > >>>>>> + * setup and reset sequence as requested.
> > >>>>>> *
> > >>>>>> * The GPU driver (drm/msm) and adreno-smmu work together for controlling
> > >>>>>> * the GPU's SMMU instance. This is by necessity, as the GPU is directly
> > >>>>>> @@ -67,6 +70,7 @@ struct adreno_smmu_priv {
> > >>>>>> void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info);
> > >>>>>> void (*set_stall)(const void *cookie, bool enabled);
> > >>>>>> void (*resume_translation)(const void *cookie, bool terminate);
> > >>>>>> + void (*set_prr)(const void *cookie, phys_addr_t page_addr, bool set);
> > >>>>>> };
> > >>>>>>
> > >>>>>> #endif /* __ADRENO_SMMU_PRIV_H */
> > >>>>>> --
> > >>>>>> 2.34.1
> > >>>>>>
Powered by blists - more mailing lists