lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF6AEGsNSH4ZccWFN5F3o1uaFUqjpNQROON0QM2n6QnDT0Lq9A@mail.gmail.com>
Date: Wed, 5 Jun 2024 15:13:16 -0700
From: Rob Clark <robdclark@...il.com>
To: Bibek Kumar Patro <quic_bibekkum@...cinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>, Konrad Dybcio <konrad.dybcio@...aro.org>, 
	will@...nel.org, robin.murphy@....com, joro@...tes.org, jsnitsel@...hat.com, 
	quic_bjorande@...cinc.com, mani@...nel.org, quic_eberman@...cinc.com, 
	robdclark@...omium.org, u.kleine-koenig@...gutronix.de, robh@...nel.org, 
	vladimir.oltean@....com, quic_pkondeti@...cinc.com, quic_molvera@...cinc.com, 
	linux-arm-msm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 3/5] iommu/arm-smmu: introduction of ACTLR for custom
 prefetcher settings

On Wed, Jun 5, 2024 at 3:52 AM Bibek Kumar Patro
<quic_bibekkum@...cinc.com> wrote:
>
> On 6/5/2024 12:19 AM, Rob Clark wrote:
> > On Thu, May 30, 2024 at 2:22 AM Bibek Kumar Patro
> > <quic_bibekkum@...cinc.com> wrote:
> >>
> >>
> >>
> >> On 5/28/2024 9:38 PM, Rob Clark wrote:
> >>> On Tue, May 28, 2024 at 6:06 AM Dmitry Baryshkov
> >>> <dmitry.baryshkov@...aro.org> wrote:
> >>>>
> >>>> On Tue, May 28, 2024 at 02:59:51PM +0200, Konrad Dybcio wrote:
> >>>>>
> >>>>>
> >>>>> On 5/15/24 15:59, Bibek Kumar Patro wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 5/10/2024 6:32 PM, Konrad Dybcio wrote:
> >>>>>>> On 10.05.2024 2:52 PM, Bibek Kumar Patro wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 5/1/2024 12:30 AM, Rob Clark wrote:
> >>>>>>>>> On Tue, Jan 23, 2024 at 7:00 AM Bibek Kumar Patro
> >>>>>>>>> <quic_bibekkum@...cinc.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Currently in Qualcomm  SoCs the default prefetch is set to 1 which allows
> >>>>>>>>>> the TLB to fetch just the next page table. MMU-500 features ACTLR
> >>>>>>>>>> register which is implementation defined and is used for Qualcomm SoCs
> >>>>>>>>>> to have a custom prefetch setting enabling TLB to prefetch the next set
> >>>>>>>>>> of page tables accordingly allowing for faster translations.
> >>>>>>>>>>
> >>>>>>>>>> ACTLR value is unique for each SMR (Stream matching register) and stored
> >>>>>>>>>> in a pre-populated table. This value is set to the register during
> >>>>>>>>>> context bank initialisation.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@...cinc.com>
> >>>>>>>>>> ---
> >>>>>>>
> >>>>>>> [...]
> >>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>> +               for_each_cfg_sme(cfg, fwspec, j, idx) {
> >>>>>>>>>> +                       smr = &smmu->smrs[idx];
> >>>>>>>>>> +                       if (smr_is_subset(smr, id, mask)) {
> >>>>>>>>>> +                               arm_smmu_cb_write(smmu, cbndx, ARM_SMMU_CB_ACTLR,
> >>>>>>>>>> +                                               actlrcfg[i].actlr);
> >>>>>>>>>
> >>>>>>>>> So, this makes ACTLR look like kind of a FIFO.  But I'm looking at
> >>>>>>>>> downstream kgsl's PRR thing (which we'll need to implement vulkan
> >>>>>>>>> sparse residency), and it appears to be wanting to set BIT(5) in ACTLR
> >>>>>>>>> to enable PRR.
> >>>>>>>>>
> >>>>>>>>>             val = KGSL_IOMMU_GET_CTX_REG(ctx, KGSL_IOMMU_CTX_ACTLR);
> >>>>>>>>>             val |= FIELD_PREP(KGSL_IOMMU_ACTLR_PRR_ENABLE, 1);
> >>>>>>>>>             KGSL_IOMMU_SET_CTX_REG(ctx, KGSL_IOMMU_CTX_ACTLR, val);
> >>>>>>>>>
> >>>>>>>>> Any idea how this works?  And does it need to be done before or after
> >>>>>>>>> the ACTLR programming done in this patch?
> >>>>>>>>>
> >>>>>>>>> BR,
> >>>>>>>>> -R
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Hi Rob,
> >>>>>>>>
> >>>>>>>> Can you please help provide some more clarification on the FIFO part? By FIFO are you referring to the storing of ACTLR data in the table?
> >>>>>>>>
> >>>>>>>> Thanks for pointing to the downstream implementation of kgsl driver for
> >>>>>>>> the PRR bit. Since kgsl driver is already handling this PRR bit's
> >>>>>>>> setting, this makes setting the PRR BIT(5) by SMMU driver redundant.
> >>>>>>>
> >>>>>>> The kgsl driver is not present upstream.
> >>>>>>>
> >>>>>>
> >>>>>> Right kgsl is not present upstream, it would be better to avoid configuring the PRR bit and can be handled by kgsl directly in downstream.
> >>>>>
> >>>>> No! Upstream is not a dumping ground to reduce your technical debt.
> >>>>>
> >>>>> There is no kgsl driver upstream, so this ought to be handled here, in
> >>>>> the iommu driver (as poking at hardware A from driver B is usually not good
> >>>>> practice).
> >>>>
> >>>> I'd second the request here. If another driver has to control the
> >>>> behaviour of another driver, please add corresponding API for that.
> >>>
> >>> We have adreno_smmu_priv for this purpose ;-)
> >>>
> >>
> >> Thanks Rob for pointing to this private interface structure between smmu
> >> and gpu. I think it's similar to what you're trying to implement here
> >> https://lore.kernel.org/all/CAF6AEGtm-KweFdMFvahH1pWmpOq7dW_p0Xe_13aHGWt0jSbg8w@mail.gmail.com/#t
> >> I can add an api "set_actlr_prr()" with smmu_domain cookie, page pointer
> >> as two parameters. This api then can be used by drm/msm driver to carry
> >> out the prr implementation by simply calling this.
> >> Would this be okay Rob,Konrad,Dmitry?
> >> Let me know if any other suggestions you have in mind as well regarding
> >> parameters and placement.
> >
> > Hey Bibek, quick question.. is ACTLR preserved across a suspend/resume
> > cycle?  Or does it need to be reprogrammed on resume?  And same
> > question for these two PRR related regs:
> >
> >    /* Global SMMU register offsets */
> >    #define KGSL_IOMMU_PRR_CFG_LADDR        0x6008
> >    #define KGSL_IOMMU_PRR_CFG_UADDR        0x600c
> >
> > (ie. high/low 32b of the PRR page)
> >
>
> Hey Rob, In suspend/resume, the register space power rails are not in
> disabled state, so it won't go back to reset values and should retain
> it's value. Only in hibernation cycle the registers' value would get reset.
>
> So the hi/low address bit register for PRR page would also retain it's
> value along with the ACTLR registers.
>
> > I was starting to type up a patch to add PRR configuration, but
> > depending on whether it interacts with suspend/resume, it might be
> > better form arm-smmu-qcom.c to just always enable and configure PRR
> > (including allocating a page to have an address to program into
> > PRR_CFG_LADDR/UADDR), and instead add an interface to return the PRR
> > page?  I think there is no harm in unconditionally configuring PRR for
> > gpu smmu.
>
> Sounds okay though since this would not interact with suspend/resume path.
> But I think, suppose in-case this page would have some other references
> as well before configuring the address to the registers for PRR
> configuration, then  GPU would be dependent on arm-smmu-qcom for this page.
> So Instead an endpoint api in arm-smmu-qcom.c can recieve the just the
> page-address, and bit set status from drm/msm driver and can set/reset
> the bit along with any page-address they want ?
> It would mean the interface will be smmu's , but the choice of
> configuration data to the registers' will be still with gpu.
>
> I wrote up a small patch with this implementation, would you like to
> review that?
> Will send it in this v11 series as new patch.

I think if there is no suspend/resume interaction, we should go back
to the original idea of page allocation in drm/msm.

Basically, I think the pros and cons are:

allocate in arm-smmu
  pro: easy to sequence programming with suspend/resume
  con: there isn't a convenient place to free the page on driver unload

allocate in drm/msm:
  pro: easy place to free the page in teardown
  con: harder to sequence with s/r

But if ACTLR and PRR_CFG_LADDR/UADDR are retained, then the con isn't
actually an issue ;-)

Anyways, I can type that patch..  the rest of drm/msm and userspace
changes (vm_bind + sparse) to get to the point where I can use PRR are
a somewhat bigger task so it will take me a while to get the point
where I can test any smmu patches.

BR,
-R


> Thanks & regards,
> Bibek
>
> >
> > BR,
> > -R
> >
> >> Thanks & regards,
> >> Bibek
> >>
> >>> BR,
> >>> -R
> >>>
> >>>>>
> >>>>>>
> >>>>>>>> Thanks for bringing up this point.
> >>>>>>>> I will send v10 patch series removing this BIT(5) setting from the ACTLR
> >>>>>>>> table.
> >>>>>>>
> >>>>>>> I think it's generally saner to configure the SMMU from the SMMU driver..
> >>>>>>
> >>>>>> Yes, agree on this. But since PRR bit is not directly related to SMMU
> >>>>>> configuration so I think it would be better to remove this PRR bit
> >>>>>> setting from SMMU driver based on my understanding.
> >>>>>
> >>>>> Why is it not related? We still don't know what it does.
> >>>>>
> >>>>> Konrad
> >>>>
> >>>> --
> >>>> With best wishes
> >>>> Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ