[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8212078bd56c54ce508205eae0ed0b69e78d4c38.camel@pengutronix.de>
Date: Wed, 21 Jun 2023 17:58:43 +0200
From: Lucas Stach <l.stach@...gutronix.de>
To: Sui Jingfeng <suijingfeng@...ngson.cn>,
Sui Jingfeng <18949883232@....com>,
Russell King <linux+etnaviv@...linux.org.uk>,
Christian Gmeiner <christian.gmeiner@...il.com>,
David Airlie <airlied@...il.com>,
Daniel Vetter <daniel@...ll.ch>
Cc: linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
etnaviv@...ts.freedesktop.org,
Philipp Zabel <p.zabel@...gutronix.de>,
Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent
device
Am Mittwoch, dem 21.06.2023 um 23:30 +0800 schrieb Sui Jingfeng:
> Hi,
>
> On 2023/6/21 18:00, Lucas Stach wrote:
> > > dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt,
> > > etnaviv_op_to_dma_dir(op));
> > > etnaviv_obj->last_cpu_prep_op = op;
> > > @@ -408,8 +421,9 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
> > > {
> > > struct drm_device *dev = obj->dev;
> > > struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
> > > + struct etnaviv_drm_private *priv = dev->dev_private;
> > >
> > > - if (etnaviv_obj->flags & ETNA_BO_CACHED) {
> > > + if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
> > > /* fini without a prep is almost certainly a userspace error */
> > > WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
> > > dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
> > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > index 3524b5811682..754126992264 100644
> > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > @@ -112,11 +112,16 @@ static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = {
> > > struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev,
> > > struct dma_buf_attachment *attach, struct sg_table *sgt)
> > > {
> > > + struct etnaviv_drm_private *priv = dev->dev_private;
> > > struct etnaviv_gem_object *etnaviv_obj;
> > > size_t size = PAGE_ALIGN(attach->dmabuf->size);
> > > + u32 cache_flags = ETNA_BO_WC;
> > > int ret, npages;
> > >
> > > - ret = etnaviv_gem_new_private(dev, size, ETNA_BO_WC,
> > > + if (priv->dma_coherent)
> > > + cache_flags = ETNA_BO_CACHED;
> > > +
> > Drop this change. Instead etnaviv_gem_new_impl() should do the upgrade
> > from WC to CACHED as necessary by adding something like this:
>
> I understand you are a profession person in vivante GPU driver domain.
>
> I respect you reviews and instruction.
>
> But, I'm really reluctant to agree with this, is there any space to
> negotiate?
>
> > /*
> > * Upgrade WC to CACHED when the device is hardware coherent and the
> > * platform doesn't allow mixing cached and writecombined mappings to
> > * the same memory area.
> > */
> > if ((flags & ETNA_BO_CACHE_MASK) == ETNA_BO_WC &&
> > dev_is_dma_coherent(dev) && !drm_arch_can_wc_memory())
> > flags = (flags & ~ETNA_BO_CACHE_MASK) & ETNA_BO_CACHED;
>
> This is policy, not a mechanism.
>
> Using what cache property is a user-space program's choice.
>
> While you are override the WC with CACHED mapping. This is not correct
> in the concept!
>
Please explain why you think that this isn't correct. If using WC
mappings cause a potential loss of coherency on your platform, then we
can not allow the userspace driver to use WC mappings.
As I would like to keep the option of WC mappings, I've asked you if
there are ways to prepare the cache in a way that WC mappings aren't
causing any troubles on your platform. You told me that this might be
possible but needs confirmation from a HW engineer and such
confirmation could take a long time.
With that in mind, our only option right now is to upgrade the mappings
to cached in order to not lay out traps for the userspace driver.
> you approach forbidden any possibility to use the WC BO at anywhere.
>
>
> My approach need only check once, while you approach need at least 3
> check plus
>
> so much bit-wise logic operations, plus a function call (&, ==, &&,
> &, ~, &) .
>
> and every time you create a BO. This nasty judgement happens.
>
BO creation again is not a fast path. You are committing to allocate
new memory, which is a few orders of magnitude more costly than the few
instructions needed for those comparisons.
>
> Please keep our original implement, it's simple and clear, Please?
>
It isn't as simple and clear for the userspace interface. It allows
userspace to use WC mappings that would potentially cause loss of
coherency between CPU and GPU, which isn't acceptable.
Regards,
Lucas
Powered by blists - more mailing lists