lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 12 Oct 2018 10:51:41 -0700
From:   Laura Abbott <labbott@...hat.com>
To:     John Stultz <john.stultz@...aro.org>,
        lkml <linux-kernel@...r.kernel.org>
Cc:     Beata Michalska <Beata.Michalska@....com>,
        Matt Szczesiak <matt.szczesiak@....com>,
        Anders Pedersen <Anders.Pedersen@....com>,
        John Reitan <John.Reitan@....com>,
        Liam Mark <lmark@...eaurora.org>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Todd Kjos <tkjos@...roid.com>,
        Martijn Coenen <maco@...roid.com>,
        dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH] staging: ion: Rework ion_map_dma_buf() to minimize
 re-mapping

On 10/10/2018 04:33 PM, John Stultz wrote:
> Since 4.12, much later narrowed down to commit 2a55e7b5e544
> ("staging: android: ion: Call dma_map_sg for syncing and mapping"),
> we have seen graphics performance issues on the HiKey960.
> 
> This was initially confounded by the fact that the out-of-tree
> DRM driver was using HiSi custom ION heap which broke with the
> 4.12 ION abi changes, so there was lots of suspicion that the
> performance problems were due to switching to a somewhat simple
> cma based DRM driver for HiKey960. Additionally, as no
> performance regression was seen w/ the original HiKey board
> (which is SMP, not big.LITTLE as w/ HiKey960), there was some
> thought that the out-of-tree EAS code wasn't quite optimized.
> 
> But after chasing a number of other leads, I found that
> reverting the ION code to 4.11-era got the majority of the
> graphics performance back (there may yet be further EAS tweaks
> needed), which lead me to the dma_map_sg change.
> 
> In talking w/ Laura and Liam, it was suspected that the extra
> cache operations were causing the trouble. Additionally, I found
> that part of the reason we didn't see this w/ the original
> HiKey board is that its (proprietary blob) GL code uses ion_mmap
> and ion_map_dma_buf is called very rarely, where as with
> HiKey960, the (also proprietary blob) GL code calls
> ion_map_dma_buf much more frequently via the kernel driver.
> 
> Anyway, with the cause of the performance regression isolated,
> I've tried to find a way to improve the performance of the
> current code.
> 
> This approach, which I've mostly copied from the drm_prime
> implementation is to try to track the direction we're mapping
> the buffers so we can avoid calling dma_map/unmap_sg on every
> ion_map_dma_buf/ion_unmap_dma_buf call, and instead try to do
> the work in attach/detach paths.
> 
> I'm not 100% sure of the correctness here, so close review would
> be good, but it gets the performance back to being similar to
> reverting the ION code to the 4.11-era.
> 
> Feedback would be greatly appreciated!
> 
> Cc: Beata Michalska <Beata.Michalska@....com>
> Cc: Matt Szczesiak <matt.szczesiak@....com>
> Cc: Anders Pedersen <Anders.Pedersen@....com>
> Cc: John Reitan <John.Reitan@....com>
> Cc: Liam Mark <lmark@...eaurora.org>
> Cc: Laura Abbott <labbott@...hat.com>
> Cc: Sumit Semwal <sumit.semwal@...aro.org>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Cc: Todd Kjos <tkjos@...roid.com>
> Cc: Martijn Coenen <maco@...roid.com>
> Cc: dri-devel@...ts.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@...aro.org>
> ---
>   drivers/staging/android/ion/ion.c | 36 +++++++++++++++++++++++++++++++-----
>   1 file changed, 31 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index 9907332..a4d7fca 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -199,6 +199,7 @@ struct ion_dma_buf_attachment {
>   	struct device *dev;
>   	struct sg_table *table;
>   	struct list_head list;
> +	enum dma_data_direction dir;
>   };
>   
>   static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> @@ -220,6 +221,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
>   
>   	a->table = table;
>   	a->dev = attachment->dev;
> +	a->dir = DMA_NONE;
>   	INIT_LIST_HEAD(&a->list);
>   
>   	attachment->priv = a;
> @@ -236,6 +238,18 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
>   {
>   	struct ion_dma_buf_attachment *a = attachment->priv;
>   	struct ion_buffer *buffer = dmabuf->priv;
> +	struct sg_table *table;
> +
> +	if (!a)
> +		return;
> +
> +	table = a->table;
> +	if (table) {
> +		if (a->dir != DMA_NONE)
> +			dma_unmap_sg(attachment->dev, table->sgl, table->nents,
> +				     a->dir);
> +		sg_free_table(table);
> +	}
>   
>   	free_duped_table(a->table);
>   	mutex_lock(&buffer->lock);
> @@ -243,6 +257,7 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
>   	mutex_unlock(&buffer->lock);
>   
>   	kfree(a);
> +	attachment->priv = NULL;
>   }
>   
>   static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> @@ -251,12 +266,24 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
>   	struct ion_dma_buf_attachment *a = attachment->priv;
>   	struct sg_table *table;
>   
> -	table = a->table;
> +	if (WARN_ON(direction == DMA_NONE || !a))
> +		return ERR_PTR(-EINVAL);
>   
> -	if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> -			direction))
> -		return ERR_PTR(-ENOMEM);
> +	if (a->dir == direction)
> +		return a->table;
>   
> +	if (WARN_ON(a->dir != DMA_NONE))
> +		return ERR_PTR(-EBUSY);
> +
> +	table = a->table;
> +	if (!IS_ERR(table)) {
> +		if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> +				direction)) {
> +			table = ERR_PTR(-ENOMEM);
> +		} else {
> +			a->dir = direction;
> +		}
> +	}
>   	return table;
>   }
>   
> @@ -264,7 +291,6 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
>   			      struct sg_table *table,
>   			      enum dma_data_direction direction)
>   {
> -	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);

This changes the semantics so that the only time a buffer
gets unmapped is on detach. I don't think we want to restrict
Ion to that behavior but I also don't know if anyone else
is relying on that. I thought there might have been some Qualcomm
stuff that did that (Liam? Todd?)

I suspect most of the cost of the dma_map/dma_unmap is from the
cache flushing and not the actual mapping operations. If this
is the case, another option might be to figure out how to
incorporate dma_attrs so drivers can use DMA_ATTR_SKIP_CPU_SYNC
to decide when they actually want to sync.

Thanks,
Laura

>   }
>   
>   static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ