lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAFQd5CcN+TiVd8vhMxQRbmrJuBGYwL5d6C0fKzOy4ujjM_JMQ@mail.gmail.com>
Date:   Wed, 20 Sep 2023 16:41:08 +0900
From:   Tomasz Figa <tfiga@...omium.org>
To:     Fang Hui <hui.fang@....com>, Christoph Hellwig <hch@....de>,
        Robin Murphy <robin.murphy@....com>
Cc:     m.szyprowski@...sung.com, mchehab@...nel.org,
        linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
        anle.pan@....com, xuegang.liu@....com
Subject: Re: [PATCH] MA-21654 Use dma_alloc_pages in vb2_dma_sg_alloc_compacted

Hi Fang,

On Thu, Sep 14, 2023 at 4:41 PM Fang Hui <hui.fang@....com> wrote:
>
> On system with "CONFIG_ZONE_DMA32=y", if the allocated physical address is

First of all, thanks a lot for the patch! Please check my review comments below.

Is CONFIG_ZONE_DMA32 really the factor that triggers the problem? My
understanding was that the problem was that the hardware has 32-bit
DMA, but the system has physical memory at addresses beyond the first
4G.

> greater than 4G, swiotlb will be used. It will lead below defects.
> 1) Impact performance due to an extra memcpy.
> 2) May meet below error due to swiotlb_max_mapping_size()
>    is 256K (IO_TLB_SIZE * IO_TLB_SEGSIZE).
> "swiotlb buffer is full (sz: 393216 bytes), total 65536 (slots),
> used 2358 (slots)"
>
> To avoid those defects, use dma_alloc_pages() instead of alloc_pages()
> in vb2_dma_sg_alloc_compacted().
>
> Suggested-by: Tomasz Figa <tfiga@...omium.org>
> Signed-off-by: Fang Hui <hui.fang@....com>
> ---
>  drivers/media/common/videobuf2/videobuf2-dma-sg.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>

Please remove MA-21654 from the subject and prefix it with the right
tags for the path (`git log drivers/media/common/videobuf2` should be
helpful to find the right one).

> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> index 28f3fdfe23a2..b938582c68f4 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> @@ -58,7 +58,7 @@ struct vb2_dma_sg_buf {
>  static void vb2_dma_sg_put(void *buf_priv);
>
>  static int vb2_dma_sg_alloc_compacted(struct vb2_dma_sg_buf *buf,
> -               gfp_t gfp_flags)
> +               gfp_t gfp_flags, struct device *dev)

FWIW buf->dev already points to the right device - although we would
need to move the assignment in vb2_dma_sg_alloc() to a place higher in
that function before calling this function.

>  {
>         unsigned int last_page = 0;
>         unsigned long size = buf->size;
> @@ -67,6 +67,7 @@ static int vb2_dma_sg_alloc_compacted(struct vb2_dma_sg_buf *buf,
>                 struct page *pages;
>                 int order;
>                 int i;
> +               dma_addr_t dma_handle;
>
>                 order = get_order(size);
>                 /* Don't over allocate*/
> @@ -75,8 +76,9 @@ static int vb2_dma_sg_alloc_compacted(struct vb2_dma_sg_buf *buf,
>
>                 pages = NULL;
>                 while (!pages) {
> -                       pages = alloc_pages(GFP_KERNEL | __GFP_ZERO |
> -                                       __GFP_NOWARN | gfp_flags, order);
> +                       pages = dma_alloc_pages(dev, PAGE_SIZE << order, &dma_handle,

Hmm, when I was proposing dma_alloc_pages(), I missed that it returns
a DMA handle. That on its own can be handled by saving the returned
handles somewhere in struct vb2_dma_sg_buf, but there is a bigger
problem - the function would actually create a mapping if the DMA
device requires some mapping management (e.g. is behind an IOMMU),
which is undesirable, because we create the mapping ourselves below
anyway...

@Christoph Hellwig @Robin Murphy  I need your thoughts on this as
well. Would it make sense to have a variant of dma_alloc_pages() that
only allocates the pages, but doesn't perform the mapping? (Or a flag
that tells the implementation to skip creating a mapping.)

> +                               DMA_BIDIRECTIONAL,

The right value should be already available in buf->dma_dir.

> +                               GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | gfp_flags);
>                         if (pages)
>                                 break;
>
> @@ -96,6 +98,7 @@ static int vb2_dma_sg_alloc_compacted(struct vb2_dma_sg_buf *buf,
>         }
>
>         return 0;
> +

Unnecessary blank line.

>  }
>
>  static void *vb2_dma_sg_alloc(struct vb2_buffer *vb, struct device *dev,
> @@ -130,7 +133,7 @@ static void *vb2_dma_sg_alloc(struct vb2_buffer *vb, struct device *dev,
>         if (!buf->pages)
>                 goto fail_pages_array_alloc;
>
> -       ret = vb2_dma_sg_alloc_compacted(buf, vb->vb2_queue->gfp_flags);
> +       ret = vb2_dma_sg_alloc_compacted(buf, vb->vb2_queue->gfp_flags, dev);
>         if (ret)
>                 goto fail_pages_alloc;
>
> --
> 2.17.1
>

We also need to use dma_free_pages() to free the memory.

Best regards,
Tomasz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ