linux-kernel - Re: [PATCHv2 8/8] videobuf2: handle non-contiguous DMA allocations


Message-ID: <20210622073308.GA32231@lst.de>
Date:   Tue, 22 Jun 2021 09:33:08 +0200
From:   Christoph Hellwig <hch@....de>
To:     Tomasz Figa <tfiga@...omium.org>
Cc:     Christoph Hellwig <hch@....de>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Hans Verkuil <hverkuil-cisco@...all.nl>,
        Ricardo Ribalda <ribalda@...omium.org>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Linux Media Mailing List <linux-media@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCHv2 8/8] videobuf2: handle non-contiguous DMA allocations

On Fri, Jun 18, 2021 at 01:44:08PM +0900, Tomasz Figa wrote:
> > Well, dma_alloc_coherent users want a non-cached mapping.  And while
> > some architectures provide that using a vmap with "uncached" bits in
> > the PTE, this:
> >
> >  a) is not possible everywhere
> >  b) even where possible is not always the best idea as it creates mappings
> >     with different cacheability bits
> 
> I think this could be addressed by having a dma_vmap() helper that
> does the right thing, whether it's vmap() or dma_common_pages_remap()
> as appropriate. Or would this still be insufficient for some
> architectures?

It can't always do the right thing, e.g. in the case where uncached
memory needs to be allocated from a special fixed pool set up at boot time.
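
A toy userspace sketch of one reading of this objection, not kernel code:
every name below (toy_arch, toy_map_result, toy_dma_vmap) is invented for
illustration, and the three architecture categories are an assumption made
to keep the model small.  It only shows that a mapping helper handed
already-allocated pages has no valid answer when uncached memory can come
only from a fixed pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical categories, invented for this sketch only. */
enum toy_arch {
    TOY_ARCH_COHERENT,   /* device is cache coherent: a plain vmap() suffices */
    TOY_ARCH_REMAP_OK,   /* arbitrary pages may be remapped with uncached PTE bits */
    TOY_ARCH_FIXED_POOL, /* uncached memory exists only in a boot-time pool */
};

enum toy_map_result {
    TOY_MAP_PLAIN_VMAP,
    TOY_MAP_UNCACHED_REMAP,
    TOY_MAP_POOL_ADDRESS, /* memory already lives in the uncached pool */
    TOY_MAP_IMPOSSIBLE,   /* no way to make these pages uncached */
};

/*
 * Model of a dma_vmap()-style helper: it is handed pages that were
 * allocated earlier and has to pick a mapping strategy after the fact.
 */
static enum toy_map_result toy_dma_vmap(enum toy_arch arch, bool pages_from_pool)
{
    switch (arch) {
    case TOY_ARCH_COHERENT:
        return TOY_MAP_PLAIN_VMAP;
    case TOY_ARCH_REMAP_OK:
        return TOY_MAP_UNCACHED_REMAP;
    case TOY_ARCH_FIXED_POOL:
        /* Pages from the normal page allocator cannot be fixed up here. */
        return pages_from_pool ? TOY_MAP_POOL_ADDRESS : TOY_MAP_IMPOSSIBLE;
    }
    assert(0 && "unknown toy_arch value");
    return TOY_MAP_IMPOSSIBLE;
}
```

In this model the failing case is toy_dma_vmap(TOY_ARCH_FIXED_POOL, false):
the decision had to be made at allocation time, which is what an allocator
such as dma_alloc_coherent gets to do and a map-only helper does not.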

> > And even without that dma_alloc_noncoherent causes less overhead than
> > dma_alloc_noncontiguous if you only need a single contiguous range.
> >
> 
> Given that behind the scenes dma_alloc_noncontiguous() would also just
> call __dma_alloc_pages() for devices that need contiguous pages, would
> the overhead be basically the creation of a single-entry sgtable?

In the best case: yes.
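
To make the "single-entry sgtable" best case concrete, here is a small
userspace model, offered as a sketch only.  The toy_* types and functions
are hypothetical stand-ins and are not the kernel's struct sg_table or its
allocators; malloc() plays the role of the contiguous page allocation and
the "DMA address" is simply the pointer value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins; NOT the kernel's scatterlist or sg_table. */
struct toy_sg_entry {
    uint64_t dma_addr;
    size_t len;
};

struct toy_sg_table {
    struct toy_sg_entry *ents;
    unsigned int nents;
};

/* Simple path: one contiguous buffer and one handle, no table at all. */
static void *toy_alloc_contig(size_t size, uint64_t *dma_addr)
{
    void *buf = malloc(size);

    if (buf)
        *dma_addr = (uint64_t)(uintptr_t)buf; /* pretend identity mapping */
    return buf;
}

/* Best case of the table-based path: same buffer plus a one-entry table. */
static struct toy_sg_table *toy_alloc_noncontig_best_case(size_t size,
                                                          void **buf_out)
{
    struct toy_sg_table *sgt = malloc(sizeof(*sgt));
    uint64_t addr = 0;
    void *buf;

    if (!sgt)
        return NULL;
    sgt->ents = malloc(sizeof(*sgt->ents));
    buf = toy_alloc_contig(size, &addr);
    if (!sgt->ents || !buf) {
        free(buf);
        free(sgt->ents);
        free(sgt);
        return NULL;
    }
    sgt->ents[0].dma_addr = addr;
    sgt->ents[0].len = size;
    sgt->nents = 1;
    *buf_out = buf;
    return sgt;
}

static void toy_free_noncontig(struct toy_sg_table *sgt, void *buf)
{
    assert(sgt && sgt->nents == 1); /* this model only covers the best case */
    free(buf);
    free(sgt->ents);
    free(sgt);
}
```

The extra cost in this model is two small allocations and the bookkeeping
around them.  Anything beyond the best case (several entries, separate
mapping steps) is outside what the sketch covers.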

> > So while I'm happy we have something useful for more complex drivers like
> > v4l I think the simple dma_alloc_coherent API, including some of the less
> > crazy flags for dma_alloc_attrs is the right thing to use for more than
> > 90% of the use cases.
> 
> One thing to take into account here is that many drivers use the
> existing "simple" way, just because there wasn't a viable alternative
> to do something better. Agreed, though, that we shouldn't optimize for
> the rare cases.

While that might be true for a few drivers, it is absolutely not true
for the vast majority.  I think you media people are a little special,
with only the GPU folks contending for "specialness" :)  (although
media handles it way better, GPU folks just create local hacks that
can't work portably).