Message-ID: <CAFCwf12JXQ6XnQEPM6wa2ut8dV8VBLTJE_popZT2GTVVra5CLQ@mail.gmail.com>
Date:   Wed, 23 Jun 2021 12:14:59 +0300
From:   Oded Gabbay <oded.gabbay@...il.com>
To:     Christian König <ckoenig.leichtzumerken@...il.com>
Cc:     Jason Gunthorpe <jgg@...pe.ca>,
        Christian König <christian.koenig@....com>,
        Gal Pressman <galpress@...zon.com>, sleybo@...zon.com,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Oded Gabbay <ogabbay@...nel.org>,
        Christoph Hellwig <hch@....de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        "moderated list:DMA BUFFER SHARING FRAMEWORK" 
        <linaro-mm-sig@...ts.linaro.org>,
        Doug Ledford <dledford@...hat.com>,
        Tomer Tayar <ttayar@...ana.ai>,
        amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        Alex Deucher <alexander.deucher@....com>,
        Leon Romanovsky <leonro@...dia.com>,
        "open list:DMA BUFFER SHARING FRAMEWORK" 
        <linux-media@...r.kernel.org>
Subject: Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export
 FD for DMA-BUF

On Wed, Jun 23, 2021 at 11:57 AM Christian König
<ckoenig.leichtzumerken@...il.com> wrote:
>
> Am 22.06.21 um 18:05 schrieb Jason Gunthorpe:
> > On Tue, Jun 22, 2021 at 05:48:10PM +0200, Christian König wrote:
> >> Am 22.06.21 um 17:40 schrieb Jason Gunthorpe:
> >>> On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:
> >>>> [SNIP]
> >>>> No absolutely not. NVidia GPUs work exactly the same way.
> >>>>
> >>>> And you have tons of similar cases in embedded and SoC systems where
> >>>> intermediate memory between devices isn't directly addressable with the CPU.
> >>> None of that is PCI P2P.
> >>>
> >>> It is all some specialty direct transfer.
> >>>
> >>> You can't reasonably call dma_map_resource() on non CPU mapped memory
> >>> for instance, what address would you pass?
> >>>
> >>> Do not confuse "I am doing transfers between two HW blocks" with PCI
> >>> Peer to Peer DMA transfers - the latter is a very narrow subcase.
> >>>
> >>>> No, just using the dma_map_resource() interface.
> >>> Ick, but yes that does "work". Logan's series is better.
> >> No it isn't. It makes devices depend on allocating struct pages for their
> >> BARs which is neither necessary nor desired.
> > Which dramatically reduces the cost of establishing DMA mappings, a
> > loop of dma_map_resource() is very expensive.
>
> Yeah, but that is perfectly ok. Our BAR allocations are either in chunks
> of at least 2MiB or only a single 4KiB page.
>
> Oded might run into more performance problems, but those DMA-buf
> mappings are usually set up only once.
>
> >> How do you prevent direct I/O on those pages for example?
> > GUP fails.
>
> At least that is calming.
>
> >> Allocating struct pages has its use case, for example for exposing VRAM
> >> as memory for HMM. But that is something very specific and should not limit
> >> PCIe P2P DMA in general.
> > Sure, but that is an ideal we are far from obtaining, and nobody wants
> > to work on it, preferring to do hacky hacky like this.
> >
> > If you believe in this then remove the scatter list from dmabuf, add a
> > new set of dma_map* APIs to work on physical addresses and all the
> > other stuff needed.
>
> Yeah, that's what I totally agree on. And I actually hoped that the new
> P2P work for PCIe would go in that direction, but that didn't
> materialize.
>
> But allocating struct pages for PCIe BARs which are essentially
> registers and not memory is much more hacky than the dma_map_resource()
> approach.
>
> To re-iterate why I think that having struct pages for those BARs is a
> bad idea: Our doorbells on AMD GPUs are write and read pointers for ring
> buffers.
>
> When you write to the BAR you essentially tell the firmware that you
> have either filled the ring buffer or read a bunch of it. This in turn
> then triggers an interrupt in the hardware/firmware which was possibly
> asleep.
>
> By using PCIe P2P we want to avoid the round trip to the CPU when one
> device has filled the ring buffer and another device must be woken up to
> process it.
>
> Think of it as MSI-X in reverse. Allocating struct pages for those
> BARs just to work around the shortcomings of the DMA API makes no sense
> at all to me.
We would also like to do that *in the future*.
In Gaudi it will never be supported (due to security limitations) but
I definitely see it happening in future ASICs.

Oded

>
>
> We also do have the VRAM BAR, and for HMM we do allocate struct pages
> for the address range exposed there. But this is a different use case.
>
> Regards,
> Christian.
>
> >
> > Otherwise, we have what we have and drivers don't get to opt out. This
> > is why the stuff in AMDGPU was NAK'd.
> >
> > Jason
>
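
For readers following the doorbell argument quoted above, here is a minimal userspace sketch of the mechanism: a producer fills a ring buffer and then writes its write pointer to a doorbell, and the consumer only processes entries up to the last published pointer. This is an illustration under stated assumptions, not AMD's or Habana's implementation. The names struct ring, ring_push(), ring_doorbell() and ring_drain() are hypothetical, and the doorbell field merely stands in for an MMIO register in a PCIe BAR.

```c
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical ring depth, chosen for the sketch */

struct ring {
    uint32_t entries[RING_SIZE];
    uint32_t wptr;              /* producer's private write index */
    uint32_t rptr;              /* consumer's read index */
    volatile uint32_t doorbell; /* stand-in for the MMIO doorbell in the BAR */
};

/* Producer: queue one entry. Returns 0 on success, -1 if the ring is full.
 * Simplification: the producer reads rptr directly; real hardware would
 * publish the read pointer through its own doorbell or register. */
static int ring_push(struct ring *r, uint32_t value)
{
    if (r->wptr - r->rptr == RING_SIZE)
        return -1;
    r->entries[r->wptr % RING_SIZE] = value;
    r->wptr++;
    return 0;
}

/* Producer: publish the write pointer. With PCIe P2P this single write
 * would go from one device straight to the other device's BAR, with no
 * round trip through the CPU. */
static void ring_doorbell(struct ring *r)
{
    r->doorbell = r->wptr;
}

/* Consumer: process everything up to the last published write pointer.
 * Returns the sum of the drained entries and stores how many were drained. */
static uint32_t ring_drain(struct ring *r, uint32_t *count)
{
    uint32_t published = r->doorbell;
    uint32_t sum = 0;
    uint32_t n = 0;

    while (r->rptr != published) {
        sum += r->entries[r->rptr % RING_SIZE];
        r->rptr++;
        n++;
    }
    *count = n;
    return sum;
}
```

Entries queued with ring_push() stay invisible to ring_drain() until ring_doorbell() is called, which is why the doorbell write behaves like a register access (a wake-up signal) rather than like ordinary memory.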
