linux-kernel - Re: [PATCH] lib/scatterlist: Provide a DMA page iterator


Message-ID: <8aadac80-da9b-b52a-a4bf-066406127117@amd.com>
Date:   Wed, 16 Jan 2019 07:28:13 +0000
From:   "Koenig, Christian" <Christian.Koenig@....com>
To:     Thomas Hellstrom <thellstrom@...are.com>, "hch@....de" <hch@....de>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "yong.zhi@...el.com" <yong.zhi@...el.com>,
        "daniel.vetter@...ll.ch" <daniel.vetter@...ll.ch>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
        "linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
        "bingbu.cao@...el.com" <bingbu.cao@...el.com>,
        "jian.xu.zheng@...el.com" <jian.xu.zheng@...el.com>,
        "tian.shu.qiu@...el.com" <tian.shu.qiu@...el.com>,
        "shiraz.saleem@...el.com" <shiraz.saleem@...el.com>,
        "sakari.ailus@...ux.intel.com" <sakari.ailus@...ux.intel.com>,
        "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
        "jgg@...pe.ca" <jgg@...pe.ca>
Subject: Re: [PATCH] lib/scatterlist: Provide a DMA page iterator

On 16.01.19 at 08:09, Thomas Hellstrom wrote:
> On Tue, 2019-01-15 at 21:58 +0100, hch@....de wrote:
>> On Tue, Jan 15, 2019 at 07:13:11PM +0000, Koenig, Christian wrote:
>>> Thomas is correct that the interface you propose here doesn't work
>>> at all for GPUs.
>>>
>>> The kernel driver is not informed of flush/sync, but rather just
>>> sets up coherent mappings between system memory and devices.
>>>
>>> In other words you have an array of struct pages and need to map
>>> that to a specific device and so create dma_addresses for the
>>> mappings.
>> If you want a coherent mapping you need to use dma_alloc_coherent
>> and dma_mmap_coherent and you are done, that is not the problem.
>> That actually is one of the vmwgfx modes, so I don't understand what
>> problem we are trying to solve if you don't actually want a
>> non-coherent mapping.
> For vmwgfx, not making dma_alloc_coherent the default has a couple of
> reasons:
> 1) Memory is associated with a struct device. It has not been clear
> that it is exportable to other devices.
> 2) There seem to be restrictions on which system pages are allowed.
> GPUs generally prefer highmem pages, but dma_alloc_coherent returns a
> virtual address, implying GFP_KERNEL? While not used by vmwgfx, TTM
> typically prefers HIGHMEM pages to facilitate caching mode switching
> without having to touch the kernel map.
> 3) Historically we had APIs to allow coherent access to user-space
> defined pages. That has gone away now, but the infrastructure was
> built around it.
>
> dma_mmap_coherent isn't used because as the data moves between system
> memory, swap and VRAM, PTEs of user-space mappings are adjusted
> accordingly, meaning user-space doesn't have to unmap when an operation
> is initiated that might mean the data is moved.

To summarize once more: We have an array of struct pages and want to
coherently map that to a device.

If that is not possible for whatever reason, we want to get an
error code, or even have the driver not load in the first place.
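
As a rough illustration of that requirement (not an existing kernel
interface), the following user-space sketch maps an array of pages for a
device and fails with an error code when the device cannot access memory
coherently. All names here (mock_page, mock_device, mock_dma_addr_t,
mock_map_pages_coherent) and the fixed-offset address translation are
assumptions made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct mock_page { uint64_t phys; };
struct mock_device { bool coherent; uint64_t dma_offset; };
typedef uint64_t mock_dma_addr_t;

/*
 * Produce one DMA address per page, or fail before writing anything.
 * Returns 0 on success and -EOPNOTSUPP if the device is not coherent.
 * The "mapping" is a plain offset, assumed only for this sketch.
 */
int mock_map_pages_coherent(const struct mock_device *dev,
                            const struct mock_page *pages, size_t n,
                            mock_dma_addr_t *dma_out)
{
    assert(dev != NULL && pages != NULL && dma_out != NULL);

    if (!dev->coherent)
        return -EOPNOTSUPP;

    for (size_t i = 0; i < n; i++)
        dma_out[i] = pages[i].phys + dev->dma_offset;

    return 0;
}
```

The point of the sketch is only the contract: the caller either gets a
complete set of device addresses or an error, never a partially usable
non-coherent mapping.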

>
>
>> Although last time I had that discussion with Daniel Vetter
>> I was under the impression that GPUs really wanted non-coherent
>> mappings.
> Intel historically has done things a bit differently. And it's also
> possible that embedded platforms and ARM prefer this mode of operation,
> but I haven't caught up on that discussion.
>
>> But if you want a coherent mapping you can't go to a struct page,
>> because on many systems you can't just map arbitrary memory as
>> uncacheable.  It might either come from very special limited pools,
>> or might need other magic applied to it so that it is not visible
>> in the normal direct mapping, or at least not accessed through it.
>
> The TTM subsystem has been relied on to provide coherent memory with
> the option to switch caching mode of pages. But only on selected and
> well tested platforms. On other platforms we simply do not load, and
> that's fine for now.
>
> But as mentioned multiple times, to make GPU drivers more compliant,
> we'd really want that
>
> bool dma_streaming_is_coherent(const struct device *)
>
> API to help us decide when to load or not.

Yes, please.
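
A minimal user-space sketch of how a driver might use such a check to
decide whether to load. dma_streaming_is_coherent() is only the API
proposed above, not an existing kernel function; the struct device
definition, its streaming_coherent field, and mock_gpu_probe() are
hypothetical stand-ins for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock device; the real struct device has no such field. */
struct device { bool streaming_coherent; };

/* Stand-in for the proposed query; the real answer would be per platform. */
bool dma_streaming_is_coherent(const struct device *dev)
{
    return dev != NULL && dev->streaming_coherent;
}

/* Hypothetical probe: refuse to bind when streaming DMA is not coherent. */
int mock_gpu_probe(const struct device *dev)
{
    if (!dma_streaming_is_coherent(dev))
        return -ENODEV;

    /* ... normal driver initialization would continue here ... */
    return 0;
}
```

With a check like this the driver fails early at probe time instead of
depending on a fixed list of tested platforms.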

Christian.

>
> Thanks,
> Thomas