Message-ID: <b8e1ef0d-20ae-0ea1-3c29-fc8db96e2afb@intel.com>
Date:   Wed, 1 Jul 2020 12:17:50 +0200
From:   Björn Töpel <bjorn.topel@...el.com>
To:     Robin Murphy <robin.murphy@....com>,
        Christoph Hellwig <hch@....de>,
        Daniel Borkmann <daniel@...earbox.net>
Cc:     maximmi@...lanox.com, konrad.wilk@...cle.com,
        jonathan.lemon@...il.com, linux-kernel@...r.kernel.org,
        iommu@...ts.linux-foundation.org, netdev@...r.kernel.org,
        bpf@...r.kernel.org, davem@...emloft.net, magnus.karlsson@...el.com
Subject: Re: [PATCH net] xsk: remove cheap_dma optimization

On 2020-06-29 17:41, Robin Murphy wrote:
> On 2020-06-28 18:16, Björn Töpel wrote:
[...]
>
>> Somewhat related to the DMA API; it would have performance benefits for
>> AF_XDP if the DMA range of the mapped memory was linear, i.e. by IOMMU
>> utilization. I've started hacking a thing a little bit, but it would be
>> nice if such an API were part of the mapping core.
>>
>> Input: array of pages Output: array of dma addrs (and obviously dev,
>> flags and such)
>>
>> For non-IOMMU len(array of pages) == len(array of dma addrs)
>> For best-case IOMMU len(array of dma addrs) == 1 (large linear space)
>>
>> But that's for later. :-)
> 
> FWIW you will typically get that behaviour from IOMMU-based 
> implementations of dma_map_sg() right now, although it's not strictly 
> guaranteed. If you can weather some additional setup cost of calling 
> sg_alloc_table_from_pages() plus walking the list after mapping to test 
> whether you did get a contiguous result, you could start taking 
> advantage of it as some of the dma-buf code in DRM and v4l2 does already 
> (although those cases actually treat it as a strict dependency rather 
> than an optimisation).
> 
> I'm inclined to agree that if we're going to see more of these cases, a 
> new API call that did formally guarantee a DMA-contiguous mapping 
> (either via IOMMU or bounce buffering) or failure might indeed be handy.
>

I forgot to reply to this one! My current hack is using the iommu code 
directly, similar to what vfio-pci does (hopefully not gutting the API 
this time ;-)).

Your approach sounds much nicer, and easier. I'll try that out! Thanks a 
lot for the pointers, and I might be back with more questions.
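
For reference, the "walk the list after mapping" step could look roughly
like the sketch below. It is a userspace model only, not kernel code:
struct dma_seg and dma_segs_contiguous() are hypothetical names made up
for illustration, standing in for the per-entry values that
sg_dma_address() and sg_dma_len() would report for a mapped scatterlist.
The check simply verifies that each segment starts where the previous
one ended.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for one mapped scatterlist entry: the DMA
 * address and length the kernel would report via sg_dma_address()
 * and sg_dma_len().
 */
struct dma_seg {
	uint64_t addr;
	uint32_t len;
};

/*
 * Return true if the mapped segments form one linear DMA range,
 * i.e. every segment begins exactly where the previous one ends.
 * An empty list is treated as "not contiguous" (nothing was mapped).
 */
static bool dma_segs_contiguous(const struct dma_seg *segs, size_t nsegs)
{
	uint64_t next;
	size_t i;

	if (nsegs == 0)
		return false;

	next = segs[0].addr + segs[0].len;
	for (i = 1; i < nsegs; i++) {
		if (segs[i].addr != next)
			return false;
		next += segs[i].len;
	}
	return true;
}
```

If the check fails, the fallback would be to keep one DMA address per
page as today, so the linear case stays an optimisation rather than a
hard requirement.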


Cheers,
Björn

> Robin.