lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b3059196-a28b-a509-fc0e-75d2dbebdbae@linux.intel.com>
Date:   Thu, 28 Apr 2022 09:02:25 -0700
From:   Andi Kleen <ak@...ux.intel.com>
To:     Robin Murphy <robin.murphy@....com>,
        Christoph Hellwig <hch@...radead.org>
Cc:     Tianyu Lan <ltykernel@...il.com>, m.szyprowski@...sung.com,
        michael.h.kelley@...rosoft.com, kys@...rosoft.com,
        parri.andrea@...il.com, thomas.lendacky@....com,
        wei.liu@...nel.org, Tianyu Lan <Tianyu.Lan@...rosoft.com>,
        linux-hyperv@...r.kernel.org, konrad.wilk@...cle.com,
        linux-kernel@...r.kernel.org, kirill.shutemov@...el.com,
        iommu@...ts.linux-foundation.org, andi.kleen@...el.com,
        brijesh.singh@....com, vkuznets@...hat.com, hch@....de
Subject: Re: [RFC PATCH 1/2] swiotlb: Split up single swiotlb lock


On 4/28/2022 8:07 AM, Robin Murphy wrote:
> On 2022-04-28 15:55, Andi Kleen wrote:
>>
>> On 4/28/2022 7:45 AM, Christoph Hellwig wrote:
>>> On Thu, Apr 28, 2022 at 03:44:36PM +0100, Robin Murphy wrote:
>>>> Rather than introduce this extra level of allocator complexity, how 
>>>> about
>>>> just dividing up the initial SWIOTLB allocation into multiple 
>>>> io_tlb_mem
>>>> instances?
>>> Yeah.  We're almost done removing all knowledge of swiotlb from 
>>> drivers,
>>> so the very last thing I want is an interface that allows a driver to
>>> allocate a per-device buffer.
>>
>> At least for TDX need parallelism with a single device for performance.
>>
>> So if you split up the io tlb mems for a device then you would need a 
>> new mechanism to load balance the requests for single device over 
>> those. I doubt it would be any simpler.
>
> Eh, I think it would be, since the round-robin retry loop can then 
> just sit around the existing io_tlb_mem-based allocator, vs. the churn 
> of inserting it in the middle, plus it's then really easy to 
> statically distribute different starting points across different 
> devices via dev->dma_io_tlb_mem if we wanted to.
>
> Admittedly the overall patch probably ends up about the same size, 
> since it likely pushes a bit more complexity into swiotlb_init to 
> compensate, but that's still a trade-off I like.

Unless you completely break the external API this will require a new 
mechanism to search a list of io_tlb_mems for the right area to free into.

If the memory area not contiguous (like in the original patch) this will 
be a O(n) operation on the number of io_tlb_mems, so it would get more 
and more expensive on larger systems. Or you merge them all together (so 
that the simple address arithmetic to look up the area works again), 
which will require even more changes in the setup. Or you add hashing or 
similar which will be even more complicated.

In the end doing it with a single io_tlb_mem is significantly simpler 
and also more natural.

-Andi


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ