Message-ID: <7f7daf42-8aff-b9ed-0f48-d4158896012e@huawei.com>
Date: Wed, 24 Nov 2021 17:21:50 +0000
From: John Garry <john.garry@...wei.com>
To: Robin Murphy <robin.murphy@....com>, <joro@...tes.org>,
<will@...nel.org>
CC: <iommu@...ts.linux-foundation.org>,
<suravee.suthikulpanit@....com>, <baolu.lu@...ux.intel.com>,
<willy@...radead.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/9] iommu: Refactor flush queues into iommu-dma
On 23/11/2021 14:10, Robin Murphy wrote:
> As promised, this series cleans up the flush queue code and streamlines
> it directly into iommu-dma. Since we no longer have per-driver DMA ops
> implementations, a lot of the abstraction is now no longer necessary, so
> there's a nice degree of simplification in the process. Un-abstracting
> the queued page freeing mechanism is also the perfect opportunity to
> revise which struct page fields we use so we can be better-behaved
> from the MM point of view, thanks to Matthew.
>
> These changes should also make it viable to start using the gather
> freelist in io-pgtable-arm, and eliminate some more synchronous
> invalidations from the normal flow there, but that is proving to need a
> bit more careful thought than I have time for in this cycle, so I've
> parked that again for now and will revisit it in the new year.
>
> For convenience, branch at:
> https://gitlab.arm.com/linux-arm/linux-rm/-/tree/iommu/iova
>
> I've build-tested for x86_64, and boot-tested arm64 to the point of
> confirming that put_pages_list() gets passed a valid empty list when
> flushing, while everything else still works.
My interest is in patches 2, 3, 7, 8, and 9, and they look ok to me. I did
a bit of testing in strict and non-strict mode on my arm64 system and saw
no problems.
Apart from this, I noticed one possible optimization: avoiding so many
reads of fq_flush_finish_cnt. We seem to have a pattern of
fq_flush_iotlb() doing atomic64_inc(fq_flush_finish_cnt), followed by a
separate read of fq_flush_finish_cnt in fq_ring_free(), so we could use
atomic64_inc_return(fq_flush_finish_cnt) and reuse the returned value. I
think that any racing in the fq_flush_finish_cnt accesses is already
latent, but maybe there is a flaw in this reasoning. Either way, I tried
something along these lines and got a 2.4% throughput gain for my storage
scenario.
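
Roughly what I tried, as a sketch only (the signatures below are
illustrative, based on the pre-series iova.c naming, with the actual
flush call and the body of the free loop elided):

/*
 * Have fq_flush_iotlb() return the updated finish count so that the
 * caller can pass it to fq_ring_free() instead of re-reading the atomic.
 */
static u64 fq_flush_iotlb(struct iova_domain *iovad)
{
	atomic64_inc(&iovad->fq_flush_start_cnt);
	/* ... issue the domain IOTLB flush as before ... */
	return atomic64_inc_return(&iovad->fq_flush_finish_cnt);
}

static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq,
			 u64 counter)
{
	unsigned int idx;

	assert_spin_locked(&fq->lock);

	fq_ring_for_each(idx, fq) {
		if (fq->entries[idx].counter >= counter)
			break;
		/* ... free the pages and put back the IOVA as before ... */
	}
}

Callers that free entries without having just flushed would still need to
read the atomic themselves, so this only saves the read on the flush path.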
Thanks,
John