Message-ID: <20170817165040.GK2853@suse.de>
Date: Thu, 17 Aug 2017 18:50:40 +0200
From: Joerg Roedel <jroedel@...e.de>
To: Will Deacon <will.deacon@....com>
Cc: Joerg Roedel <joro@...tes.org>, iommu@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org,
Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>,
Alex Williamson <alex.williamson@...hat.com>,
Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH 02/13] iommu: Introduce Interface for IOMMU TLB Flushing
Hi Will,
On Thu, Aug 17, 2017 at 05:32:35PM +0100, Will Deacon wrote:
> I really like the idea of this, but I have a couple of questions and
> comments below.
Great. This, together with the common iova-flush code, should make it
possible to solve the performance problems of the dma-iommu code.
> > +int iommu_map_sync(struct iommu_domain *domain, unsigned long iova,
> > +		   phys_addr_t paddr, size_t size, int prot)
> > +{
> > +	int ret = iommu_map(domain, iova, paddr, size, prot);
> > +
> > +	iommu_tlb_range_add(domain, iova, size);
> > +	iommu_tlb_sync(domain);
>
> Many IOMMUs don't need these callbacks on ->map operations, but they won't
> be able to distinguish them easily with this API. Could you add a flags
> parameter or something to the iommu_tlb_* functions, please?
Yeah, this is only needed for virtualized IOMMUs that have a non-present
cache. My idea was to let the iommu drivers tell the common code whether
the iommu needs it; the code above then just checks a flag and omits the
calls to the flush functions.
The problem currently is how to get this information from
'struct iommu_device' to 'struct iommu_domain'. As a workaround I am
considering a per-domain flag in the iommu drivers that tracks whether
any unmap has happened, so the flush call-back can just do nothing if
there was none.
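Concretely, the map path above could then look something like this
('flush_np_cache' is just a made-up flag name for the sketch, not part
of the posted patch):

	int iommu_map_sync(struct iommu_domain *domain, unsigned long iova,
			   phys_addr_t paddr, size_t size, int prot)
	{
		int ret = iommu_map(domain, iova, paddr, size, prot);

		/* Assumed flag: set only by drivers for virtualized
		 * IOMMUs that cache non-present entries. */
		if (domain->flush_np_cache) {
			iommu_tlb_range_add(domain, iova, size);
			iommu_tlb_sync(domain);
		}

		return ret;
	}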
> I think we will struggle to implement this efficiently on ARM SMMUv3. The
> way invalidation works there is that there is a single in-memory command
> queue into which we can put TLB invalidation commands (they are inserted
> under a lock). These are then processed asynchronously by the hardware, and
> you can complete them by inserting a SYNC command and waiting for that to
> be consumed by the SMMU. Sounds like a perfect fit, right?
Yes, it's basically the same way it works on AMD-Vi and Intel VT-d.
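In pseudo-code, the shared model is roughly this (names are
illustrative only, not taken from any of the drivers):

	spin_lock(&cmdq->lock);
	cmdq_insert(cmdq, inv_range_cmd(iova, size)); /* async invalidation */
	spin_unlock(&cmdq->lock);

	/* ... more invalidations can be batched here ... */

	spin_lock(&cmdq->lock);
	cmdq_insert(cmdq, sync_cmd());
	spin_unlock(&cmdq->lock);

	/* consumption of the SYNC implies all prior invalidations
	 * have completed */
	wait_until_consumed(cmdq);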
> The problem is that we want to add those invalidation commands as early
> as possible, so that they can be processed by the hardware concurrently
> with us unmapping other pages.
I think that's a bad idea, because it re-introduces the performance
problem: everyone will spin on the cmd-queue lock in the unmap path of
the DMA-API.
> That means adding the invalidation commands in the ->unmap callback
> and not bothering to implement ->iotlb_range_add callback at all.
> Then, we will do the sync in ->iotlb_sync. This falls apart if
> somebody decides to use iommu_flush_tlb_all(), where we would prefer
> not to insert all of the invalidation commands in unmap and instead
> insert a single invalidate-all command, followed up with a SYNC.
This problem can be solved with the deferred iova flushing code I posted
to the ML. When a queue fills up, iommu_flush_tlb_all() is called and
every entry that was unmapped before can be released. This works well on
x86; are there reasons it wouldn't on ARM?
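(Rough sketch of the deferred-flush idea, assuming a simple fixed-size
queue; the names and the fq_release_all() helper are illustrative, not
taken from the posted series:)

	struct flush_queue_entry {
		unsigned long iova_pfn;
		unsigned long pages;
	};

	struct flush_queue {
		unsigned int next;
		struct flush_queue_entry entries[FQ_SIZE];
	};

	static void fq_add(struct iommu_domain *domain, struct flush_queue *fq,
			   unsigned long iova_pfn, unsigned long pages)
	{
		if (fq->next == FQ_SIZE) {
			/* Queue full: one flush-all makes every queued
			 * range safe to reuse, then all entries go back
			 * to the iova allocator. */
			iommu_flush_tlb_all(domain);
			fq_release_all(fq);	/* illustrative helper */
			fq->next = 0;
		}

		fq->entries[fq->next].iova_pfn = iova_pfn;
		fq->entries[fq->next].pages    = pages;
		fq->next++;
	}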
Regards,
Joerg