linux-kernel - Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210511091452.721e9a03@jacob-builder>
Date:   Tue, 11 May 2021 09:14:52 -0700
From:   Jacob Pan <jacob.jun.pan@...ux.intel.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        iommu@...ts.linux-foundation.org, Joerg Roedel <joro@...tes.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Jean-Philippe Brucker <jean-philippe@...aro.com>,
        Christoph Hellwig <hch@...radead.org>,
        Yi Liu <yi.l.liu@...el.com>, Raj Ashok <ashok.raj@...el.com>,
        "Tian, Kevin" <kevin.tian@...el.com>,
        Dave Jiang <dave.jiang@...el.com>, wangzhou1@...ilicon.com,
        zhangfei.gao@...aro.org, vkoul@...nel.org,
        jacob.jun.pan@...ux.intel.com,
        David Woodhouse <dwmw2@...radead.org>
Subject: Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit
 flags

Hi Jason,

On Tue, 11 May 2021 08:48:48 -0300, Jason Gunthorpe <jgg@...dia.com> wrote:

> On Mon, May 10, 2021 at 08:31:45PM -0700, Jacob Pan wrote:
> > Hi Jason,
> > 
> > On Mon, 10 May 2021 20:37:49 -0300, Jason Gunthorpe <jgg@...dia.com>
> > wrote: 
> > > On Mon, May 10, 2021 at 06:25:07AM -0700, Jacob Pan wrote:
> > >   
> > > > +/*
> > > > + * The IOMMU_SVA_BIND_SUPERVISOR flag requests a PASID which can be
> > > > used only
> > > > + * for access to kernel addresses. No IOTLB flushes are
> > > > automatically done
> > > > + * for kernel mappings; it is valid only for access to the kernel's
> > > > static
> > > > + * 1:1 mapping of physical memory — not to vmalloc or even module
> > > > mappings.
> > > > + * A future API addition may permit the use of such ranges, by
> > > > means of an
> > > > + * explicit IOTLB flush call (akin to the DMA API's unmap method).
> > > > + *
> > > > + * It is unlikely that we will ever hook into
> > > > flush_tlb_kernel_range() to
> > > > + * do such IOTLB flushes automatically.
> > > > + */
> > > > +#define IOMMU_SVA_BIND_SUPERVISOR       BIT(0)    
> > > 
> > > Huh? That isn't really SVA, can you call it something saner please?
> > >   
> > This is shared kernel virtual address, I am following the SVA lib naming
> > since this is where the flag will be used. Why this is not SVA? Kernel
> > virtual address is still virtual address. Is it due to direct map?  
> 
> As the above explains it doesn't actually synchronize the kernel's
> address space it just shoves the direct map into the IOMMU.
> 
There is no duplicated kernel direct map in IOMMU.

> I suppose a different IOMMU implementation might point the PASID directly
> at the kernel's page table and avoid those limitations - but since
> that isn't portable it seems irrelevant.
> 
This is what we are doing here. We allocate a supervisor PASID and put
the kernel page table (init_mm pgd) in this PASID entry.

> Since the only thing it really maps is the direct map I would just
> call it direct_map, or all physical or something.
> 
Good idea. It makes things clear to the callers. They must only use direct
map memory for DMA.

> How does this interact with the DMA APIs?
DMA API would use RID2PASID (PASID 0), so it is separated by PASIDs.

> How do you get CPU cache
> flushing/etc into PASID operations that don't trigger IOMMU updates?
> 
Sorry, I am not following. This is used for direct map only.

> Honestly, I'm not convinced we should have "kernel SVA" at all.. Why
> does IDXD use normal DMA on the RID for kernel controlled accesses?
> 
Using SVA simplifies the work submission, there is no need to do map/unmap.
Just bind PASID with init_mm, then submit work directly either with ENQCMDS
(supervisor version of ENQCMD) to a shared workqueue or put the supervisor
PASID in the descriptor for dedicated workqueue.

> > > Is it really a PASID that always has all of physical memory mapped
> > > into it? Sounds dangerous. What is it for?  
> > 
> > Yes. It is to bind DMA request w/ PASID with init_mm/init_top_pgt. Per
> > PCIe spec PASID TLP prefix has "Privileged Mode Requested" bit. VT-d
> > supports this with "Privileged-mode-Requested (PR) flag (to distinguish
> > user versus supervisor access)". Each PASID entry has a SRE (Supervisor
> > Request Enable) bit.  
> 
> The PR flag is only needed if the underlying IOMMU is directly
> processing the CPU page tables. For cases where the IOMMU is using its
> own page table format and has its own copies the PR flag shouldn't be
> used.
> 
We are doing the former case. There is no IOMMU page tables for the direct
map.

> Jason


Thanks,

Jacob