Message-ID: <20210225175457.GD250483@xz-x1>
Date:   Thu, 25 Feb 2021 12:54:57 -0500
From:   Peter Xu <peterx@...hat.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Alex Williamson <alex.williamson@...hat.com>, cohuck@...hat.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 10/10] vfio/type1: Register device notifier

On Wed, Feb 24, 2021 at 08:22:16PM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 24, 2021 at 02:55:08PM -0700, Alex Williamson wrote:
> 
> > > > +static bool strict_mmio_maps = true;
> > > > +module_param_named(strict_mmio_maps, strict_mmio_maps, bool, 0644);
> > > > +MODULE_PARM_DESC(strict_mmio_maps,
> > > > +		 "Restrict to safe DMA mappings of device memory (true).");  
> > > 
> > > I think this should be a kconfig, historically we've required kconfig
> > > to opt-in to unsafe things that could violate kernel security. Someone
> > > building a secure boot trusted kernel system should not have an
> > > options for userspace to just turn off protections.
> > 
> > It could certainly be further protected that this option might not
> > exist based on a Kconfig, but I think we're already risking breaking
> > some existing users and I'd rather allow it with an opt-in (like we
> > already do for lack of interrupt isolation), possibly even with a
> > kernel taint if used, if necessary.
> 
> Makes me nervous, security should not be optional.
> 
> > > I'd prefer this was written a bit differently, I would like it very
> > > much if this doesn't mis-use follow_pte() by returning pfn outside
> > > the lock.
> > > 
> > > vaddr_get_bar_pfn(..)
> > > {
> > >         vma = find_vma_intersection(mm, vaddr, vaddr + 1);
> > > 	if (!vma)
> > >            return -ENOENT;
> > >         if ((vma->vm_flags & VM_DENYWRITE) && (prot & PROT_WRITE)) // Check me
> > >            return -EFAULT;
> > >         device = vfio_device_get_from_vma(vma);
> > > 	if (!device)
> > >            return -ENOENT;
> > > 
> > > 	/*
> > >          * Now do the same as vfio_pci_mmap_fault() - the vm_pgoff must
> > > 	 * be the physical pfn when using this mechanism. Delete follow_pte entirely()
> > >          */
> > >         pfn = (vaddr - vma->vm_start)/PAGE_SIZE + vma->vm_pgoff
> > > 	
> > >         /* de-dup device and record that we are using device's pages in the
> > > 	   pfnmap */
> > >         ...
> > > }
> > 
> > 
> > This seems to undo both:
> > 
> > 5cbf3264bc71 ("vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()")
> 
> No, the bug this commit described is fixed by calling
> vfio_device_get_from_vma() which excludes all non-VFIO VMAs already.
> 
> We can assert that the vm_pgoff is in a specific format because it is
> a VFIO owned VMA and must follow the rules to be part of the address
> space. See my last email
> 
> Here I was suggesting to use the vm_pgoff == PFN rule, but since
> you've clarified that doesn't work we'd have to determine the PFN from
> the region number through the vfio_device instead.
> 
> > (which also suggests we are going to break users without the module
> > option opt-in above)
> 
> Not necessarily, this is complaining vfio crashes, it doesn't say they
> actually needed the IOMMU to work on those VMAs because they are doing
> P2P DMA.
> 
> I think, if this does break someone, they are on a real fringe and
> must have already modified their kernel, so a kconfig is the right
> approach. It is pretty hard to get non-GUP'able DMA'able memory into a
> process with the stock kernel.
> 
> Generally speaking, I think Linus has felt security bug fixes like
> this are more on the OK side of things to break fringe users.
> 
> > And:
> > 
> > 41311242221e ("vfio/type1: Support faulting PFNMAP vmas")
> > 
> > So we'd have an alternate path in the un-safe mode and we'd lose the
> > ability to fault in mappings.
> 
> As above we already exclude VMAs that are not from VFIO, and VFIO
> sourced VMA's do not meaningfully implement fault for this use
> case. So calling fixup_user_fault() is pointless.
> 
> Peter just did this so we could ask him what it was for..
> 
> I feel pretty strongly that removing the call to follow_pte is
> important here. Even if we do cover all the issues with mis-using the
> API it just makes a maintenance problem to leave it in.

I can't say I fully understand the whole rationale behind 5cbf3264bc71, but that
commit still sounds reasonable to me, since I don't see why VFIO cannot do
VFIO_IOMMU_MAP_DMA on another memory range that is neither anonymous memory
nor a vfio-mapped MMIO range.  In those cases, the vm_pgoff namespace defined by
vfio may no longer hold, iiuc.
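
As a rough illustration of that concern, here is a userspace sketch, not kernel
code.  struct sketch_vma, its vfio_owned flag and sketch_vaddr_to_pfn() are
made-up names, 4 KiB pages are assumed, and the arithmetic is the
vm_pgoff == PFN rule as originally suggested in the quoted pseudocode (which,
per the discussion above, may not match what vfio-pci actually stores in
vm_pgoff).  The point is only that the computed value means something when the
VMA's owner guarantees the encoding:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the few vm_area_struct fields used here. */
struct sketch_vma {
	uint64_t vm_start;   /* first virtual address covered */
	uint64_t vm_end;     /* one past the last virtual address covered */
	uint64_t vm_pgoff;   /* page offset; a PFN only if the owner says so */
	bool     vfio_owned; /* stand-in for vfio_device_get_from_vma() succeeding */
};

/*
 * Returns 0 and stores the PFN in *pfn on success.  Returns -1 if vaddr is
 * outside the VMA or if the VMA is not vfio-owned, in which case vm_pgoff
 * follows some other provider's convention and cannot be read as a PFN.
 */
static int sketch_vaddr_to_pfn(const struct sketch_vma *vma, uint64_t vaddr,
			       uint64_t *pfn)
{
	if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
		return -1;
	if (!vma->vfio_owned)
		return -1;

	*pfn = ((vaddr - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
	return 0;
}
```

For a VMA from any other provider the same arithmetic would still produce a
number, just not one that can be trusted as a physical frame, which is why the
sketch refuses instead of guessing.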

Then, if we keep that follow_pfn() for non-vfio mappings, it also seems very
reasonable to have 41311242221e, or something similar as proposed by Alex, to
make sure the pte is installed before calling it, for either vfio or other vma
providers.

Or does it mean that we don't want to allow VFIO DMA to those unknown memory
backends, for some reason?

Thanks,

-- 
Peter Xu
