Message-ID: <20240522183046.GG20229@nvidia.com>
Date: Wed, 22 May 2024 15:30:46 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: Andrew Jones <ajones@...tanamicro.com>, Yan Zhao <yan.y.zhao@...el.com>,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	kevin.tian@...el.com, yishaih@...dia.com,
	shameerali.kolothum.thodi@...wei.com, Peter Xu <peterx@...hat.com>
Subject: Re: [PATCH] vfio/pci: take mmap write lock for io_remap_pfn_range

On Wed, May 22, 2024 at 11:50:06AM -0600, Alex Williamson wrote:
> I'm not sure if there are any outstanding blockers on Peter's side, but
> this seems like a good route from the vfio side.  If we're seeing this
> now without lockdep, we might need to bite the bullet and take the hit
> with vmf_insert_pfn() while the pmd/pud path learn about pfnmaps.

There is another alternative...

Ideally we wouldn't use the fault handler.

Instead, when the MMIO becomes available again we'd iterate over all
the VMAs and do remap_pfn_range(). When the MMIO becomes unavailable
we'd do unmap_mapping_range() and remove it. This whole thing is
synchronous, and the fault handler should simply trigger SIGBUS if
userspace races things.
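
In that synchronous model the fault path collapses to almost nothing.
A sketch of what I mean (illustrative only -- assuming all population
really does go through remap_pfn_range()/unmap_mapping_range() and
never through the fault handler):

	/*
	 * Sketch: with population/unpopulation done synchronously,
	 * any fault means userspace touched the mapping while the
	 * MMIO was unavailable, so just raise SIGBUS.
	 */
	static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
	{
		return VM_FAULT_SIGBUS;
	}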

unmap_mapping_range() is easy, but the remap side doesn't have a
helper today..

Something grotesque like this perhaps?

	while (1) {
		struct mm_struct *cur_mm = NULL;

		/* Find one mm with an unpopulated VMA and pin it */
		i_mmap_lock_read(mapping);
		vma_interval_tree_foreach(vma, mapping->i_mmap, 0, ULONG_MAX) {
			if (vma_populated(vma))
				continue;

			cur_mm = vma->vm_mm;
			mmgrab(cur_mm);
			break;
		}
		i_mmap_unlock_read(mapping);

		if (!cur_mm)
			return;

		/* Remap every VMA belonging to that mm */
		mmap_write_lock(cur_mm);
		i_mmap_lock_read(mapping);
		vma_interval_tree_foreach(vma, mapping->i_mmap, 0, ULONG_MAX) {
			if (vma->vm_mm == cur_mm)
				vfio_remap_vma(vma);
		}
		i_mmap_unlock_read(mapping);
		mmap_write_unlock(cur_mm);
		mmdrop(cur_mm);
	}

I'm pretty sure we need to hold the mmap write lock to do the
remap_pfn_range()..

vma_populated() would have to do an RCU read of the page table to
check whether page 0 is present.
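
Roughly something like this, maybe (hypothetical helper, simplified:
a real version would need the GUP-fast style of lockless walking,
READ_ONCE of each level, and huge page handling):

	/* Check whether the first page of the VMA is present */
	static bool vma_populated(struct vm_area_struct *vma)
	{
		struct mm_struct *mm = vma->vm_mm;
		unsigned long addr = vma->vm_start;
		pgd_t *pgd;
		p4d_t *p4d;
		pud_t *pud;
		pmd_t *pmd;
		pte_t *pte;
		bool present;

		pgd = pgd_offset(mm, addr);
		if (pgd_none(*pgd))
			return false;
		p4d = p4d_offset(pgd, addr);
		if (p4d_none(*p4d))
			return false;
		pud = pud_offset(p4d, addr);
		if (pud_none(*pud))
			return false;
		pmd = pmd_offset(pud, addr);
		if (pmd_none(*pmd))
			return false;
		pte = pte_offset_map(pmd, addr);
		if (!pte)
			return false;
		present = pte_present(*pte);
		pte_unmap(pte);
		return present;
	}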

Also there is a race in mmap if you call remap_pfn_range() from the
mmap fops and also use unmap_mapping_range(). mmap_region() does
call_mmap() before it does vma_link_file(), which gives a window where
the VMA is populated but invisible to unmap_mapping_range(). We'd need
to add another fop to call after vma_link_file() to populate the mmap,
or rely exclusively on the fault handler to populate.

Jason
