[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4jGxsB5so8mKqYrsn2CEc7nO2yPvzZZ_mvM_-R=BZfKHg@mail.gmail.com>
Date: Wed, 7 Oct 2020 15:23:57 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Daniel Vetter <daniel.vetter@...ll.ch>
Cc: DRI Development <dri-devel@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>,
KVM list <kvm@...r.kernel.org>, Linux MM <linux-mm@...ck.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
linux-samsung-soc <linux-samsung-soc@...r.kernel.org>,
"Linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
linux-s390 <linux-s390@...r.kernel.org>,
Daniel Vetter <daniel.vetter@...el.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Kees Cook <keescook@...omium.org>,
Andrew Morton <akpm@...ux-foundation.org>,
John Hubbard <jhubbard@...dia.com>,
Jérôme Glisse <jglisse@...hat.com>,
Jan Kara <jack@...e.cz>, Bjorn Helgaas <bhelgaas@...gle.com>,
Linux PCI <linux-pci@...r.kernel.org>
Subject: Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 12:49 PM Daniel Vetter <daniel.vetter@...ll.ch> wrote:
>
> On Wed, Oct 7, 2020 at 9:33 PM Dan Williams <dan.j.williams@...el.com> wrote:
> >
> > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter <daniel.vetter@...ll.ch> wrote:
> > >
> > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > > the default for all driver uses.
> > >
> > > Except there's two more ways to access pci bars: sysfs and proc mmap
> > > support. Let's plug that hole.
> >
> > Ooh, yes, lets.
> >
> > > For revoke_devmem() to work we need to link our vma into the same
> > > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > > at ->open time, but that's a bit tricky here with all the entry points
> > > and arch code. So instead create a fake file and adjust vma->vm_file.
> >
> > I don't think you want to share the devmem inode for this, this should
> > be based off the sysfs inode which I believe there is already only one
> > instance per resource. In contrast /dev/mem can have multiple inodes
> > because anyone can just mknod a new character device file, the same
> > problem does not exist for sysfs.
>
> But then I need to find the right one, plus I also need to find the
> right one for the procfs side. That gets messy, and I already have no
> idea how to really test this. Shared address_space is the same trick
> we're using in drm (where we have multiple things all pointing to the
> same underlying resources, through different files), and it gets the
> job done. So that's why I figured the shared address_space is the
> cleaner solution since then unmap_mapping_range takes care of
> iterating over all vma for us. I guess I could reimplement that logic
> with our own locking and everything in revoke_devmem, but feels a bit
> silly. But it would also solve the problem of having mutliple
> different mknod of /dev/kmem with different address_space behind them.
> Also because of how remap_pfn_range works, all these vma do use the
> same pgoff already anyway.
True, remap_pfn_range() makes sure that ->pgoff is an absolute
physical address offset for all use cases. So you might be able to
just point proc_bus_pci_open() at the shared devmem address space. For
sysfs it's messier. I think you would need to somehow get the inode
from kernfs_fop_open() to adjust its address space, but only if the
bin_file will ultimately be used for PCI memory.
Powered by blists - more mailing lists