Message-ID: <CAPcyv4gK82tpNWqwF-CFGPWU99WU-Sd84Y79zuQxMfZh1efoMQ@mail.gmail.com>
Date: Thu, 8 Oct 2020 01:35:41 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Daniel Vetter <daniel.vetter@...ll.ch>
Cc: Jason Gunthorpe <jgg@...pe.ca>,
DRI Development <dri-devel@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>,
KVM list <kvm@...r.kernel.org>, Linux MM <linux-mm@...ck.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
linux-samsung-soc <linux-samsung-soc@...r.kernel.org>,
"Linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
linux-s390 <linux-s390@...r.kernel.org>,
Daniel Vetter <daniel.vetter@...el.com>,
Kees Cook <keescook@...omium.org>,
Andrew Morton <akpm@...ux-foundation.org>,
John Hubbard <jhubbard@...dia.com>,
Jérôme Glisse <jglisse@...hat.com>,
Jan Kara <jack@...e.cz>, Bjorn Helgaas <bhelgaas@...gle.com>,
Linux PCI <linux-pci@...r.kernel.org>
Subject: Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Thu, Oct 8, 2020 at 1:13 AM Daniel Vetter <daniel.vetter@...ll.ch> wrote:
>
> On Thu, Oct 8, 2020 at 9:50 AM Dan Williams <dan.j.williams@...el.com> wrote:
> >
> > On Wed, Oct 7, 2020 at 4:25 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
> > >
> > > On Wed, Oct 07, 2020 at 12:33:06PM -0700, Dan Williams wrote:
> > > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter <daniel.vetter@...ll.ch> wrote:
> > > > >
> > > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > > > > > the region") /dev/mem zaps ptes when the kernel requests exclusive
> > > > > > access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > > > > the default for all driver uses.
> > > > >
> > > > > Except there's two more ways to access pci bars: sysfs and proc mmap
> > > > > support. Let's plug that hole.
> > > >
> > > > > Ooh, yes, let's.
> > > >
> > > > >
> > > > > For revoke_devmem() to work we need to link our vma into the same
> > > > > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > > > > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > > > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > > > > at ->open time, but that's a bit tricky here with all the entry points
> > > > > and arch code. So instead create a fake file and adjust vma->vm_file.
> > > >
> > > > I don't think you want to share the devmem inode for this, this should
> > > > be based off the sysfs inode which I believe there is already only one
> > > > instance per resource. In contrast /dev/mem can have multiple inodes
> > > > because anyone can just mknod a new character device file, the same
> > > > problem does not exist for sysfs.
> > >
> > > The inode does not come from the filesystem; char/mem.c creates a
> > > singular anon inode in devmem_init_inode()
> >
> > That's not quite right. An inode does come from the filesystem; I just
> > arranged for that inode's i_mapping to be set to a common instance.
> >
> > > Seems OK to use this more widely, but it feels a bit weird to live in
> > > char/mem.c.
> >
> > Sure, now that more users have arrived it should move somewhere common.
> >
> > > This is what got me thinking maybe this needs to be a bit bigger
> > > generic infrastructure - eg enter this scheme from fops mmap and
> > > everything else is in mm/user_iomem.c
> >
> > It still requires every file that can map physical memory to have its
> > ->open fop do
> >
> > inode->i_mapping = devmem_inode->i_mapping;
> > filp->f_mapping = inode->i_mapping;
> >
> > I don't see how you can centralize that part.
>
> btw, why are you setting inode->i_mapping? The inode is already
> published, changing that looks risky. And I don't think it's needed,
> vma_link() only looks at filp->f_mapping, and in our drm_open() we
> only set that one.
I think you're right that it is unnecessary for devmem, but I don't think
it's dangerous to do it at the very first open, before anything is
using the address space. It's copy-paste from what all the other
"shared address space" implementers do, for example block devices in
bd_acquire(). However, the rationale for block devices to do it is so
that page cache pages can be associated with the address space in the
absence of an f_mapping. Without filesystem page writeback to
coordinate, I don't see any devmem code paths that would operate on
inode->i_mapping.
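
To make the distinction concrete, here is a toy userspace model in C. It
is only a sketch under stated assumptions: the structs are cut-down
stand-ins for the kernel ones, and every model_* function, the nr_vmas
counter, and shared_iomem_mapping are hypothetical names invented for
illustration, not kernel API. It shows that linking a vma only consults
filp->f_mapping, so both the "set both pointers" open and the "set only
f_mapping" open end up in the shared address space that a revoke walks.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (NOT the real layouts). */
struct address_space { int nr_vmas; };          /* hypothetical counter */
struct inode {
	struct address_space i_data;            /* the inode's own mapping */
	struct address_space *i_mapping;        /* normally &i_data */
};
struct file {
	struct inode *f_inode;
	struct address_space *f_mapping;
};
struct vm_area_struct { struct file *vm_file; };

/* Stand-in for the common devmem_inode->i_mapping instance. */
struct address_space shared_iomem_mapping;

void model_inode_init(struct inode *inode)
{
	inode->i_data.nr_vmas = 0;
	inode->i_mapping = &inode->i_data;
}

/* Open variant that redirects both pointers, as in the quoted snippet. */
void model_open_set_both(struct file *filp, struct inode *inode)
{
	filp->f_inode = inode;
	inode->i_mapping = &shared_iomem_mapping;
	filp->f_mapping = inode->i_mapping;
}

/* Open variant that leaves the published inode alone. */
void model_open_set_f_mapping_only(struct file *filp, struct inode *inode)
{
	filp->f_inode = inode;
	filp->f_mapping = &shared_iomem_mapping;
}

/* Model of linking a vma: only vm_file->f_mapping is consulted. */
void model_vma_link(struct vm_area_struct *vma)
{
	assert(vma->vm_file != NULL);
	vma->vm_file->f_mapping->nr_vmas++;
}

/* Model of a revoke: drop every vma linked into the shared mapping. */
int model_revoke(void)
{
	int zapped = shared_iomem_mapping.nr_vmas;

	shared_iomem_mapping.nr_vmas = 0;
	return zapped;
}
```

In this model the only observable difference between the two open
variants is whether inode->i_mapping still points at the inode's own
i_data afterwards; the revoke sees the vma either way.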