Date:   Thu, 2 Jun 2022 14:18:07 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Logan Gunthorpe <logang@...tatee.com>
Cc:     linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-block@...r.kernel.org, linux-pci@...r.kernel.org,
        linux-mm@...ck.org, iommu@...ts.linux-foundation.org,
        Stephen Bates <sbates@...thlin.com>,
        Christoph Hellwig <hch@....de>,
        Dan Williams <dan.j.williams@...el.com>,
        Christian König <christian.koenig@....com>,
        John Hubbard <jhubbard@...dia.com>,
        Don Dutile <ddutile@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Jakowski Andrzej <andrzej.jakowski@...el.com>,
        Minturn Dave B <dave.b.minturn@...el.com>,
        Jason Ekstrand <jason@...kstrand.net>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Xiong Jianxin <jianxin.xiong@...el.com>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Ira Weiny <ira.weiny@...el.com>,
        Robin Murphy <robin.murphy@....com>,
        Martin Oliveira <martin.oliveira@...eticom.com>,
        Chaitanya Kulkarni <ckulkarnilinux@...il.com>,
        Ralph Campbell <rcampbell@...dia.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v6 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()

On Thu, Jun 02, 2022 at 10:45:55AM -0600, Logan Gunthorpe wrote:
>
> On 2022-06-02 10:30, Jason Gunthorpe wrote:
> > On Thu, Jun 02, 2022 at 10:16:10AM -0600, Logan Gunthorpe wrote:
> > 
> >>> Just stuff the pages into the mmap, and your driver unprobe will
> >>> automatically block until all the mmaps are closed - no different than
> >>> having an open file descriptor or something.
> >>
> >> Oh is that what we want?
> > 
> > Yes, it is the typical case - eg if you have a sysfs file open unbind
> > hangs indefinitely. Many drivers can't unbind while they have open file
> > descriptors/etc.
> > 
> > A couple drivers go out of their way to allow unbinding while a live
> > userspace exists but this can get complicated. Usually there should be
> > a good reason.
> 
> This is not my experience. All the drivers I've worked with do not block
> unbind with open file descriptors (at least for char devices). I know,
> for example, that having a file descriptor open of /dev/nvmeX does not
> cause unbinding to block.

So there are lots of bugs in the kernel, and I've seen many drivers
that think calling cdev_device_del() is all they need to do - and then
happily allow cdev ioctl's/etc on a de-initialized driver struct.

Drivers that do take care of this usually have to put a lock around
all their fops to serialize against unbind. RDMA uses SRCU, iirc TPM
used a rwlock. But this is tricky and hurts fops performance.

I don't know what nvme did to protect against this, I didn't notice
an obvious lock.

> I figured this was the expectation as the userspace process doing
> the unbind won't be able to be interrupted seeing there's no way to
> fail on that path. Though, it certainly would make things a lot
> easier if the unbind can block indefinitely as it usually requires
> some complicated locking.

As I said, this is what sysfs does today and I don't see that ever
changing. If your userspace has a sysfs file open then the driver
unbind hangs until the file is closed.

So, doing as bad as sysfs seems like a reasonable baseline to me.

> Do you have an example of this? What mechanisms are developers using to
> block unbind with open file descriptors?

Sysfs maintains a refcount with a bias that is basically a fancy
rwlock. Most places use some kind of refcount triggering a
completion: on unbind, sleep on the completion until the refcount
reaches 0, that kind of thing.
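
The refcount-plus-completion idea can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the sysfs implementation: a mutex and condition variable stand in for the kernel's refcount and completion, and all names (demo_ref, demo_ref_get, demo_ref_put, demo_ref_kill_and_wait) are hypothetical. Opening a file or creating an mmap would take a reference, closing it would drop one, and unbind refuses new users and then sleeps until the count reaches zero:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-tracking object; names are illustrative only. */
struct demo_ref {
	pthread_mutex_t mu;
	pthread_cond_t zero;	/* plays the role of the completion */
	int users;		/* open files, mmaps, ... */
	bool dying;		/* set once unbind has started */
};

static void demo_ref_init(struct demo_ref *r)
{
	pthread_mutex_init(&r->mu, NULL);
	pthread_cond_init(&r->zero, NULL);
	r->users = 0;
	r->dying = false;
}

/* open()/mmap() path: take a reference unless unbind has begun. */
static bool demo_ref_get(struct demo_ref *r)
{
	bool ok;

	pthread_mutex_lock(&r->mu);
	ok = !r->dying;
	if (ok)
		r->users++;
	pthread_mutex_unlock(&r->mu);
	return ok;
}

/* release() path: drop a reference and wake unbind on the last put. */
static void demo_ref_put(struct demo_ref *r)
{
	pthread_mutex_lock(&r->mu);
	assert(r->users > 0);
	if (--r->users == 0)
		pthread_cond_broadcast(&r->zero);
	pthread_mutex_unlock(&r->mu);
}

/* unbind path: block new users, then sleep until the count hits zero. */
static void demo_ref_kill_and_wait(struct demo_ref *r)
{
	pthread_mutex_lock(&r->mu);
	r->dying = true;
	while (r->users > 0)
		pthread_cond_wait(&r->zero, &r->mu);
	pthread_mutex_unlock(&r->mu);
}
```

If a user never calls demo_ref_put(), demo_ref_kill_and_wait() never returns, which is the "unbind hangs until the file is closed" behaviour discussed in this thread.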

Jason
