Message-ID: <YlayFROf5P294P/P@kroah.com>
Date:   Wed, 13 Apr 2022 13:20:53 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Yao Hongbo <yaohongbo@...ux.alibaba.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        alikernel-developer@...ux.alibaba.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] uio/uio_pci_generic: Introduce refcnt on open/release

On Wed, Apr 13, 2022 at 07:09:57PM +0800, Yao Hongbo wrote:
> 
> On 2022/4/13 5:43 PM, Greg KH wrote:
> > On Wed, Apr 13, 2022 at 05:25:40PM +0800, Yao Hongbo wrote:
> > > On 2022/4/13 4:51 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 13, 2022 at 09:33:17AM +0200, Greg KH wrote:
> > > > > On Wed, Apr 13, 2022 at 03:01:42PM +0800, Yao Hongbo wrote:
> > > > > > If two userspace programs both open the PCI UIO fd and one of
> > > > > > them exits uncleanly, the other will hit an I/O hang because
> > > > > > bus mastering has been disabled.
> > > > > > 
> > > > > > SPDK/DPDK commonly use UIO. So, introduce a refcount on
> > > > > > open/release to avoid such problems.
> > > > > Why do you have multiple userspace programs opening the same device?
> > > > > Shouldn't they coordinate?
> > > > Or to restate, I think the question is, why not open the device
> > > > once and pass the FD around?
> > > Hmm, the result will be the same whether we open the same device
> > > again or pass the FD around.
> > How?  You only open once, and close once.  Where are the multiple closes?
> > 
> > > Our expectation is that even if the primary process exits abnormally, the
> > > second process can still send or receive data.
> > Then use the same file descriptor.
> 
> Yes, we can use the same file descriptor.
> 
> But since PCIe bus mastering has been disabled by the primary process,
> the secondary process cannot continue to operate.

Really?  With the same file descriptor?  Try it and see.  release should
only be called when the file descriptor is closed.
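
The point about a shared descriptor can be tried from userspace. The
following is a minimal sketch, not the SPDK code path: it assumes Python
3.9+ on Linux, uses a temporary file as a stand-in for the UIO device node,
and the function name pass_fd_demo is hypothetical. It passes one open file
to a second holder with SCM_RIGHTS, closes the first holder's descriptor,
and shows that the second holder keeps working on the same open file. A
second, independent open() is included for contrast.

```python
import os
import socket
import tempfile


def pass_fd_demo():
    """Share one open file via SCM_RIGHTS, then contrast with a second open()."""
    # Assumption: a temp file stands in for the UIO device node.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()

        primary_fd = os.open(tmp.name, os.O_RDONLY)  # the single open()
        a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            # The "primary" hands its descriptor to the "secondary".
            socket.send_fds(a, [b"fd"], [primary_fd])
            _msg, fds, _flags, _addr = socket.recv_fds(b, 16, 1)
            secondary_fd = fds[0]

            first = os.read(primary_fd, 4)      # advances the shared offset
            os.close(primary_fd)                # the primary goes away
            second = os.read(secondary_fd, 4)   # continues after the primary
            os.close(secondary_fd)              # last reference is dropped here

            # Contrast: a separate open() creates an independent open file,
            # with its own offset and its own release on close.
            other_fd = os.open(tmp.name, os.O_RDONLY)
            independent = os.read(other_fd, 4)
            os.close(other_fd)
        finally:
            a.close()
            b.close()
    return first, second, independent


if __name__ == "__main__":
    print(pass_fd_demo())
```

The passed descriptor refers to the same open file as the original, so the
kernel calls the driver's release callback only when the last such
reference is dropped. Two separate open() calls each get their own open
file, and each one triggers release when it is closed.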

> > > The impact of disabling PCI bus mastering is relatively large, and we should
> > > place some restrictions on this behavior.
> > Why?  UIO is "you better really really know what you are doing to use
> > this interface", right?  Just duplicate the fd and pass it around if you
> > must have multiple accesses to the same device.
> > 
> > And again, this will be a functional change.  How can you handle your
> > userspace on older kernels if you make this change?
> 
> Without this change, our userspace cannot work properly on older kernels.

What change broke your userspace?

> Our userspace only uses the "multi-process mode" feature of SPDK.
> 
> The SPDK link:
> https://spdk.io/doc/app_overview.html
> 
> "Multi process mode
> When --shm-id is specified, the application is started in multi-process
> mode. Applications using the same shm-id share their memory and NVMe
> devices. The first app to start with a given id becomes a primary process,
> with the rest, called secondary processes, only attaching to it. When the
> primary process exits, the secondary ones continue to operate, but no new
> processes can be attached at this point. All processes within the same
> shm-id group must use the same --single-file-segments setting."

Please work with the spdk users, I know nothing about that mess, sorry.

greg k-h
