Date:   Mon, 27 Sep 2021 21:17:19 +0300
From:   Leon Romanovsky <leonro@...dia.com>
To:     Jason Gunthorpe <jgg@...dia.com>
CC:     Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "mgurtovoy@...dia.com" <mgurtovoy@...dia.com>,
        liulongfang <liulongfang@...wei.com>,
        "Zengtao (B)" <prime.zeng@...ilicon.com>,
        Jonathan Cameron <jonathan.cameron@...wei.com>,
        "Wangzhou (B)" <wangzhou1@...ilicon.com>
Subject: Re: [PATCH v3 6/6] hisi_acc_vfio_pci: Add support for VFIO live migration

On Mon, Sep 27, 2021 at 01:06:27PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 27, 2021 at 07:00:23PM +0300, Leon Romanovsky wrote:
> > On Mon, Sep 27, 2021 at 12:01:19PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Sep 27, 2021 at 01:46:31PM +0000, Shameerali Kolothum Thodi wrote:
> > > 
> > > > > > > Nope, this is locked wrong and has no lifetime management.
> > > > > >
> > > > > > > Ok. Is holding the device_lock() sufficient here?
> > > > > 
> > > > > > You can't hold a hisi_qm pointer without some kind of lifecycle
> > > > > > management of that pointer. device_lock/etc is necessary to call
> > > > > > pci_get_drvdata()
> > > > 
> > > > > Since this migration driver only supports VF devices and the PF
> > > > > driver will not be removed until all the VF devices get removed,
> > > > > is the locking necessary here?
> > > 
> > > Oh.. That is really busted up. pci_sriov_disable() is called under the
> > > device_lock(pf) and obtains the device_lock(vf).
> > 
> > Yes, indirectly, but yes.
> > 
> > > 
> > > This means a VF driver can never use the device_lock(pf), otherwise it
> > > can deadlock itself if PF removal triggers VF removal.
> > 
> > VF can use pci_dev_trylock() on PF to prevent PF removal.
> 
> No, not here; the device_lock is used in too many places for a trylock
> to be appropriate
> 
> > > 
> > > But you can't access these members without using the device_lock(), as
> > > there really are no safety guarantees.
> > > 
> > > The mlx5 patches have this same sketchy problem.
> > > 
> > > We may need a new special function 'pci_get_sriov_pf_devdata()' that
> > > confirms the vf/pf relationship and explicitly interlocks with the
> > > pci_sriov_enable/disable instead of using device_lock()
> > > 
> > > Leon, what do you think?
> > 
> > I see pci_dev_lock() and similar functions; they are easier to
> > understand than a specific pci_get_sriov_pf_devdata().
> 
> That is just a wrapper for device_lock - it doesn't help anything
> 
> The point is to call out a different locking regime that relies on the
> sriov enable/disable removing the VF struct devices

You can't avoid trylock, because this pci_get_sriov_pf_devdata() will be
called from the VF, which already holds its own lock, so an attempt to
take the PF lock there can deadlock.

PCI code assumes that the PF lock is taken first and the VF lock second.
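
To illustrate that ordering, here is a minimal userspace sketch that uses
pthread mutexes as stand-ins for device_lock(pf) and device_lock(vf). It
is an analogy only, not kernel code: pf_lock, vf_lock, pf_remove_vf(),
vf_try_get_pf() and vf_put_pf() are hypothetical names made up for this
example, and it says nothing about whether a trylock is acceptable in
the real PCI paths.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for device_lock(pf) and device_lock(vf). */
static pthread_mutex_t pf_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vf_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * PF-side path (think of SR-IOV disable): the established order is
 * PF lock first, VF lock second.
 */
static void pf_remove_vf(void)
{
	pthread_mutex_lock(&pf_lock);
	pthread_mutex_lock(&vf_lock);
	/* ... tear down the VF here ... */
	pthread_mutex_unlock(&vf_lock);
	pthread_mutex_unlock(&pf_lock);
}

/*
 * VF-side path: the caller already holds vf_lock. Blocking on pf_lock
 * here would invert the order above and could deadlock against
 * pf_remove_vf(), so only try the lock and back off on failure.
 * Returns 0 with pf_lock held, or -EBUSY if it could not be taken.
 */
static int vf_try_get_pf(void)
{
	int ret = pthread_mutex_trylock(&pf_lock);

	if (ret == EBUSY)
		return -EBUSY;
	assert(ret == 0);
	return 0;
}

/* Release the PF lock taken by a successful vf_try_get_pf(). */
static void vf_put_pf(void)
{
	pthread_mutex_unlock(&pf_lock);
}
```

A VF-side caller would take vf_lock, call vf_try_get_pf(), and on
-EBUSY drop its work and retry later rather than wait for the PF.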

Thanks

> 
> Jason
