Message-ID: <6fb8bed5-8d40-fd63-4537-44e9eb6aa053@linaro.org>
Date:   Wed, 22 Jun 2022 16:14:45 +0800
From:   Zhangfei Gao <zhangfei.gao@...aro.org>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Jean-Philippe Brucker <jean-philippe@...aro.org>,
        Arnd Bergmann <arnd@...db.de>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Wangzhou <wangzhou1@...ilicon.com>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        linux-accelerators@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
        linux-crypto@...r.kernel.org, iommu@...ts.linux-foundation.org,
        Yang Shen <shenyang39@...wei.com>
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

Hi, Greg

On 2022/6/21 3:44 PM, Greg Kroah-Hartman wrote:
> On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote:
>>
>> On 2022/6/20 9:36 PM, Greg Kroah-Hartman wrote:
>>> On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
>>>> On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
>>>>>> The refcount only ensures that the uacce_device object is not freed as
>>>>>> long as there are open fds. But uacce_remove() can run while there are
>>>>>> open fds, or fds in the process of being opened. And after uacce_remove()
>>>>>> runs, the uacce_device object still exists but is mostly unusable. For
>>>>>> example once the module is freed, uacce->ops is not valid anymore. But
>>>>>> currently uacce_fops_open() may dereference the ops in this case:
>>>>>>
>>>>>> 	uacce_fops_open()
>>>>>> 	 if (!uacce->parent->driver)
>>>>>> 	 /* Still valid, keep going */		
>>>>>> 	 ...					rmmod
>>>>>> 						 uacce_remove()
>>>>>> 	 ...					 free_module()
>>>>>> 	 uacce->ops->get_queue() /* BUG */
>>>>> uacce_remove should wait for uacce->queues_lock until fops_open releases
>>>>> the lock.
>>>>> If open happens just after uacce_remove unlocks, uacce_bind_queue in open
>>>>> should fail.
>>>> Ah yes sorry, I lost sight of what this patch was adding. But we could
>>>> have the same issue with the patch, just in a different order, no?
>>>>
>>>> 	uacce_fops_open()
>>>> 	 uacce = xa_load()
>>>> 	 ...					rmmod
>>> Um, how is rmmod called if the file descriptor is open?
>>>
>>> That should not be possible if the owner of the file descriptor is
>>> properly set.  Please fix that up.
>> Thanks Greg
>>
>> Setting the cdev owner, or using module_get/put, can block rmmod once
>> fops_open has run.
>> -       uacce->cdev->owner = THIS_MODULE;
>> +       uacce->cdev->owner = uacce->parent->driver->owner;
>>
>> However, I still have not found a good method to block removal of the
>> parent pci device.
>>
>> $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
>>
>> [   32.563350]  uacce_remove+0x6c/0x148
>> [   32.563353]  hisi_qm_uninit+0x12c/0x178
>> [   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
>> [   32.563361]  pci_device_remove+0x44/0xd8
>> [   32.563364]  device_remove+0x54/0x88
>> [   32.563367]  device_release_driver_internal+0xec/0x1a0
>> [   32.563370]  device_release_driver+0x20/0x30
>> [   32.563372]  pci_stop_bus_device+0x8c/0xe0
>> [   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
>> [   32.563378]  remove_store+0x9c/0xb0
>> [   32.563379]  dev_attr_store+0x20/0x38
> Removing the parent pci device does not remove the module code, it
> removes the device itself.  Don't confuse code vs. data here.

Do you mean that even if the parent pci device is removed immediately, the
code still has to wait, for example for dma to stop?

Currently the parent driver has to ensure all dma has stopped before it
calls uacce_remove, i.e. after uacce_fops_open succeeds, the parent driver
needs to wait for fops_release, and only then can uacce_remove be called.
For example:
drivers/crypto/hisilicon/zip/zip_main.c:
hisi_qm_wait_task_finish

If this wait is removed, there may be other issues, for example:
Unable to handle kernel paging request at virtual address ffff80000b700204
pc : hisi_qm_cache_wb.part.0+0x2c/0xa0

So uacce only needs to serialize uacce_fops_open and uacce_remove.
After uacce_fops_open, can we assume that uacce_remove only happens after
uacce_fops_release?
Then it would be much simpler.
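
To make the question concrete, here is a rough user-space C model of that
assumption. It is only a sketch, not the uacce code: struct model_uacce, the
model_* functions, the removed flag and the -EBUSY return are all
hypothetical names and choices made for illustration, and a pthread mutex
stands in for the lock that serializes open and remove.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; NOT the kernel's struct uacce_device. */
struct model_uacce {
	pthread_mutex_t lock;	/* serializes open against remove */
	bool removed;		/* set by model_remove() under the lock */
	int open_queues;	/* queues currently held by open fds */
};

#define MODEL_UACCE_INIT \
	{ .lock = PTHREAD_MUTEX_INITIALIZER, .removed = false, .open_queues = 0 }

/* Models uacce_fops_open(): fails once the device has been removed. */
static int model_open(struct model_uacce *u)
{
	int ret = 0;

	assert(u != NULL);
	pthread_mutex_lock(&u->lock);
	if (u->removed)
		ret = -ENODEV;
	else
		u->open_queues++;
	pthread_mutex_unlock(&u->lock);
	return ret;
}

/* Models uacce_fops_release(): drops one queue held by an open fd. */
static void model_release(struct model_uacce *u)
{
	assert(u != NULL);
	pthread_mutex_lock(&u->lock);
	if (u->open_queues > 0)
		u->open_queues--;
	pthread_mutex_unlock(&u->lock);
}

/*
 * Models uacce_remove() under the assumption asked about above: the parent
 * driver has already waited for every release. The -EBUSY return is an
 * illustrative way to flag a caller that broke that assumption.
 */
static int model_remove(struct model_uacce *u)
{
	int ret = 0;

	assert(u != NULL);
	pthread_mutex_lock(&u->lock);
	if (u->open_queues > 0)
		ret = -EBUSY;
	else
		u->removed = true;
	pthread_mutex_unlock(&u->lock);
	return ret;
}
```

In this model an open that takes the lock before remove succeeds, and one
that takes it after remove fails, so the lock is the only serialization the
framework needs. Everything else depends on the parent driver's wait.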

Thanks

