[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f095ec27-cd35-0d00-3a86-bdff69268371@huawei.com>
Date: Thu, 23 Feb 2023 23:09:03 +0800
From: Weifeng Su <suweifeng1@...wei.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: <mst@...hat.com>, <linux-kernel@...r.kernel.org>,
<shikemeng@...wei.com>, <liuzhiqiang26@...wei.com>,
<linfeilong@...wei.com>, <zhanghongtao22@...wei.com>
Subject: Re: [PATCH v2] uio:uio_pci_generic:Don't clear master bit when the
process does not exit
On 2023/2/21 20:48, Weifeng Su wrote:
> On 2023/2/21 2:11, Greg KH wrote:
>> On Tue, Feb 21, 2023 at 01:10:44AM +0800, Su Weifeng wrote:
>>> From: Weifeng Su <suweifeng1@...wei.com>
>>>
>>> The /dev/uioX device has concurrent operations in a few scenarios.
>>>
>>> For example, when a process using the device exits abnormally,
>>> the management program starts the same process to operate the device.
>>> When the process exits and closes the /dev/uioX device,
>>> the master bit of the device is cleared. In this case, if the
>>> new process is issuing commands, I/Os are suspended and cannot be
>>> automatically recovered.
>>>
>>> Therefore, reference counting is added to clear the master bit
>>> only when the last process exits.
>>>
>>> Signed-off-by: Weifeng Su <suweifeng1@...wei.com>
>>> ---
>>> The difference between the first patch and the first patch is that
>>> the reference counting operation is performed using the atomic
>>> semantics,
>>> just like other drivers under UIO:
>>> cdfa835c6e5e87d145f("uio_hv_generic: defer opening vmbus until first
>>> use").
>>
>> And I would claim that that change too is incorrect.
>>
>> Did you test this with dup()? Lots of open/close cycles on the same
>> device node? Passing around the file descriptor?
> The patch is verified to be able to fix the low-probability problem in
> our scenario. At the same time, we also perform the dup test on the
> latest kernel(6.2.0-rc8-g0f5e65cd8f9b-dirty) based on your suggestions.
> The test code model is as follows:
> ...
> int main()
> {
> int fd = open("/dev/uio0", O_RDWR);
> if (fd < 0) {
> printf("error in open /dev/uio0\n");
> return -1;
> }
> int dup_fd = dup(fd);
> while (true) {
> sleep(1);
> }
> }
>
> After kill the process, The test results are as follows:
> [ 335.730916] swf call uio open
> [ 338.307306] swf call uio release
>
> dup does not cause uio_pci_generic.c:open or uio_pci_generic.c:release
> reference counting exceptions.
> PS: In this example, a print is added to the entries of uio_open and
> uio_release of the driver to confirm the function invoking status.
>
> After reading the "dup" code, we confirm that the struct file uses the
> f_count protect the same file in the process. The file opened by the
> "dup" only adds reference counting and does not invoke the open
> operation of the driver. Disabling the fd only decrementes the f_count
> count. The release function of the driver is invoked to clear resources
> only when f_count is 0. Different processes use different struct files
> to open the same file, and f_count cannot enable the constraint
> function. In this case, the driver needs to handle the concurrency
> problem of multiple processes,It's like this patch
> cdfa835c6e5e87d145f("uio_hv_generic: defer opening vmbus until first use")
>>
>> Logically, this is identical to your previous change, so why should it
>> be accepted?
>>
>> Again, why not just use a real PCI driver for your device?
> We use the uio_pci_generic driver because we use the DPDK, which is a
> user-mode development platform on which you can develop the user-mode
> driver.
>>
>> thanks,
>>
>> greg k-h
>
> Best regards,
> Weifeng Su
Hi All,
Do I need to perform more tests on the patch based on the v2 version to
verify it?
Best regards,
Weifeng Su
Powered by blists - more mailing lists