[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tencent_2922DAB6F3D5789A1CD3A21A843B4007ED09@qq.com>
Date: Sat, 23 Apr 2022 19:13:39 +0800
From: "zhangfei.gao@...mail.com" <zhangfei.gao@...mail.com>
To: Jean-Philippe Brucker <jean-philippe@...aro.org>
Cc: Fenghua Yu <fenghua.yu@...el.com>,
Dave Hansen <dave.hansen@...el.com>,
Joerg Roedel <joro@...tes.org>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
Tony Luck <tony.luck@...el.com>,
Ashok Raj <ashok.raj@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
x86 <x86@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
iommu <iommu@...ts.linux-foundation.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Andy Lutomirski <luto@...nel.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>, will@...nel.org,
robin.murphy@....com, zhangfei.gao@...aro.org
Subject: Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID
allocation and free it on mm exit
Hi, Jean
On 2022/4/22 下午11:50, Jean-Philippe Brucker wrote:
> On Fri, Apr 22, 2022 at 09:15:01PM +0800, zhangfei.gao@...mail.com wrote:
>>> I'm trying to piece together what happens from the kernel point of view.
>>>
>>> * master process with mm A opens a queue fd through uacce, which calls
>>> iommu_sva_bind_device(dev, A) -> PASID 1
>>>
>>> * master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
>>> The device is still bound to mm A with PASID 1, since the queue fd is
>>> still open.
>>> We discussed this before, but I don't remember where we left off. The
>>> child can't use the queue because its mappings are not copied on fork(),
>>> and the queue is still bound to the parent mm A. The child either needs to
>>> open a new queue or take ownership of the old one with a new uacce ioctl.
>> Yes, currently nginx aligned with the case.
>> Child process (worker process) reopen uacce,
>>
>> Master process (do init) open uacce, iommu_sva_bind_device(dev, A) -> PASID
>> 1
>> Master process fork Child (daemon) and exit.
>>
>> Child (daemon) does not use PASID 1 any more, only fork and manage worker
>> process.
>> Worker process reopen uacce, iommu_sva_bind_device(dev, B) PASID 2
>>
>> So it is expected.
> Yes, that's fine
>
>>> Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
>>> tree? This operation should unbind from A before binding to B, no?
>>> Otherwise we leak PASID 1.
>> In 5.16 PASID 1 from master is hold until nginx service stop.
>> nginx start
>> master:
>> iommu_sva_alloc_pasid mm->pasid=1 // master process
>>
>> lynx https start:
>> iommu_sva_alloc_pasid mm->pasid=2 //worker process
>>
>> nginx stop: from fops_release
>> iommu_sva_free_pasid mm->pasid=2 // worker process
>> iommu_sva_free_pasid mm->pasid=1 // master process
> That's the expected behavior (master could close its fd before forking, in
> order to free things up earlier, but it's not mandatory)
Currently we unbind in fops_release, so the ioasid allocated in master
can only be freed when nginx stop,
when all forked fd are closed.
>
>> Have one silly question.
>>
>> kerne driver
>> fops_open
>> iommu_sva_bind_device
>>
>> fops_release
>> iommu_sva_unbind_device
>>
>> application
>> main()
>> fd = open
>> return;
>>
>> Application exit but not close(fd), is it expected fops_release will be
>> called automatically by system?
> Yes, the application doesn't have to call close() explicitly, the file
> descriptor is closed automatically on exit. Note that the fd is copied on
> fork(), so it is only released once parent and all child processes exit.
Yes, in case the application ended unexpected, like ctrl+c
>
>> On 5.17
>> fops_release is called automatically, as well as iommu_sva_unbind_device.
>> On 5.18-rc1.
>> fops_release is not called, have to manually call close(fd)
> Right that's weird
Looks it is caused by the fix patch, via mmget, which may add refcount
of fd.
Some experiments
1. 5.17, everything works well.
2. 5.17 + patchset of "iommu/sva: Assign a PASID to mm on PASID
allocation and free it on mm exit"
Test application, exit without close uacce fd
First time: fops_release can be called automatically.
log:
ioasid_alloc ioasid=1
iommu_sva_alloc_pasid pasid=1
iommu_sva_bind_device handle=00000000263a2ee8
ioasid_free ioasid=1
uacce_fops_release q=0000000055ca3cdf
iommu_sva_unbind_device handle=00000000263a2ee8
Second time: hardware reports error
uacce_fops_open q=000000008e4d6f78
ioasid_alloc ioasid=1
iommu_sva_alloc_pasid pasid=1
iommu_sva_bind_device handle=00000000cfd11788
// Haredware reports error
hisi_sec2 0000:b6:00.0: qm_acc_do_task_timeout [error status=0x20] found
hisi_sec2 0000:b6:00.0: qm_acc_wb_not_ready_timeout [error status=0x40]
found
hisi_sec2 0000:b6:00.0: sec_fsm_hbeat_rint [error status=0x20] found
hisi_sec2 0000:b6:00.0: Controller resetting...
hisi_sec2 0000:b6:00.0: QM mailbox operation timeout!
hisi_sec2 0000:b6:00.0: Failed to dump sqc!
hisi_sec2 0000:b6:00.0: Failed to drain out data for stopping!
hisi_sec2 0000:b6:00.0: Bus lock! Please reset system.
hisi_sec2 0000:b6:00.0: Controller reset failed (-110)
hisi_sec2 0000:b6:00.0: controller reset failed (-110)
3. Add the fix patch of using mmget in bind.
Test application, exit without close uacce fd
fops_release can NOT be called automatically, looks mmget adds refcount
of fd.
So the fix method of using mmget blocks fops_release to be called once
fd is closed without unbind.
>> Since nginx may have a issue, it does not call close(fd) when nginx -s quit.
> And you're sure that none of the processes are still alive or in zombie
> state? Just to cover every possibility.
It can also reproduced by a simple application exit without close(uacce_fd)
Thanks
Powered by blists - more mailing lists