Message-ID: <YGW27KFt9eQB9X2z@myrica>
Date: Thu, 1 Apr 2021 14:05:00 +0200
From: Jean-Philippe Brucker <jean-philippe@...aro.org>
To: "Liu, Yi L" <yi.l.liu@...el.com>
Cc: Jason Gunthorpe <jgg@...dia.com>,
"Tian, Kevin" <kevin.tian@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Joerg Roedel <joro@...tes.org>,
Lu Baolu <baolu.lu@...ux.intel.com>,
David Woodhouse <dwmw2@...radead.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
Johannes Weiner <hannes@...xchg.org>,
Jean-Philippe Brucker <jean-philippe@...aro.com>,
Alex Williamson <alex.williamson@...hat.com>,
Eric Auger <eric.auger@...hat.com>,
Jonathan Corbet <corbet@....net>,
"Raj, Ashok" <ashok.raj@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>
Subject: Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and
allocation APIs
On Thu, Apr 01, 2021 at 07:04:01AM +0000, Liu, Yi L wrote:
> > - how about AMD and ARM's vSVA support? Their PASID allocation and page
> > table happens within guest. They only need to bind the guest PASID table to
> > host.
In this case each VM has its own IOASID space, and the host IOASID
allocator doesn't participate. Plus this only makes sense when assigning a
whole VF to a guest, and VFIO is the tool for this. So I wouldn't shoehorn
those ops into /dev/ioasid, though we do need a transport for invalidate
commands.
> > Above model seems unable to fit them. (Jean, Eric, Jacob please feel free
> > to correct me)
> > - this per-ioasid SVA operations is not aligned with the native SVA usage
> > model. Native SVA bind is per-device.
Bare-metal SVA doesn't need /dev/ioasid either. A program uses a device
handle to either ask whether SVA is enabled, or to enable it explicitly.
With or without /dev/ioasid, that step is required. OpenCL uses the first
method: it automatically enables "fine-grain system SVM" if available, and
provides a flag to userspace.
So userspace does not need to know about PASID. It's only one method for
doing SVA (some GPUs context-switch page tables instead).
> After reading your reply in https://lore.kernel.org/linux-iommu/20210331123801.GD1463678@nvidia.com/#t
> So you mean /dev/ioasid FD is per-VM instead of per-ioasid, so above skeleton
> doesn't suit your idea. I drafted the skeleton below to see if we are on the
> same page. But I still believe there is an open question on how to fit ARM
> and AMD's vSVA support into this per-ioasid SVA operation model. Thoughts?
>
> +-----------------------------+-----------------------------------------------+
> | userspace                   | kernel space                                  |
> +-----------------------------+-----------------------------------------------+
> | ioasid_fd =                 | /dev/ioasid does below:                       |
> | open("/dev/ioasid", O_RDWR);| struct ioasid_fd_ctx {                        |
> |                             |     struct list_head ioasid_list;             |
> |                             |     ...                                       |
> |                             | } ifd_ctx; // ifd_ctx is per ioasid_fd        |
> +-----------------------------+-----------------------------------------------+
> | ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
> |       ALLOC, &ioasid);      | struct ioasid_data {                          |
> |                             |     ioasid_t ioasid;                          |
> |                             |     struct list_head device_list;             |
> |                             |     struct list_head next;                    |
> |                             |     ...                                       |
> |                             | } id_data; // id_data is per ioasid           |
> |                             |                                               |
> |                             | list_add(&id_data.next,                       |
> |                             |          &ifd_ctx.ioasid_list);               |
> +-----------------------------+-----------------------------------------------+
> | ioctl(device_fd,            | VFIO does below:                              |
> |       DEVICE_ALLOW_IOASID,  | 1) get ioasid_fd, check if ioasid_fd is valid |
> |       ioasid_fd,            | 2) check if ioasid is allocated from ioasid_fd|
> |       ioasid);              | 3) register device/domain info to /dev/ioasid |
> |                             |    tracked in id_data.device_list             |
> |                             | 4) record the ioasid in VFIO's per-device     |
> |                             |    ioasid list for future security check      |
> +-----------------------------+-----------------------------------------------+
> | ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
> |       BIND_PGTBL,           | 1) find ioasid's id_data                      |
> |       pgtbl_data,           | 2) loop the id_data.device_list and tell iommu|
> |       ioasid);              |    give ioasid access to the devices          |
> +-----------------------------+-----------------------------------------------+
> | ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
> |       UNBIND_PGTBL,         | 1) find ioasid's id_data                      |
> |       ioasid);              | 2) loop the id_data.device_list and tell iommu|
> |                             |    clear ioasid access to the devices         |
> +-----------------------------+-----------------------------------------------+
> | ioctl(device_fd,            | VFIO does below:                              |
> |      DEVICE_DISALLOW_IOASID,| 1) check if ioasid is associated in VFIO's    |
> |       ioasid_fd,            |    device ioasid list.                        |
> |       ioasid);              | 2) unregister device/domain info from         |
> |                             |    /dev/ioasid, clear in id_data.device_list  |
> +-----------------------------+-----------------------------------------------+
> | ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
> |       FREE, ioasid);        | list_del(&id_data.next);                      |
> +-----------------------------+-----------------------------------------------+
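The bookkeeping in the skeleton above can be modelled in plain userspace C to
make the ordering constraints concrete. This is only a sketch under
assumptions, not kernel code and not the proposed uAPI: the function names,
the fixed-size arrays standing in for the list_heads, the integer device
handles, and the rule that FREE fails while a page table is bound or a device
is still attached are all hypothetical choices made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_IOASIDS 8   /* arbitrary limits for this sketch */
#define MAX_DEVICES 4

typedef unsigned int ioasid_t;

/* Per-IOASID state; device_list is modelled as a small array. */
struct ioasid_data {
    bool in_use;
    bool bound;                 /* is a page table currently bound? */
    int devices[MAX_DEVICES];   /* hypothetical device handles */
    size_t nr_devices;
};

/* Per-ioasid_fd state; ioasid_list is modelled as a table indexed by ID. */
struct ioasid_fd_ctx {
    struct ioasid_data ids[MAX_IOASIDS];
};

/* ALLOC: hand out the first free ID, or -1 if the space is exhausted. */
int ioasid_alloc(struct ioasid_fd_ctx *ctx, ioasid_t *out)
{
    for (ioasid_t i = 0; i < MAX_IOASIDS; i++) {
        if (!ctx->ids[i].in_use) {
            ctx->ids[i] = (struct ioasid_data){ .in_use = true };
            *out = i;
            return 0;
        }
    }
    return -1;
}

/* Look up an ID; NULL if it was not allocated from this ctx. */
struct ioasid_data *ioasid_find(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    return (id < MAX_IOASIDS && ctx->ids[id].in_use) ? &ctx->ids[id] : NULL;
}

/* DEVICE_ALLOW_IOASID: only IDs allocated from this ctx can be attached. */
int device_allow_ioasid(struct ioasid_fd_ctx *ctx, int dev, ioasid_t id)
{
    struct ioasid_data *d = ioasid_find(ctx, id);

    if (!d || d->nr_devices == MAX_DEVICES)
        return -1;
    d->devices[d->nr_devices++] = dev;
    return 0;
}

/* DEVICE_DISALLOW_IOASID: drop the device from the ID's device list. */
int device_disallow_ioasid(struct ioasid_fd_ctx *ctx, int dev, ioasid_t id)
{
    struct ioasid_data *d = ioasid_find(ctx, id);

    for (size_t i = 0; d && i < d->nr_devices; i++) {
        if (d->devices[i] == dev) {
            d->devices[i] = d->devices[--d->nr_devices];
            return 0;
        }
    }
    return -1;
}

/* BIND_PGTBL: returns how many attached devices would get access. */
int ioasid_bind_pgtbl(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    struct ioasid_data *d = ioasid_find(ctx, id);

    if (!d || d->bound)
        return -1;
    d->bound = true;
    return (int)d->nr_devices;
}

/* UNBIND_PGTBL: clear the binding for all attached devices. */
int ioasid_unbind_pgtbl(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    struct ioasid_data *d = ioasid_find(ctx, id);

    if (!d || !d->bound)
        return -1;
    d->bound = false;
    return 0;
}

/* FREE: assumed to fail while bound or while devices are attached. */
int ioasid_free(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    struct ioasid_data *d = ioasid_find(ctx, id);

    if (!d || d->bound || d->nr_devices)
        return -1;
    d->in_use = false;
    return 0;
}
```

Walking a zero-initialized struct ioasid_fd_ctx through ALLOC, ALLOW, BIND,
UNBIND, DISALLOW and FREE in that order succeeds at every step, while
attaching a device to an ID that was never allocated from the same context is
rejected, which is the security check the VFIO rows of the table describe.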
Also wondering about:
* Querying IOMMU nesting capabilities before binding page tables (which
  page table formats are supported?). We were planning to have a VFIO cap,
  but I'm guessing we need to go back to the sysfs solution?
* Invalidation, probably an ioasid_fd ioctl?
* Page faults and page responses. They go from and to devices, and don't
  necessarily have a PASID. But vdpa needs them as well, so do they also go
  through /dev/ioasid?
Thanks,
Jean