lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240308140440.GN9179@nvidia.com>
Date: Fri, 8 Mar 2024 10:04:40 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux.dev, joro@...tes.org,
	yi.l.liu@...el.com, kevin.tian@...el.com, nicolinc@...dia.com,
	eric.auger@...hat.com, vasant.hegde@....com, jon.grimm@....com,
	santosh.shukla@....com, Dhaval.Giani@....com, pandoh@...gle.com,
	loganodell@...gle.com
Subject: Re: [RFCv2 PATCH 6/7] iommu/amd: Add nested domain allocation support

On Thu, Jan 11, 2024 at 06:06:45PM -0600, Suravee Suthikulpanit wrote:
> @@ -2367,8 +2372,9 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
>  	domain->nid = NUMA_NO_NODE;
>  
>  	switch (type) {
> -	/* No need to allocate io pgtable ops in passthrough mode */
> +	/* No need to allocate io pgtable ops in passthrough and nested mode */
>  	case IOMMU_DOMAIN_IDENTITY:
> +	case IOMMU_DOMAIN_NESTED:

These constants should not show up in the driver, it needs to be
reorganized to use the new allocation APIs to avoid this.

> -static struct iommu_domain *do_iommu_domain_alloc(unsigned int type,
> +static const struct iommu_domain_ops nested_domain_ops = {
> +	.attach_dev = amd_iommu_attach_device,
> +	.free = amd_iommu_domain_free,
> +};

I would expect nesting to have its own attach function too, because it
should be quite different.

>  static struct iommu_domain *
>  amd_iommu_domain_alloc_user(struct device *dev, u32 flags,
>  			    struct iommu_domain *parent,
>  			    const struct iommu_user_data *user_data)
> -
>  {
> +	struct iommu_domain *dom;
> +	struct iommu_dev_data *dev_data;
>  	unsigned int type = IOMMU_DOMAIN_UNMANAGED;
> +	bool nested_parent = flags & IOMMU_HWPT_ALLOC_NEST_PARENT;
> +
> +	if (parent) {
> +		int ret;
> +		struct iommu_hwpt_amd_v2 hwpt;
> +
> +		if (parent->ops != amd_iommu_ops.default_domain_ops)
> +			return ERR_PTR(-EINVAL);
> +
> +		ret = udata_to_iommu_hwpt_amd_v2(user_data, &hwpt);
> +		if (ret)
> +			return ERR_PTR(ret);
>  
> -	if ((flags & ~IOMMU_HWPT_ALLOC_DIRTY_TRACKING) || parent || user_data)
> +		return amd_iommu_nested_domain_alloc(dev, type, flags,
> +						     &hwpt, parent);
> +	}
> +
> +	/* Check supported flags */
> +	if ((flags & ~amd_iommu_hwpt_supported_flags) ||
> +	    !check_nested_support(flags))
>  		return ERR_PTR(-EOPNOTSUPP);
>  
> -	return do_iommu_domain_alloc(type, dev, flags);
> +	dev_data = dev_iommu_priv_get(dev);
> +
> +	/*
> +	 * When allocated nested parent domain, the device may already
> +	 * have been attached to a domain. For example, a device is already
> +	 * attached to the domain allocated by VFIO, which contains GPA->SPA mapping.
> +	 * In such case, return reference to the same domain.
> +	 */

What? No! This would break all the lifecycle model. Domain allocation must
always allocate. qemu needs to allocate the nested partent domain type
from the very start.

> +static int nested_gcr3_update(struct iommu_hwpt_amd_v2 *hwpt,
> +			      struct protection_domain *pdom,
> +			      struct protection_domain *ppdom,
> +			      struct device *dev)
> +{
> +	struct pci_dev *pdev;
> +	struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> +
> +	pdev = to_pci_dev(dev);
> +	if (!pdev)
> +		return -EINVAL;
> +
> +	/* Note: Currently only support GCR3TRPMode with nested translation */
> +	if (!check_feature2(FEATURE_GCR3TRPMODE))
> +		return -EOPNOTSUPP;
> +
> +	pdom->parent = ppdom;
> +	pdom->guest_domain_id = hwpt->gdom_id;
> +	pdom->guest_paging_mode = hwpt->flags.guest_paging_mode;
> +
> +	dev_data->gcr3_info.trp_gpa = hwpt->gcr3;
> +	dev_data->gcr3_info.glx = hwpt->flags.glx;
> +	dev_data->gcr3_info.giov = hwpt->flags.giov;

Here again, this is just data copied from the vDTE to the pDTE - the
guest's gcr3 related parameters are just bits being flowed through

And as before "alloc" shouldn't touch anything outside the
iommu_domain being allocated, touching dev_data here is completely
wrong.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ