lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240416113937.GS223006@ziepe.ca>
Date: Tue, 16 Apr 2024 08:39:37 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Vasant Hegde <vashegde@....com>
Cc: Eric Wagner <ewagner12@...il.com>, Robin Murphy <robin.murphy@....com>,
	Vasant Hegde <vasant.hegde@....com>, Joerg Roedel <joro@...tes.org>,
	Will Deacon <will@...nel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
	iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: Kernel 6.7 regression doesn't boot if using AMD eGPU

On Tue, Apr 16, 2024 at 04:23:38PM +0530, Vasant Hegde wrote:
> >     Yes, if the driver wants to force a domain type it should be
> >     forced. Driver knows best. Eg AMD forces IDENTITY when the HW/driver
> >     is incapable of supporting otherwise.
> 
> 
> @Jason,
> 
> Looks like before commit 59ddce4418da483, core IOMMU layer was enforcing
> 'IOMMU_DOMAIN_DMA' for untrusted device. If its trusted device then it was
> letting HW IOMMU driver to pick best domain type.

If the driver wants to force identity because paging doesn't work then
yes it needs to figure something out..

Really the drivers should not be designed to do this, they need to
accommodate paging domains in all cases if things are going to work
correctly. The def_domain callback should be a last resort.

> >     diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> >     index d35c1b8c8e65ce..f3da6a5b6cb1cb 100644
> >     --- a/drivers/iommu/amd/iommu.c
> >     +++ b/drivers/iommu/amd/iommu.c
> >     @@ -2758,11 +2758,16 @@ static int amd_iommu_def_domain_type(struct device *dev)
> >               *    and require remapping.
> >               *  - SNP is enabled, because it prohibits DTE[Mode]=0.
> >               */
> >     -       if (pdev_pasid_supported(dev_data) &&
> >     -           !cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
> >     -           !amd_iommu_snp_en) {
> >     +       if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT) && !amd_iommu_snp_en)
> >     +               return IOMMU_DOMAIN_IDENTITY;
> >     +
> >     +       /*
> >     +        * For now driver limitations prevent enabling PASID as a paging domain
> >     +        * on the RID together.
> >     +        */
> >     +       if (dev_is_pci(dev) && !to_pci_dev(dev)->untrusted &&
> >     +           pdev_pasid_supported(dev_data))
> >                      return IOMMU_DOMAIN_IDENTITY;
> >     -       }
> > 
> >              return 0;
> >       }
> > 
> > 
> > As it booted ok with Robin's patch above, these changes to
> > drivers/iommu/amd/iommu.c didn't seem to make a difference for me. I was
> > testing with amd iommu v2 off in the kernel config and I also have TSME
> > enabled in the BIOS if that matters.

There must be a mistake in the above then, it would be good to sort it
out because something like that is the right fix.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ