lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHudX3w63VPCjP5mxOePPosa+KkUw7eMRcJF=MJ-ktNO7CwXdg@mail.gmail.com>
Date: Tue, 16 Apr 2024 22:16:55 -0400
From: Eric Wagner <ewagner12@...il.com>
To: Vasant Hegde <vashegde@....com>
Cc: Jason Gunthorpe <jgg@...pe.ca>, Robin Murphy <robin.murphy@....com>, 
	Vasant Hegde <vasant.hegde@....com>, Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>, 
	Suravee Suthikulpanit <suravee.suthikulpanit@....com>, iommu@...ts.linux.dev, 
	linux-kernel@...r.kernel.org
Subject: Re: Kernel 6.7 regression doesn't boot if using AMD eGPU

On Tue, Apr 16, 2024 at 6:53 AM Vasant Hegde <vashegde@....com> wrote:
>
> Hi Eric,
>
>
> On 4/16/2024 7:42 AM, Eric Wagner wrote:
> > On Mon, Apr 15, 2024 at 5:44 PM Robin Murphy <robin.murphy@....com
> > <mailto:robin.murphy@....com>> wrote:
> >
> >     As a first step I'd test the quick hack below, but be prepared for things
> >     to still break slightly differently.
> >
> >     Cheers,
> >     Robin.
> >
> >     ----->8-----
> >     diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >     index 996e79dc582d..063e1eb32fbd 100644
> >     --- a/drivers/iommu/iommu.c
> >     +++ b/drivers/iommu/iommu.c
> >     @@ -1774,7 +1774,7 @@ static int iommu_get_default_domain_type(struct
> >     iommu_group *group,
> >                                      untrusted,
> >                                      "Device is not trusted, but driver is
> >     overriding group %u to %s,
> >     refusing to probe.\n",
> >                                      group->id, iommu_domain_type_str(driver_type));
> >     -                       return -1;
> >     +                       //return -1;
> >                      }
> >                      driver_type = IOMMU_DOMAIN_DMA;
> >              }
> >
> > This worked and got the system booting for me.
>
> Thanks for testing.
>
> IIUC eGPU is behind Thunderbolt  and hence IOMMU treated it as 'untrusted device'.
>
> AMD driver tries to allocate "IDENTITY" domain for GPU devices.
> iommu_get_default_domain_type() return -1 as it expects IOMMU_DOMAIN_DMA for
> untrusted device.
>
> Can you please attach lspci -vvv output? Also /proc/cmdline output.
>
>
> >
> > On Mon, Apr 15, 2024 at 8:39 PM Jason Gunthorpe <jgg@...pe.ca
> > <mailto:jgg@...pe.ca>> wrote:
> >
> >     On Mon, Apr 15, 2024 at 10:44:34PM +0100, Robin Murphy wrote:
> >      > On 2024-04-15 7:57 pm, Eric Wagner wrote:
> >      > > Apologies if I made a mistake in the first bisect, I'm new to kernel
> >      > > debugging.
> >      > >
> >      > > I tested cedc811c76778bdef91d405717acee0de54d8db5 (x86/amd) and
> >      > > 3613047280ec42a4e1350fdc1a6dd161ff4008cc (core) directly and both were
> >     good.
> >      > > Then I ran git bisect again with e8cca466a84a75f8ff2a7a31173c99ee6d1c59d2
> >      > > as the bad and 6e6c6d6bc6c96c2477ddfea24a121eb5ee12b7a3 as the good and the
> >      > > bisect log is attached. It ended up at the same commit as before.
> >      > >
> >      > > I've also attached a picture of the boot screen that occurs when it hangs.
> >      > > 0000:05:00.0 is the PCIe bus address of the RX 580 eGPU that's causing the
> >      > > problem.
> >      >
> >      > Looks like 59ddce4418da483 probably broke things most - prior to that, the
> >      > fact that it's behind a Thunderbolt port would have always taken precedence
> >      > and forced IOMMU_DOMAIN_DMA regardless of what the driver may have wanted to
> >      > saywhereas now we ask the driver first, then complain that it conflicts
> >      > with the untrusted status and ultimately don't configure the IOMMU
> >      > at all.
> >
> >     Yes, if the driver wants to force a domain type it should be
> >     forced. Driver knows best. Eg AMD forces IDENTITY when the HW/driver
> >     is incapable of supporting otherwise.
>
>
> @Jason,
>
> Looks like before commit 59ddce4418da483, core IOMMU layer was enforcing
> 'IOMMU_DOMAIN_DMA' for untrusted device. If its trusted device then it was
> letting HW IOMMU driver to pick best domain type.
>
> Did we change that flow? Are we expecting HW IOMMU driver to handler untrusted
> devices as well?
>
> >
> >      > Meanwhile the GPU driver presumably goes on to believe it's using dma-direct
> >      > with no IOMMU present, resulting in fireworks when its traffic reaches the
> >      > IOMMU. Great :(
> >
> >     I wonder where is the missing error handling.. iommu probe failure
> >     should not go on to allow driver attach, there is no guarentee any DMA
> >     works now that many iommus are booting up in BLOCKED.
>
>
> Looking into code path, in failure path we cleanup device, not group. May be
> that's causing issue? Not sure where it failed. If I manage to find some system
> I will try to debug further.
>
>
>
> >
> >      > However the other notable thing that also happened between 6.6 and 6.7 was
> >      > the removal of the AMD iommu_v2 code, so there's some possibility that the
> >      > GPU driver still may have only been working before due to that also
> >
> >     Most likely it is the above change interacting with this patch when
> >     they are both combined in the merge:
> >
> >     commit 92e2bd56a5f9fc44313fda802a43a63cc2a9c8f6
> >     Author: Vasant Hegde <vasant.hegde@....com <mailto:vasant.hegde@...com>>
> >     Date:   Thu Sep 21 09:21:45 2023 +0000
> >
> >          iommu/amd: Introduce iommu_dev_data.flags to track device capabilities
> >
> >     @@ -2471,7 +2481,7 @@ static int amd_iommu_def_domain_type(struct device *dev)
> >               *    and require remapping.
> >               *  - SNP is enabled, because it prohibits DTE[Mode]=0.
> >               */
> >     -       if (dev_data->iommu_v2 &&
> >     +       if (pdev_pasid_supported(dev_data) &&
> >                  !cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
> >                  !amd_iommu_snp_en) {
> >                      return IOMMU_DOMAIN_IDENTITY;
> >
> >     Which, IIRC, was intended to be temporary to work around limitations
> >     in the DTE programming logic within the driver. Previously iommu_v2 as
> >     a module option that Eric probably doesn't set, I guess.
> >
> >     The below will probably make it boot, but Vasant should check what
> >     happens if PASID is eventually attached too.
> >
> >     diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> >     index d35c1b8c8e65ce..f3da6a5b6cb1cb 100644
> >     --- a/drivers/iommu/amd/iommu.c
> >     +++ b/drivers/iommu/amd/iommu.c
> >     @@ -2758,11 +2758,16 @@ static int amd_iommu_def_domain_type(struct device *dev)
> >               *    and require remapping.
> >               *  - SNP is enabled, because it prohibits DTE[Mode]=0.
> >               */
> >     -       if (pdev_pasid_supported(dev_data) &&
> >     -           !cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
> >     -           !amd_iommu_snp_en) {
> >     +       if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT) && !amd_iommu_snp_en)
> >     +               return IOMMU_DOMAIN_IDENTITY;
> >     +
> >     +       /*
> >     +        * For now driver limitations prevent enabling PASID as a paging domain
> >     +        * on the RID together.
> >     +        */
> >     +       if (dev_is_pci(dev) && !to_pci_dev(dev)->untrusted &&
> >     +           pdev_pasid_supported(dev_data))
> >                      return IOMMU_DOMAIN_IDENTITY;
> >     -       }
> >
> >              return 0;
> >       }
> >
> >     Jason
> >
> >
> > As it booted ok with Robin's patch above, these changes to
> > drivers/iommu/amd/iommu.c didn't seem to make a difference for me. I was testing
> > with amd iommu v2 off in the kernel config and I also have TSME enabled in the
> > BIOS if that matters.
>
> TMSE is transparent to OS. So its fine.
>
> -Vasant
>
Output of lspci -vvv and /proc/cmdline attached.

-Eric

View attachment "cmdline.log" of type "text/x-log" (83 bytes)

View attachment "lspci.log" of type "text/x-log" (84184 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ