Message-ID: <87ilmjqj1f.wl-tiwai@suse.de>
Date:   Tue, 23 Aug 2022 13:46:36 +0200
From:   Takashi Iwai <tiwai@...e.de>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Lu Baolu <baolu.lu@...ux.intel.com>,
        Joerg Roedel <jroedel@...e.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Robin Murphy <robin.murphy@....com>,
        Eric Auger <eric.auger@...hat.com>,
        regressions@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [REGRESSION 5.19.x] AMD HD-audio devices missing on 5.19

On Tue, 23 Aug 2022 08:06:05 +0200,
Takashi Iwai wrote:
> 
> On Tue, 23 Aug 2022 03:00:21 +0200,
> Jason Gunthorpe wrote:
> > 
> > On Mon, Aug 22, 2022 at 04:12:59PM +0200, Takashi Iwai wrote:
> > > Hi,
> > > 
> > > we've received regression reports about the missing HD-audio devices
> > > on AMD platforms, and this turned out to be caused by the commit
> > > 512881eacfa72c2136b27b9934b7b27504a9efc2
> > >     bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
> > > 
> > > The details are found in openSUSE bugzilla:
> > >   https://bugzilla.suse.com/show_bug.cgi?id=1202492
> > > 
> > > The problem seems to be that HD-audio (both onboard analog and HDMI)
> > > PCI devices are assigned to the same IOMMU group as AMD graphics PCI
> > > device, and once AMDGPU has been initialized first, those
> > > audio devices can't be probed since iommu_device_use_default_domain()
> > > returns -EBUSY.
> > 
> > Can you describe exactly what drivers are involved in this? If it is
> > the above commit then several devices are sharing an iommu group and
> > one of them (well, the only one already attached, I suppose) has made
> > the group unsharable.
> > 
> > With grep I don't see an obvious place where the AMDGPU driver would
> > mess with the iommu configuration, so I have no guess.
> 
> I have no concrete clue, either :)
> At least, drivers/gpu/drm/amd/amdkfd/kfd_iommu.c calls
> amd_iommu_init_device(), and this invokes iommu_attach_group(), which
> may change group->domain.  But it was just my wild guess, and it might
> be others, indeed.
> 
> > It would be good to have some debugging to confirm if it is
> > group->owner (should be impossible, suggests memory corruption if it
> > is) or group->domain != group->default_domain.
> >
> > Most likely it is the latter, but I can't see how that could happen on
> > a system like this.. There is no obvious manipulation in AMDGPU, for
> > instance.
> > 
> > So debugging to find the backtrace for exactly when 
> >  group->domain != group->default_domain
> > Occurs for the troubled group would be necessary.
> 
> OK, will try to build a test kernel with some debug prints and ask the
> reporters.  It may take some time.

This has now been tested, and the call path is confirmed to go through
AMDGPU, as expected:
  amdgpu_pci_probe ->
  amdgpu_driver_load_kms ->
  amdgpu_device_init ->
  amdgpu_amdkfd_device_init ->
  kgd2kfd_device_init ->
  kgd2kfd_resume_iommu ->
  kfd_iommu_resume ->
  amd_iommu_init_device ->
  iommu_attach_group ->
  __iommu_attach_group

The AMDGPU driver is probed first, and the iommu_attach_group() call
above changes the assigned group->domain.  Afterwards, probing the
HD-audio devices fails because:
- Both HD-audio PCI devices belong to the very same IOMMU group as the
  AMD graphics PCI device
- PCI core calls iommu_device_use_default_domain() and the check
  fails there because group->domain != group->default_domain
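
For illustration only, the failure sequence can be modelled in plain
userspace C.  This is a rough sketch, not the kernel code: the struct
layouts and the model_*() helpers are hypothetical stand-ins, and only
the two conditions discussed in this thread (group->owner being set, or
group->domain != group->default_domain) are modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct iommu_domain {
	int id;
};

struct iommu_group {
	struct iommu_domain *default_domain;	/* domain set up for kernel DMA */
	struct iommu_domain *domain;		/* domain currently attached */
	void *owner;				/* non-NULL if the group is claimed */
};

/* Models the effect of iommu_attach_group(): swap the attached domain. */
static void model_attach_group(struct iommu_group *group,
			       struct iommu_domain *domain)
{
	group->domain = domain;
}

/*
 * Models the check described for iommu_device_use_default_domain():
 * refuse with -EBUSY once the group is owned or no longer on its
 * default domain.  Locking and reference counting are left out.
 */
static int model_use_default_domain(const struct iommu_group *group)
{
	if (group->owner || group->domain != group->default_domain)
		return -EBUSY;
	return 0;
}
```

In this model, a device probed while the group is still on its default
domain gets 0; after model_attach_group() has switched the group to
another domain, every later device in the same group gets -EBUSY, which
matches what the HD-audio devices hit.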

Takashi
