Message-ID: <20170328202844.GQ8329@suse.de>
Date:   Tue, 28 Mar 2017 22:28:44 +0200
From:   Joerg Roedel <jroedel@...e.de>
To:     "Deucher, Alexander" <Alexander.Deucher@....com>
Cc:     'Joerg Roedel' <joro@...tes.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Daniel Drake <drake@...lessm.com>,
        "Nath, Arindam" <Arindam.Nath@....com>
Subject: Re: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS

On Tue, Mar 28, 2017 at 08:18:26PM +0000, Deucher, Alexander wrote:
> > -----Original Message-----
> > From: Joerg Roedel [mailto:joro@...tes.org]
> > Sent: Tuesday, March 28, 2017 8:17 AM
> > To: Bjorn Helgaas
> > Cc: linux-pci@...r.kernel.org; linux-kernel@...r.kernel.org; Joerg Roedel;
> > Daniel Drake; Deucher, Alexander
> > Subject: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
> > 
> > From: Joerg Roedel <jroedel@...e.de>
> > 
> > ATS is broken on these devices. Under invalidation load, the
> > GPU does not reply to invalidations anymore, causing
> > Completion-wait loop timeouts on the AMD IOMMU driver side.
> > Fix it by not enabling ATS on these devices.
> > 
> > Note that below mentioned commit is not broken, it just
> > triggers the issue because it might cause invalidation
> > storms on devices.
> > 
> > Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
> > Reported-by: Daniel Drake <drake@...lessm.com>
> > Cc: Daniel Drake <drake@...lessm.com>
> > Cc: Alexander Deucher <Alexander.Deucher@....com>
> > Signed-off-by: Joerg Roedel <jroedel@...e.de>
> 
> Did you see Arindam's patch from yesterday[1]?  Not sure which is the proper fix, maybe both?

Arindam's patch makes sense on its own, but not as a fix for this issue.
It lowers the invalidation load on the GPU, but there are still ways to
trigger a high invalidation rate on the device. So it might hide the
issue, but not fix it.

We need to disable ATS on the device if it doesn't work reliably.

	Joerg