Message-ID: <BN6PR12MB1652773FC29C98A899FD5689F7350@BN6PR12MB1652.namprd12.prod.outlook.com>
Date: Wed, 29 Mar 2017 16:21:08 +0000
From: "Deucher, Alexander" <Alexander.Deucher@....com>
To: 'Joerg Roedel' <jroedel@...e.de>
CC: 'Joerg Roedel' <joro@...tes.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Daniel Drake <drake@...lessm.com>,
"Nath, Arindam" <Arindam.Nath@....com>
Subject: RE: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
> -----Original Message-----
> From: 'Joerg Roedel' [mailto:jroedel@...e.de]
> Sent: Tuesday, March 28, 2017 6:26 PM
> To: Deucher, Alexander
> Cc: 'Joerg Roedel'; Bjorn Helgaas; linux-pci@...r.kernel.org; linux-
> kernel@...r.kernel.org; Daniel Drake; Nath, Arindam
> Subject: Re: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
>
> On Tue, Mar 28, 2017 at 09:13:23PM +0000, Deucher, Alexander wrote:
> > If I understand Arindam's patch correctly, it only flushes TLB entries
> > for domains in the flush queue whereas the previous behavior was to
> > flush all domains. If there was no TLB flush in the queue for that
> > domain, could flushing it cause a problem?
>
> No, that can't cause a problem. An io/tlb flush for the device is just a
> message that the device should invalidate its own tlb. The device can't
> know and doesn't need to know whether the page-tables it used to fill
> the tlb really changed.
>
> As it looks, the problem we are seeing here is that we are sending a
> large number of these requests to the GPU device and waiting for
> completion every time. This shouldn't be a problem for ATS devices, but
> the GPU here seems to fail at some point and stops answering
> invalidation requests, causing the completion-wait loop timeouts.
>
> Arindam's patch makes the high flush frequency less likely, but it can
> still happen, depending on how the GPU is used. So it's best to
> keep ATS disabled on the device, as it doesn't work correctly and we risk
> running into the same problem again if we leave it enabled and just make
> the trigger less likely.
Thanks for clarifying. The patch is:
Acked-by: Alex Deucher <alexander.deucher@....com>
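
For readers following the thread, the failure mode described in the quoted text (send an invalidation, wait for its completion, time out once the device stops answering) can be sketched as a small user-space model. This is not the AMD IOMMU driver code and not part of the patch; every name here (fake_device, send_invalidate, completion_wait, flush_many, MAX_POLLS) is hypothetical, and the idea that the device answers a fixed number of requests before going silent is an assumption made only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model: a device that acknowledges only `budget`
 * invalidation requests and then stops answering. */
struct fake_device {
    unsigned int budget;   /* requests the device will still answer */
    bool pending;          /* an invalidation is outstanding */
};

#define MAX_POLLS 1000u    /* assumed polling limit, not a real kernel value */

/* Ask the device to invalidate its own TLB. */
static void send_invalidate(struct fake_device *dev)
{
    dev->pending = true;
}

/* One poll step: a responsive device completes the outstanding request. */
static void device_poll(struct fake_device *dev)
{
    if (dev->pending && dev->budget > 0) {
        dev->budget--;
        dev->pending = false;
    }
}

/* Wait for the outstanding request; 0 on completion, -1 on timeout. */
static int completion_wait(struct fake_device *dev)
{
    for (unsigned int i = 0; i < MAX_POLLS; i++) {
        device_poll(dev);
        if (!dev->pending)
            return 0;
    }
    return -1;
}

/* Issue `count` flushes, waiting after each one.
 * Returns how many completed before the first timeout. */
static unsigned int flush_many(struct fake_device *dev, unsigned int count)
{
    unsigned int done = 0;

    for (unsigned int i = 0; i < count; i++) {
        send_invalidate(dev);
        if (completion_wait(dev) != 0)
            break;
        done++;
    }
    return done;
}
```

In this model, issuing fewer flushes only postpones the first timeout; the timeout goes away only if no invalidations are sent to the device at all, which is the effect of keeping ATS disabled for it.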