Date:   Mon, 3 Apr 2017 23:39:09 +0200
From:   Joerg Roedel <joro@...tes.org>
To:     Samuel Sieb <samuel@...b.net>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: AMD IOMMU causing filesystem corruption

Hi Samuel,

On Mon, Apr 03, 2017 at 01:38:08PM -0700, Samuel Sieb wrote:
> I filed a bug in bugzilla, but I wasn't sure what category to put it
> in, so I suspect I ended up picking one that doesn't get looked at
> much.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=195051
> 
> The issue is that on a specific Acer laptop with a dual-core A9, if
> I don't disable the IOMMU using iommu=off, it has immediate and
> rapidly fatal filesystem corruption by the time a user logs into the
> desktop. What led me to try that was at one point I noticed an error
> message about the iommu in the logs.  However, I did not have a
> chance to save that due to the corruption obliterating the log
> files.

You have a system based on the AMD Stoney platform, on which the PCI-ATS
feature of the GPU is broken, as we recently found out.

Can you please test whether the attached patch fixes the issue on your
machine?

From 09cbdcbbd23f0823e7651b4f35b13ae633b3fbe2 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@...e.de>
Date: Tue, 28 Mar 2017 13:20:27 +0200
Subject: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS

ATS is broken on these devices. Under invalidation load, the
GPU does not reply to invalidations anymore, causing
Completion-wait loop timeouts on the AMD IOMMU driver side.
Fix it by not enabling ATS on these devices.

Note that the commit mentioned below is not broken; it just
triggers the issue because it can cause invalidation
storms on devices.

Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
Reported-by: Daniel Drake <drake@...lessm.com>
Cc: Alexander Deucher <Alexander.Deucher@....com>
Signed-off-by: Joerg Roedel <jroedel@...e.de>
---
 drivers/pci/ats.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index eeb9fb2..711bdb2 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -17,10 +17,18 @@
 
 #include "pci.h"
 
+static const struct pci_device_id broken_ats_tbl[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x98e4) }, /* AMD Stoney GPU part */
+	{ 0 }
+};
+
 void pci_ats_init(struct pci_dev *dev)
 {
 	int pos;
 
+	if (pci_match_id(broken_ats_tbl, dev))
+		return;
+
 	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
 	if (!pos)
 		return;
-- 
1.9.1
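
For readers unfamiliar with the pattern, the check the patch adds to
pci_ats_init() can be sketched in user space roughly as follows. This is
only an illustration under stated assumptions, not kernel code:
struct example_id, example_match_id() and example_ats_init() are
hypothetical stand-ins for struct pci_device_id, pci_match_id() and
pci_ats_init(). The real pci_match_id() also compares subsystem IDs and
class masks and honours PCI_ANY_ID wildcards, all of which are left out
here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct pci_device_id. */
struct example_id {
	uint16_t vendor;
	uint16_t device;
};

/* Same numeric value as PCI_VENDOR_ID_AMD in include/linux/pci_ids.h. */
#define EXAMPLE_VENDOR_ID_AMD 0x1022

/*
 * Deny-list of devices for which ATS must stay disabled. The all-zero
 * entry terminates the table, like the { 0 } sentinel in the patch.
 */
static const struct example_id example_broken_ats_tbl[] = {
	{ EXAMPLE_VENDOR_ID_AMD, 0x98e4 }, /* vendor/device pair from the patch */
	{ 0, 0 }
};

/* Walk the table until the sentinel; true if vendor and device both match. */
static bool example_match_id(const struct example_id *tbl,
			     uint16_t vendor, uint16_t device)
{
	for (; tbl->vendor || tbl->device; tbl++) {
		if (tbl->vendor == vendor && tbl->device == device)
			return true;
	}
	return false;
}

/*
 * Mirrors the early return added to pci_ats_init(): a listed device
 * bails out before any ATS capability lookup would take place.
 * Returns true if ATS setup would proceed, false if it is skipped.
 */
static bool example_ats_init(uint16_t vendor, uint16_t device)
{
	if (example_match_id(example_broken_ats_tbl, vendor, device))
		return false;

	/* The real function looks up the ATS extended capability here. */
	return true;
}
```

Calling example_ats_init(EXAMPLE_VENDOR_ID_AMD, 0x98e4) returns false,
while any vendor/device pair not in the table returns true. Handling
further broken devices would then only require another table entry
ahead of the sentinel.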
