Message-id: <45EBB7C5.2010407@shaw.ca>
Date: Mon, 05 Mar 2007 00:25:09 -0600
From: Robert Hancock <hancockr@...w.ca>
To: linux-kernel@...r.kernel.org
Cc: Chip Coldwell <coldwell@...hat.com>, Andi Kleen <ak@...e.de>
Subject: Re: data corruption with nvidia chipsets and IDE/SATA drives (k8 cpu errata needed?)

Chip Coldwell wrote:
> On Wed, 17 Jan 2007, Andi Kleen wrote:
>
>> On Wednesday 17 January 2007 07:31, Chris Wedgwood wrote:
>>> On Tue, Jan 16, 2007 at 08:52:32PM +0100, Christoph Anton Mitterer wrote:
>>>> I agree,... it seems drastic, but this is the only really secure
>>>> solution.
>>> I'd like to hear from Andi how he feels about this? It seems like a
>>> somewhat drastic solution in some ways given a lot of hardware doesn't
>>> seem to be affected (or maybe in those cases it's just really hard to
>>> hit, I don't know).
>> AMD is looking at the issue. Only Nvidia chipsets seem to be affected,
>> although there were similar problems on VIA in the past too.
>> Unless a good workaround comes around soon I'll probably default
>> to iommu=soft on Nvidia.
>
> We (Sun, AMD, Nvidia and Red Hat) have been testing a patch that seems
> to solve the problem. AMD and Nvidia analyzed an HDT trace that
> seemed to indicate that CPU updates of the GATT were still in cache
> when a subsequent table walk caused by a device load used a stale GATT
> PTE. That analysis inspired this patch, submitted to this list as an
> RFC. It is not obvious (to me, at least) why this problem has only
> shown up on Nvidia SATA controllers.
>
> We are continuing to investigate.
>
> diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c
> index 030eb37..1dd461a 100644
> --- a/arch/x86_64/kernel/pci-gart.c
> +++ b/arch/x86_64/kernel/pci-gart.c
> @@ -69,6 +69,8 @@ static u32 gart_unmapped_entry;
> #define AGPEXTERN
> #endif
>
> +#define GATT_CLFLUSH(i) asm volatile ("clflush (%0)" :: "r" (iommu_gatt_base + (i)))
> +
> /* backdoor interface to AGP driver */
> AGPEXTERN int agp_memory_reserved;
> AGPEXTERN __u32 *agp_gatt_table;
> @@ -221,6 +223,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
> for (i = 0; i < npages; i++) {
> iommu_gatt_base[iommu_page + i] = GPTE_ENCODE(phys_mem);
> SET_LEAK(iommu_page + i);
> + GATT_CLFLUSH(iommu_page + i);
> phys_mem += PAGE_SIZE;
> }
> return iommu_bus_base + iommu_page*PAGE_SIZE + (phys_mem & ~PAGE_MASK);
> @@ -348,6 +351,7 @@ static int __dma_map_cont(struct scatterlist *sg, int start, int stopat,
> while (pages--) {
> iommu_gatt_base[iommu_page] = GPTE_ENCODE(addr);
> SET_LEAK(iommu_page);
> + GATT_CLFLUSH(iommu_page);
> addr += PAGE_SIZE;
> iommu_page++;
> }
>
>
Andi, have you had a look at this? I'm a bit surprised at the lack of
reaction to this find.
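
For anyone following along, the pattern in the quoted patch (write a GATT
entry, then flush the cache line holding it so a later table walk does not
see a stale PTE) can be sketched in user space roughly as below. This is
only an illustration, not the kernel code: the table `demo_gatt`, the
encoding in `demo_pte_encode()`, and every `DEMO_*` name are hypothetical
stand-ins, it assumes an x86 CPU with SSE2 so that `_mm_clflush()` is
available, and the trailing `_mm_mfence()` is an extra ordering step that
the quoted patch does not contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1u << DEMO_PAGE_SHIFT)
#define DEMO_PTE_VALID  0x1u   /* hypothetical "valid" bit */
#define DEMO_ENTRIES    1024

/* Hypothetical stand-in for the GATT: a plain array in normal memory. */
static uint32_t demo_gatt[DEMO_ENTRIES];

/* Hypothetical encoding: page-aligned address plus a valid bit. */
static uint32_t demo_pte_encode(uint32_t phys)
{
    return (phys & ~(DEMO_PAGE_SIZE - 1u)) | DEMO_PTE_VALID;
}

/*
 * Fill npages consecutive entries starting at index first, flushing the
 * cache line of each entry right after it is written, as the quoted
 * patch does with GATT_CLFLUSH().
 */
static void demo_map_range(size_t first, size_t npages, uint32_t phys)
{
    assert(first + npages <= DEMO_ENTRIES);

    for (size_t i = 0; i < npages; i++) {
        demo_gatt[first + i] = demo_pte_encode(phys);
        _mm_clflush(&demo_gatt[first + i]);  /* push the update out of cache */
        phys += DEMO_PAGE_SIZE;
    }

    /* Assumption, not in the patch: fence so the flushes are ordered
     * before anything that follows. */
    _mm_mfence();
}
```

Calling `demo_map_range(4, 3, 0x12345000u)` fills entries 4 through 6 and
flushes each one as it goes. Flushing per entry mirrors the patch; several
32-bit entries share a cache line, so flushing once per line would be a
possible refinement, but that is outside what the patch does.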
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@...pamshaw.ca
Home Page: http://www.roberthancock.com/