linux-kernel - Re: data corruption with nvidia chipsets and IDE/SATA drives (k8 cpu errata needed?)


Message-id: <45EBB7C5.2010407@shaw.ca>
Date:	Mon, 05 Mar 2007 00:25:09 -0600
From:	Robert Hancock <hancockr@...w.ca>
To:	linux-kernel@...r.kernel.org
Cc:	Chip Coldwell <coldwell@...hat.com>, Andi Kleen <ak@...e.de>
Subject: Re: data corruption with nvidia chipsets and IDE/SATA drives (k8 cpu
 errata needed?)

Chip Coldwell wrote:
> On Wed, 17 Jan 2007, Andi Kleen wrote:
> 
>> On Wednesday 17 January 2007 07:31, Chris Wedgwood wrote:
>>> On Tue, Jan 16, 2007 at 08:52:32PM +0100, Christoph Anton Mitterer wrote:
>>>> I agree,... it seems drastic, but this is the only really secure
>>>> solution.
>>> I'd like to hear from Andi how he feels about this?  It seems like a
>>> somewhat drastic solution in some ways given a lot of hardware doesn't
>>> seem to be affected (or maybe in those cases it's just really hard to
>>> hit, I don't know).
>> AMD is looking at the issue. Only Nvidia chipsets seem to be affected,
>> although there were similar problems on VIA in the past too.
>> Unless a good workaround comes around soon I'll probably default
>> to iommu=soft on Nvidia.
> 
> We (Sun, AMD, Nvidia and Red Hat) have been testing a patch that seems
> to solve the problem.  AMD and Nvidia analyzed an HDT trace that
> seemed to indicate that CPU updates of the GATT were still in cache
> when a subsequent table walk caused by a device load used a stale GATT
> PTE.  That analysis inspired this patch, submitted to this list as an
> RFC.  It is not obvious (to me, at least) why this problem has only
> shown up on Nvidia SATA controllers.
> 
> We are continuing to investigate.
> 
> diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c
> index 030eb37..1dd461a 100644
> --- a/arch/x86_64/kernel/pci-gart.c
> +++ b/arch/x86_64/kernel/pci-gart.c
> @@ -69,6 +69,8 @@ static u32 gart_unmapped_entry;
>  #define AGPEXTERN
>  #endif
>  
> +#define GATT_CLFLUSH(i) asm volatile ("clflush (%0)" :: "r" (iommu_gatt_base + (i)))
> +
>  /* backdoor interface to AGP driver */
>  AGPEXTERN int agp_memory_reserved;
>  AGPEXTERN __u32 *agp_gatt_table;
> @@ -221,6 +223,7 @@ static dma_addr_t dma_map_area(struct device *dev, dma_addr_t phys_mem,
>  	for (i = 0; i < npages; i++) {
>  		iommu_gatt_base[iommu_page + i] = GPTE_ENCODE(phys_mem);
>  		SET_LEAK(iommu_page + i);
> +		GATT_CLFLUSH(iommu_page + i);
>  		phys_mem += PAGE_SIZE;
>  	}
>  	return iommu_bus_base + iommu_page*PAGE_SIZE + (phys_mem & ~PAGE_MASK);
> @@ -348,6 +351,7 @@ static int __dma_map_cont(struct scatterlist *sg, int start, int stopat,
>  		while (pages--) { 
>  			iommu_gatt_base[iommu_page] = GPTE_ENCODE(addr); 
>  			SET_LEAK(iommu_page);
> +			GATT_CLFLUSH(iommu_page);
>  			addr += PAGE_SIZE;
>  			iommu_page++;
>  		}
> 
> 

Andi, have you had a look at this? I'm a bit surprised at the lack of
reaction to this find.
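
For anyone following along, here is a minimal user-space sketch of the
idea behind the quoted patch: write a translation-table entry, then flush
the cache line holding it so that a table walk reading from memory cannot
see a stale value. This is not the kernel code. The table `demo_gatt`, the
simplified encoding in `demo_pte_encode`, and every other `demo_`/`DEMO_`
name are hypothetical stand-ins for illustration. The trailing fence is an
addition of this sketch and is not part of the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush, _mm_mfence */
#define DEMO_HAVE_CLFLUSH 1
#else
#define DEMO_HAVE_CLFLUSH 0     /* non-x86 build: flush becomes a no-op */
#endif

/* Hypothetical constants; the real GART layout is not reproduced here. */
#define DEMO_PAGE_SIZE    4096u
#define DEMO_GATT_ENTRIES 64
#define DEMO_PTE_VALID    0x1u

/* Stand-in for the in-memory table that the hardware would walk. */
static uint32_t demo_gatt[DEMO_GATT_ENTRIES];

/* Simplified, assumed encoding: page frame bits plus a valid bit. */
static uint32_t demo_pte_encode(uint64_t phys)
{
    return (uint32_t)(phys & 0xfffff000u) | DEMO_PTE_VALID;
}

/* Write back and invalidate the cache line containing p, if supported. */
static void demo_flush_line(const void *p)
{
#if DEMO_HAVE_CLFLUSH
    _mm_clflush(p);
#else
    (void)p;
#endif
}

/* Order the preceding flushes before whatever follows, if supported. */
static void demo_fence(void)
{
#if DEMO_HAVE_CLFLUSH
    _mm_mfence();
#endif
}

/*
 * Map npages consecutive pages starting at table index first.
 * Each entry is flushed right after it is written, mirroring where the
 * quoted patch places GATT_CLFLUSH inside the mapping loops.
 */
static void demo_map_pages(size_t first, size_t npages, uint64_t phys)
{
    assert(first + npages <= DEMO_GATT_ENTRIES);
    for (size_t i = 0; i < npages; i++) {
        demo_gatt[first + i] = demo_pte_encode(phys);
        demo_flush_line(&demo_gatt[first + i]);
        phys += DEMO_PAGE_SIZE;
    }
    demo_fence();
}
```

Calling `demo_map_pages(2, 3, 0x10000)` fills entries 2 through 4 and
leaves the neighbouring entries untouched. Flushing per entry is simple
but repeats work when several entries share a cache line; flushing once
per line would be a possible refinement.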

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@...pamshaw.ca
Home Page: http://www.roberthancock.com/
