lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 17 Dec 2009 18:00:04 +0100
From:	Alain Knaff <alain@...ff.lu>
To:	markh@...pro.net
CC:	fdutils@...tils.linux.lu, torvalds@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils]
 Cannot format floppies under kernel 2.6.*?)

On 17/12/09 16:49, Alain Knaff wrote:
> On 17/12/09 16:43, Mark Hounschell wrote:
>> On 12/17/2009 10:35 AM, Alain Knaff wrote:
>>
>>>> Should I do more work in between?
>>>
>>> No, but make sure to look at track 0... Other tracks will still have the
>>> error, as there was nothing forcing a memory flush between track 0 and 1...
>>
>> Ok track 0
> [...]
>> 0: 0
>> 1: 0
>> 2: 0
>> 3: 4f  <--
>> 4: 0 
>> 5: 1 
>> 6: 2 
>> no disk change
> 
> Yeah, that's what I meant... So the memory flusher program didn't manage to
> clear up the inconsistency...
> 
> So either my theory is wrong, or the memory flusher program was not
> efficient enough.... hmmm, maybe doing some surfing in between the formats,
> or doing another kernel compilation might be a better test.
> 
> Alain

Ok, so I had a look at the differences between 2.6.27.41 and 2.6.28, and
there have indeed been changes to the iommu and DMA handling code.

So I suspect that the problem may be lying here

Cc'ed Linus and kernel list on this. For Linux and the list, here's the
summary of what we are observing:

- A DMA transfer of a memory block transfers the wrong value for the first
byte of the block. All other bytes of the block are transferred correctly.
The value of the first byte turns out to be the value that this byte held
during the *previous* transfer. Just as if there was some kind of cache,
and the transfer started before that cache was refreshed with the new
values from main memory.

Example:

1. initial contents:  33 44 55 66
2. one DMA transfer is performed
3. program changes buffer to: 77 88 99 aa
4. new DMA transfer is performed => instead it transmits 33 88 99 aa
   (i.e. first byte is from previous contents)

This used to work in 2.6.27.41, but broke in 2.6.28 . It doesn't happen on
all hardware though.

It does indeed seem to be related to a DMA-side cache (rather than the
processor's cache not being flushed to main memory), as doing lots of
memory intensive work (kernel compilation) between 2 and 3 doesn't fix the
problem.

In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
arch/x86/kernel/amd_iommu.c and related files, could any of these have
triggered this behavior?

Any ideas, anybody?

Alain
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ