Message-ID: <4B2A98BB.5080406@knaff.lu>
Date:	Thu, 17 Dec 2009 21:46:51 +0100
From:	Alain Knaff <alain@...ff.lu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	markh@...pro.net, fdutils@...tils.linux.lu,
	linux-kernel@...r.kernel.org
Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils]
 Cannot format floppies under kernel 2.6.*?)

Linus Torvalds wrote:
> 
> On Thu, 17 Dec 2009, Alain Knaff wrote:
>> 1. initial contents:  33 44 55 66
>> 2. one DMA transfer is performed
>> 3. program changes buffer to: 77 88 99 aa
>> 4. new DMA transfer is performed => instead it transmits 33 88 99 aa
>>    (i.e. first byte is from previous contents)
>>
>> This used to work in 2.6.27.41, but broke in 2.6.28 . It doesn't happen on
>> all hardware though.
> 
> Do you have a list of hardware it works on? Especially chipsets.

For the moment, I have a very small sample of hardware:
1. One machine which works (my own): Athlon XP 1800+ processor
2. One which doesn't work (Mark's)

I might get access to a wider sample of boxen in a week or so, in order
to do some stats.

What's the easiest way to find out the chipset?

In the meantime, here is the lspci output from my machine (the one that works):

00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 50)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 74)
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)


[...]
> I'm not entirely surprised. Actual CPU bugs are pretty rare in the x86 
> world. But chipset bugs? Another thing entirely. There are buffers and 
> caches there, and those are sometimes software-visible. The most obvious 
> case of that is just the IOMMU's themselves, but from your description I 
> don't think you actually change the DMA _mappings_ do you? Just the 
> actual buffer (that was then mapped earlier)?

No, I don't change any DMA mappings. And the buffer is still the same
physical buffer, at the same physical address.

(It happens while formatting a floppy disk: here the first byte happens
to be the track ID of the first physical sector of the track, and it
always ends up being the track ID of the *previously* formatted track).

> So I don't think it's the IOMMU code itself necessarily, although an IOMMU 
> may well be involved (eg I could easily see a few cachelines worth of 
> actual DMA data caching going on in the whole IOMMU too)
> 
> And to some degree the floppy driver might be _more_ likely to see some 
> kinds of bugs, because it uses that crazy legacy DMA engine. So it's not 

Indeed, most other drivers use "bus master" DMA, which doesn't use the
legacy DMA controller at all, but instead uses a DMA controller hosted
on the device itself...

> going to go through the regular PCI DMA hardware paths, it's going to go 
> through its own special paths that nobody else uses any more (and thus has 
> probably not had as much testing).
> 
>> In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
>> arch/x86/kernel/amd_iommu.c and related files, could any of these have
>> triggered this behavior?
> 
> Could it have triggered? Sure. Chipset caches are often flushed by certain 
> trivial operations (often the caches are small, and operations like "any 
> PIO access" will make sure they are flushed). Different IOMMU flush 
> patterns could easily account for it.
> 
> But I think we'd like to see a list of hardware where this can be 
> triggered,

We'll get a list of 2 machines relatively quickly (unless other people
would like to chime in: the test is easy, just fdformat a floppy disk),
and more in a week or so.

> and quite frankly, a 'git bisect' would be absolutely wonderful 

How exactly would I use this (command line sample)?
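
For reference, a minimal sketch of what a bisect session looks like,
run here on a throwaway repository rather than on the kernel tree. The
repository, the file name "state" and the commit numbering are all
hypothetical stand-ins; only the git bisect subcommands themselves are
real.

```shell
#!/bin/sh
# Toy bisect: build a 10-commit repository in which commit 7 introduces
# a "regression", then let git bisect locate it automatically.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

i=1
while [ "$i" -le 10 ]; do
    # Commits 1-6 are "good", commits 7-10 are "bad".
    if [ "$i" -ge 7 ]; then echo "bad $i" > state; else echo "good $i" > state; fi
    git add state
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# Mark the endpoints: HEAD is known bad, the root commit is known good.
git bisect start > /dev/null
git bisect bad HEAD > /dev/null
git bisect good "$(git rev-list --max-parents=0 HEAD)" > /dev/null

# The test command exits 0 for a good revision, non-zero for a bad one.
git bisect run sh -c 'grep -q good state' > /dev/null

# refs/bisect/bad now points at the first bad commit; save it before reset.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$first_bad"   # prints: commit 7
git bisect reset > /dev/null 2>&1
```

On a kernel tree the test cannot be scripted this way, so the manual
form applies: "git bisect start", "git bisect bad v2.6.28", "git bisect
good v2.6.27" (assuming a mainline tree that carries these tags), then
build and boot the kernel git checks out, run fdformat, and report the
result with "git bisect good" or "git bisect bad" until git names the
first bad commit.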

> especially if the list of hardware is not showing any really obvious 
> patterns (and I assume they aren't all _that_ obvious, or you'd have 
> mentioned them).
> 
> 			Linus

Thanks,

Alain