Message-ID: <4B2AAC87.5000703@knaff.lu>
Date:	Thu, 17 Dec 2009 23:11:19 +0100
From:	Alain Knaff <alain@...ff.lu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	markh@...pro.net, fdutils@...tils.linux.lu,
	linux-kernel@...r.kernel.org
Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils]
 Cannot format floppies under kernel 2.6.*?)

Linus Torvalds wrote:
> 
> On Thu, 17 Dec 2009, Alain Knaff wrote:
>> For the moment, I have a very small sample of hardware:
>> 1. One machine which works (my own): Athlon XP 1800+ processor
>> 2. One which doesn't work (Mark's)
> 
> Ok. I don't think I even have any machines with floppy drives any more 
> (one external USB drive somewhere gathering dust just in case I ever 
> encounter a floppy again).

Well, on my new box, I have no floppy drive either. The one I mentioned
is an old machine that I kept around just in case I needed to debug
floppy-related problems.

>> I might get access to a wider sample of boxen in a week or so, in order
>> to do some stats.
> 
> Ok, I was more thinking "we have a bugzilla with ten different people 
> reporting this". If it's just a single machine, that's not going to be 
> relevant.

We do have a bugzilla
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=548434 , but
unfortunately only 2 people have reported seeing the bug there so far,
one of whom (ael) turned out to be a false alarm (dusty drive).

> 
>> What's the easiest way to find out the chipset?
>>
>> Here's already the output of lspci from my machine (works):
>>
>> 00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
> 
> Yeah, lspci (and generally only the northbridge and southbridge matters, 
> the "ISA bridge" might technically be relevant, but since it's universally 
> on the same die as the southbridge, I left it in there just for kicks).

Good. Here's some info about some of Mark's machines that do have the
problem (there's more than one, fortunately):

1st one showing the problem (claimed to be AMD 790x chipset):

00:00.0 Host bridge: ATI Technologies Inc RD790 Northbridge only dual slot PCI-e_GFX and HT3 K8 part
00:02.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (external gfx0 port A)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller

2nd one showing the problem (also claimed to be AMD 790x chipset):

00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller

He also has several machines that do work:

1st one that does work:
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)

... and a couple more that he hasn't gotten around to testing yet.

[...]
> Only the "it doesn't work on xyz" is likely interesting. The machines it 
> works on are probably uninteresting statistically.

I understand... (working machine above just mentioned for completeness'
sake).

[...]
> You'd need a git tree that contains both the working and non-working 
> versions, and then literally just do
> 
> 	git bisect start
> 	git bisect good <known good version number here>
> 	git bisect bad <known bad version here>
> 
> and it will give you a commit to try. Compile, test, see if it's good or 
> bad, and do
> 
> 	git bisect [good|bad]
> 
> depending on the result. Rinse and repeat (depending on how tight the 
> initial good/bad commits were, it will need 10-15 kernel tests).
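
The "10-15 kernel tests" figure follows from bisection halving the
candidate range on every round, so the count grows with the base-2
logarithm of the number of commits between the good and bad points.
A rough sketch of that arithmetic follows; bisect_steps is a
hypothetical helper written for this illustration, not a git command,
and a real history with merges will only approximately match it:

```shell
#!/bin/sh
# Hypothetical helper: print the smallest k such that 2^k >= n, i.e.
# roughly how many build-and-test rounds a bisection over n candidate
# commits needs.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 1000     # prints 10
bisect_steps 20000    # prints 15
```

In a real tree, the candidate count can be obtained with
"git rev-list --count <good>..<bad>", and git bisect itself prints a
"roughly N steps" estimate after each round.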

... and how do I check out the most recent good / oldest bad kernel for
compilation?


> So in this case, since apparently 2.6.27.41 is good, and 2.6.28 is not, it 
> would be something like this:
> 
> 	# clone hpa's tree that has all the stable releases in one place
> 	git clone git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-allstable.git
> 
> 	cd linux-2.6-allstable
> 	git bisect start
> 	git bisect bad v2.6.28
> 	git bisect good v2.6.27.41
> 
> and off you go.

ok...
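
On the checkout question: no separate checkout step should be needed,
since git bisect checks out each candidate commit itself once both a
good and a bad endpoint are known. The following is a toy sketch in a
throwaway repository, not the kernel tree. The tag names c1..c10, the
file names, and the grep-based "test" are all made up for the
illustration; for the floppy bug the test would be a manual
build-boot-format cycle followed by "git bisect good" or
"git bisect bad".

```shell
#!/bin/sh
# Toy repository: ten commits, where commit 7 introduces the "bug"
# (the word BROKEN appearing in state.txt). All names are made up.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo ok > state.txt
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -eq 7 ]; then
        echo BROKEN > state.txt
    fi
    echo "$i" > counter.txt
    git add -A
    git commit -q -m "commit $i"
    git tag "c$i"
    i=$((i + 1))
done

# Same sequence as in the quoted instructions, with the toy tags.
git bisect start
git bisect bad c10
git bisect good c1

# "git bisect run" automates the loop: exit status 0 marks the checked-out
# commit good, a status from 1 to 127 (other than 125) marks it bad.
git bisect run sh -c '! grep -q BROKEN state.txt'

# refs/bisect/bad now points at the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad_subject"

# Leave bisect mode and return to the original checkout.
git bisect reset
```

"git bisect run" only helps when the check can be scripted; with a
manual test the same loop is driven by hand, one good/bad verdict per
round.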

> NOTE! Bisection depends very much on the bug being 100% reproducible. If 
> you ever mark a good kernel bad (because you messed up) or a bad kernel 
> good (because the bug wasn't 100% reproducible, so you _thought_ it was 
> good even though the bug was present and just happened to hide), the end 
> result of the bisect will be totally unreliable and seriously screwed up.
> 
> So after a successful bisect, it is usually a good idea to try to go back 
> to the original known-bad kernel, and then revert the commit that was 
> indicated as the bad one (assuming the revert works - it could be that the 
> bad one ends up being fundamental to other commits after it), and test 
> that yes, that really fixes the bug.

What command lines would I use for that revert?
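
A minimal sketch of the revert check described above, again in a
throwaway repository. The tag known-bad, the branch revert-test, and
the file names are made up; in the kernel tree the starting point
would presumably be v2.6.28 and the commit to revert would be
whatever the bisect reports.

```shell
#!/bin/sh
# Toy repository: a good baseline, a suspect commit, and unrelated later
# work on top of it. All names are made up for this illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo ok > state.txt
git add -A
git commit -q -m "good baseline"

echo BROKEN > state.txt
git commit -q -a -m "suspect change"
suspect=$(git rev-parse HEAD)

echo more > other.txt
git add -A
git commit -q -m "unrelated later work"
git tag known-bad

# Go back to the known-bad version on a scratch branch, then undo only
# the suspect commit. --no-edit accepts the default revert message.
git checkout -q -b revert-test known-bad
git revert --no-edit "$suspect"

# The suspect change is gone, the later work is still present.
result=$(cat state.txt)
echo "state.txt after revert: $result"
```

If later commits depend on the reverted one, git revert stops with
conflicts; "git revert --abort" returns the tree to its previous state.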

> It gets more complicated if the bisect hits kernels that you can't test 
> because they have _unrelated_ issues on that machine (compile failures or 
> just other bugs that hide the actual floppy behavior), but generally 
> bisection is pretty simple. "man git-bisect" does have some extra 
> pointers.
> 
> So git bisect may be somewhat time-consuming and mindless, but for 
> reliably triggering bugs where nobody really knows what caused the bug it 
> is a _really_ convenient thing to do. The only thing you need is a 
> reliably triggering test-case, and some time.
> 
> 			Linus

Alain
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
