Message-ID: <alpine.LFD.2.00.0912171256560.15740@localhost.localdomain>
Date:	Thu, 17 Dec 2009 13:14:43 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Alain Knaff <alain@...ff.lu>
cc:	markh@...pro.net, fdutils@...tils.linux.lu,
	linux-kernel@...r.kernel.org
Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils]
 Cannot format floppies under kernel 2.6.*?)



On Thu, 17 Dec 2009, Alain Knaff wrote:
> 
> For the moment, I have a very small sample of hardware:
> 1. One machine which works (my own): Athlon XP 1800+ processor
> 2. One which doesn't work (Mark's)

Ok. I don't think I even have any machines with floppy drives any more 
(one external USB drive somewhere gathering dust just in case I ever 
encounter a floppy again).

> I might get access to a wider sample of boxen in a week or so, in order
> to do some stats.

Ok, I was more thinking "we have a bugzilla with ten different people 
reporting this". If it's just a single machine, that's not going to be 
relevant.

> What's the easiest way to find out the chipset?
> 
> Here's already the output of lspci from my machine (works):
> 
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge

Yeah, lspci. Generally only the northbridge and southbridge matter. The 
"ISA bridge" might technically be relevant, but since it's universally on 
the same die as the southbridge, I left it in there just for kicks.

> (It happens while formatting the floppy: here the first byte
> happens to be the trackid of the first physical sector of the track, and
> it always ends up being the trackid of the *previously* formatted track).

I guess it could simply be a floppy controller bug too, triggered by some 
random timing difference or innocuous-looking change.

> > But I think we'd like to see a list of hardware where this can be 
> > triggered,
> 
> We'll get a list of 2 machines relatively quickly (unless other people
> would like to chime in: the test is easy, just fdformat a floppy disk),
> and more in a week or so.

Only the "it doesn't work on xyz" reports are likely to be interesting. The 
machines it works on are probably uninteresting statistically.

> > and quite frankly, a 'git bisect' would be absolutely wonderful 
> 
> How exactly would I use this (command line sample)?

You'd need a git tree that contains both the working and non-working 
versions, and then literally just do

	git bisect start
	git bisect good <known good version here>
	git bisect bad <known bad version here>

and it will give you a commit to try. Compile, test, see if it's good or 
bad, and do

	git bisect [good|bad]

depending on the result. Rinse and repeat (depending on how tight the 
initial good/bad commits were, it will need 10-15 kernel tests).
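
As a rough sketch of where a figure like 10-15 comes from: each test halves the range of candidate commits, so the number of rounds is about the base-2 logarithm of the number of commits between the good and bad versions. The helper name `bisect_steps` and the commit counts below are illustrative assumptions, not measurements from this tree, and git's own estimate may differ by a step or so.

```shell
# Hypothetical helper: estimate how many test rounds a bisect needs
# for N candidate commits. Each round halves the range, so the result
# is ceil(log2(N)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))     # the half that still has to be searched
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Illustrative counts only. In a real tree N could come from something like
#   git rev-list --count v2.6.27.41..v2.6.28
bisect_steps 1000     # prints 10
bisect_steps 30000    # prints 15
```

So even a range of tens of thousands of commits stays within roughly fifteen build-and-test cycles.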

So in this case, since apparently 2.6.27.41 is good, and 2.6.28 is not, it 
would be something like this:

	# clone hpa's tree that has all the stable releases in one place
	git clone git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-allstable.git

	cd linux-2.6-allstable
	git bisect start
	git bisect bad v2.6.28
	git bisect good v2.6.27.41

and off you go.

NOTE! Bisection depends very much on the bug being 100% reproducible. If 
you ever mark a good kernel bad (because you messed up) or a bad kernel 
good (because the bug wasn't 100% reproducible, so you _thought_ it was 
good even though the bug was present and just happened to hide), the end 
result of the bisect will be totally unreliable and seriously screwed up.

So after a successful bisect, it is usually a good idea to try to go back 
to the original known-bad kernel, and then revert the commit that was 
indicated as the bad one (assuming the revert works - it could be that the 
bad one ends up being fundamental to other commits after it), and test 
that yes, that really fixes the bug.
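
The revert check described above could look roughly like the following. This is a minimal sketch, not a prescribed procedure: the function name `try_revert` is made up for illustration, and it assumes the bisect session has already been ended with `git bisect reset`.

```shell
# Hypothetical helper: try_revert BAD_REF CULPRIT
# Check out the original known-bad version and try to revert the commit
# that the bisect pointed at.
try_revert() {
    bad_ref=$1
    culprit=$2
    git checkout --quiet --detach "$bad_ref" || return 1
    if git revert --no-edit "$culprit"; then
        echo "revert applied: rebuild and re-run the test"
    else
        # Later commits probably depend on the culprit, so give up cleanly.
        git revert --abort 2>/dev/null
        echo "revert does not apply cleanly"
        return 1
    fi
}

# Usage (the commit id is a placeholder for whatever the bisect reported):
#   git bisect reset
#   try_revert v2.6.28 <commit id reported by the bisect>
```

If the kernel built from the reverted tree no longer shows the bug, that is good evidence the bisect landed on the right commit.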

It gets more complicated if the bisect hits kernels that you can't test 
because they have _unrelated_ issues on that machine (compile failures or 
just other bugs that hide the actual floppy behavior), but generally 
bisection is pretty simple. "man git-bisect" does have some extra 
pointers.
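
For the untestable case, git has `git bisect skip`, which tells it that the current commit says nothing either way and makes it pick a nearby commit instead. One way to keep the three outcomes straight is sketched below. The helper `verdict` is hypothetical, and it assumes you record a zero/nonzero status for "did it build and boot" and for "did the floppy test pass".

```shell
# Hypothetical helper: verdict BUILD_STATUS TEST_STATUS
# Prints the word to hand to "git bisect": good, bad, or skip.
verdict() {
    build_status=$1
    test_status=$2
    if [ "$build_status" -ne 0 ]; then
        echo skip    # could not build or boot: no information about the bug
    elif [ "$test_status" -eq 0 ]; then
        echo good
    else
        echo bad
    fi
}

# Example: the build failed, so the floppy result is meaningless.
verdict 1 0    # prints "skip"
# In a real session you would then run:  git bisect skip
```

The important part is never to mark a commit bad just because it failed for an unrelated reason, since that sends the bisect down the wrong half of the range.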

So git bisect may be somewhat time-consuming and mindless, but for 
reliably triggering bugs where nobody really knows what caused the bug it 
is a _really_ convenient thing to do. The only thing you need is a 
reliably triggering test-case, and some time.

			Linus