lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 9 Jun 2007 11:12:06 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jeff Garzik <jeff@...zik.org>
cc:	Tejun Heo <htejun@...il.com>, Gregor Jasny <gjasny@...glemail.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-ide@...r.kernel.org, Alan Cox <alan@...rguk.ukuu.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] Re: Linux v2.6.22-rc3



On Thu, 7 Jun 2007, Jeff Garzik wrote:
>
> Ack'ing the sata_promise change was easy, but with this one it would
> be nice to wait a bit before changing the core probe code that [now]
> every ATA setup goes through, based on a single bug report.

So what's the resolution? Right now this is apparently the reasong for 
two of our current regressions, and we need to solve it.

"Just wait" is not an option. The options are:
 - fix things
 - revert everything that caused the regressions

and quite frankly, I don't think the second alternative is palatable to 
anybody, including you.

> The values assist in detecting ghost devices (same device appearing
> on both master and slave) and TF register malfunctions, and I would
> appreciate not breaking _that_ so late in 2.6.22-rc for a single
> report.  Thankfully we have -some- ghost device prevention code
> elsewhere, but this is part of it.

Apparently this isn't even true. As far as I can tell, the old code used 
to wait for lbal to be 1, but if it never became one, it would just 
silently time out and not *do* anything about it. Later commands would 
"just work", and the device would go its merry ways.

IOW, the new code is *crap*. The old code was arguably not wonderful 
either, but because of its nature, the old code not working didn't 
actually matter all that much.

So the biggest change (and the one that broke peoples machines) is that 
the code that you claim matters, *never* apparently did matter, and now 
that we've made it matter (by returning an error, and aborting the SRST 
sequence as a failure), people are complaining.

I really think we should apply Tejun's patch. Add a delay there, by all 
means if you are nervous: but no, it shouldn't be 150ms. This is the 
_post_ reset handling, we've already done the 150ms wait for the reset!

In fact, if you look at ata_bus_post_reset(), you'll notice that for 
port0, we just do the "ata_wait_ready()" call. Tejun's patch just made us 
do the same (logical) thing for port1 too!  If you think it breaks due to 
ome timeout issue, then the port0 thing was already broken.

(And maybe we could make the 

	ap->ops->dev_select(ap, 1);
	rc = ata_wait_ready(ap, deadline);

instead be a

	ata_dev_select(ap, 1, 1, 1);

and you'd wait even more? But that's actually a bigger change than Tejun's 
thing, and it uses "ata_wait_idle()" rather than "ata_wait_ready()", and 
doesn't return any status.. I dunno. But right now, we have several known 
broken things, that apparently weren't broken before!)

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ