[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0707161327470.4941@kai.makisara.local>
Date: Mon, 16 Jul 2007 13:48:28 +0300 (EEST)
From: Kai Makisara <Kai.Makisara@...umbus.fi>
To: Bruce Allen <ballen@...vity.phys.uwm.edu>
cc: Mark Lord <liml@....ca>, David Greaves <david@...eaves.com>,
Douglas Gilbert <dougg@...que.net>,
Tejun Heo <htejun@...il.com>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Jeff Garzik <jeff@...zik.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Jan Dvorak <fermentol@...il.com>,
Klaus Fuerstberger <kfuerstberger@...or.de>,
Bruce Allen <bruce.allen@....mpg.de>,
Robert Hancock <hancockr@...w.ca>
Subject: Re: SMART problems in 2.6.22
On Tue, 10 Jul 2007, Kai Makisara wrote:
> On Sun, 8 Jul 2007, Bruce Allen wrote:
>
> > Mark, David, Doug, Tejin, Alan, Jeff, LKML,
> >
> > I'm afraid that there may be some problem with SMART + libata in the 2.6.22
> > kernel. An hour ago I discovered that I missed a month of correspondence
> > (some LKML, some private) about this problem which Alan, Tejun, Jeff, Mark and
> > others copied to me -- it was automatically shoved into one of my mailboxes by
> > my mail client. Sorry about that. So I am trying to catch up to see if there
> > is some real problem or not.
> >
...
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0706.1/0849.html
>
> I have done some more debugging on this one. An easy way to reproduce the
> problem is to use 'smartctl -H /dev/sdb'. If I enable debugging with '-r
> ioctl,2', I find the following difference between outputs using 2.6.21.1
> (works OK) and 2.6.22 (fails):
>
...
> The log shows that the sense data returned by the commands differ: with
> 2.6.22 the bytes 4f and 2c (tf.lbam and tf.lbah) are not returned. Both of
> the status commands fail to return these bytes but the tests in smartctl
> are more strict for the second case. This is why the second status command
> seems to be failing.
>
> Next I added printks to the function ata_qc_complete() in libata-core.c.
> The changed code from 2.6.22 at line 5222 looked like this:
>
...
> The output from 2.6.21.6 looks like this:
>
> Jul 9 18:37:44 kai kernel: [ 193.443874] ata_qc_complete before: 00 00 00 40
> Jul 9 18:37:44 kai kernel: [ 193.443880] ata_qc_complete 16: 00 4f c2 50
> Jul 9 18:37:44 kai kernel: [ 193.462802] ata_qc_complete before: 00 4f c2 40
> Jul 9 18:37:44 kai kernel: [ 193.462807] ata_qc_complete 16: 00 4f c2 50
>
> i.e., the bytes are returned.
>
> The output from 2.6.22 is different:
>
> Jul 9 18:44:35 kai kernel: [ 147.765965] ata_qc_complete before: 00 00 00 40
> Jul 9 18:44:35 kai kernel: [ 147.765970] ata_qc_complete 16: 00 00 00 50
> Jul 9 18:44:35 kai kernel: [ 147.784890] ata_qc_complete before: 00 00 00 40
> Jul 9 18:44:35 kai kernel: [ 147.784894] ata_qc_complete 16: 00 00 00 50
>
> The lbam and lbah bytes are not returned but the command byte is.
>
The other system with the Maxtor disk fails in a slightly different way
(it correctly returns the c2 byte but not in the correct location):
[ 162.896173] ata_qc_complete before: 00 00 00 40
[ 162.896179] ata_qc_complete 16: 00 c2 00 50
My earlier 'git bisect' suggested that this problem surfaced after the
patch
1e999736cafdffc374f22eed37b291129ef82e4e is first bad commit
commit 1e999736cafdffc374f22eed37b291129ef82e4e
Author: Alan Cox <alan@...rguk.ukuu.org.uk>
Date: Wed Apr 11 00:23:13 2007 +0100
libata: HPA support
I have now done some further tests to see what is happening.
It turned out that after commenting the call (at line 1956 in
drivers/ata/libata-core.c in 2.6.22)
if (ata_id_hpa_enabled(dev->id))
dev->n_sectors = ata_hpa_resize(dev);
'smartctl -H' worked again without problems. This applied to both of the
systems where I see the problem. The disks in both systems support hpa but
nothing is hidden. Next I commented only the call to
ata_read_native_max_address_ext() in ata_hpa_resize(). This was enough
to remove the problem (as was expected).
So, the question is: why does calling ata_read_native_max_address_ext()
when booting the system cause the SMART RETURN STATUS fail much later?
--
Kai
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists