Message-Id: <200706050300.56079.bzolnier@gmail.com>
Date:	Tue, 5 Jun 2007 03:00:55 +0200
From:	Bartlomiej Zolnierkiewicz <bzolnier@...il.com>
To:	Sergei Shtylyov <sshtylyov@...mvista.com>
Cc:	Geller Sandor <wildy@...ra.hos.u-szeged.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org
Subject: Re: HPT374 IDE problem with 2.6.21.* kernels


Hello,

On Sunday 03 June 2007, Sergei Shtylyov wrote:
> Geller Sandor wrote:
> Hello.
> 
> >>>>> The log of a typical IDE reset is available here:
> 
> >>>>> http://petra.hos.u-szeged.hu/~wildy/syslog.gz
> 
> >>>>> This was the worst case: the IDE bus was reset during the system
> >>>>> boot.
> 
> >>>>   Could you try setting HPT374_ALLOW_ATA133_6 to 0 in
> >>>> drivers/ide/pci/hpt366.c and rebuild/reboot the kernel?
> 
> >>> Hi Sergei,
> 
> >>> This looks promising. Using a vanilla 2.6.22-rc3 I was able to reproduce
> >>> the problem within a few seconds. With the above modification the
> >>> machine has been running under heavy disk I/O without problems for
> >>> 30 minutes...
> 
> >> Did it fix the problem for good?
> 
> > It seems so far. There hasn't been any problem since I applied the fix.
> 
> >> Sergei, do we need to disallow UDMA6 completely on HPT374 or
> >> is it only an issue with some problematic devices (=> blacklist)?
> 
>     Note that I didn't change what the old code was doing in this regard -- 
> although the HPT374 spec does *not* say that UDMA6 is supported, it had been 
> enabled. What *really* changed for HPT374 was:
> 
> - in 2.6.20-rc1, the driver switched to using the actual 33 MHz timing table
>    instead of the old one, matching 50 MHz (and so, severely underclocked);
> 
> - in 2.6.21-rc1, the driver switched from 33 MHz PCI to 66 MHz DPLL clock.
> 
>     Disallowing UDMA6 would clock the chip with 50 MHz DPLL, however, the

I felt inspired by this explanation (thanks!) and took a look at the
hpt374-opensource-v2.10 vendor driver.  Here is something interesting:

glbdata.c:

...
#ifdef CLOCK_66MHZ
ULONG setting370_66[] = {
        0xd029d5e,  0xd029d26,  0xc829ca6,  0xc829c84,  0xc829c62,
        0x2c829d2c, 0x2c829c66, 0x2c829c62,
        0x1c829c62, 0x1c9a9c62, 0x1c929c62, 0x1c8e9c62, 0x1c8a9c62,
        0x1c8a9c62/*0x1cae9c62*/, 0x1c869c62, 0x1c869c62,
};
...

hpt366.c:

...
static u32 sixty_six_base_hpt37x[] = {
        /* XFER_UDMA_6 */       0x1c869c62,
        /* XFER_UDMA_5 */       0x1cae9c62,     /* 0x1c8a9c62 */
...

So we use Dual ATA Clock for UDMA5 whereas the vendor driver doesn't
(the only other mode that uses Dual ATA Clock, in both drivers, is the
rarely used UDMA3).

Thanks to this, the UDMA cycle time should equal 22.5 ns instead of 30 ns
(the spec defines it as 16.8 ns; ide_timings[] uses 20 ns) when using the
66 MHz DPLL clock.  In theory everything should play nice, but the HPT374
data manual contains a weird note that Dual ATA Clock is meant to implement
ATA100 reads and writes at different clocks (there is no further
explanation).
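
To spell out the arithmetic (note that the clock counts below are
back-derived from the cycle times above, not decoded from the timing
registers, so treat them as my assumption):

#include <stdio.h>

int main(void)
{
	double t66 = 15.0;	/* "66 MHz" DPLL is really 66.67 MHz -> 15 ns period */
	double t50 = 20.0;	/* 50 MHz DPLL -> 20 ns period */

	/* UDMA5 with Dual ATA Clock (0x1cae9c62): 1.5 clocks */
	printf("66 MHz + Dual ATA Clock: %.1f ns\n", 1.5 * t66);
	/* UDMA5 without Dual ATA Clock (0x1c8a9c62, vendor value): 2 clocks */
	printf("66 MHz, plain:           %.1f ns\n", 2.0 * t66);
	/* UDMA5 at 50 MHz DPLL (0x12848242): 1 clock */
	printf("50 MHz DPLL:             %.1f ns\n", 1.0 * t50);

	return 0;
}

which prints 22.5 ns, 30.0 ns and 20.0 ns respectively.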

Geller reported that the problems started after migrating from 2.6.20.7 to
2.6.21.1 (the affected disks are using UDMA5), which is exactly when the
driver switched from the 33 MHz PCI clock to the 66 MHz DPLL clock.  Also,
the issue is completely fixed by using the 50 MHz DPLL clock (the UDMA5
timing for the 50 MHz DPLL clock is 0x12848242, so the UDMA cycle time
equals 20 ns and is smaller than the one obtained with the 66 MHz DPLL
clock).

It all makes me wonder whether it is really safe to use Dual ATA Clock for
UDMA5 and whether we should just be using "the official" timing instead...
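
If we go that way the change would be trivial (completely untested sketch,
just to show what I mean; whether the vendor value is right for all HPT37x
chips clocked at 66 MHz is exactly the question):

static u32 sixty_six_base_hpt37x[] = {
        /* XFER_UDMA_6 */       0x1c869c62,
        /* XFER_UDMA_5 */       0x1c8a9c62,     /* was 0x1cae9c62 (Dual ATA Clock) */
...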

Sergei?

> original report claimed that something had changed for the worse between
> 2.6.21.1 and .3, but nothing changed in drivers/ide/ between those releases...

It could be that the md changes in 2.6.21.3 have influenced the situation
(by putting more stress on the disks, etc.)...

Thanks,
Bart
