Date:	Tue, 18 Mar 2008 11:59:48 +0100
From:	Volker Armin Hemmann <volker.armin.hemmann@...clausthal.de>
To:	Tejun Heo <htejun@...il.com>
Cc:	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org
Subject: Re: 2.6.24.X: SATA/AHCI related boot delay. - not with 2.6.24.3

On Tuesday, 18 March 2008, Tejun Heo wrote:
> Hello,
>
> Volker Armin Hemmann wrote:
> > I tried some more stuff, replaced the cables, played with bios settings.
> >
> > No change.
> >
> > Then I updated to 2.6.24.3 - and no hangs or 'softreset' failures
> > anymore.
>
> I don't see any libata changes which can cause such difference.  Weird.
>  Is this result reliably reproducible?

It was, for a little more than 24 hours. I booted and rebooted several times to
make sure, and everything was fine. But after a good night, on the Xth boot,
the hang occurred again, and it has been there ever since. Reliably, on every
boot :(
(Along with the 'softreset failed' message on reboots.)
Of course, I booted and rebooted several more times. The hang stays.

Maybe it is the hardware. But I have already replaced the cables, and SMART
says the disk is OK.

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      8124         -
# 2  Short offline       Completed without error       00%      8067         -
# 3  Short offline       Completed without error       00%      3402         -
# 4  Extended offline    Completed without error       00%      3374         -
<snip>
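
For anyone who would rather check a longer self-test log by script than by eye, here is a minimal Python sketch. It assumes the column layout shown in the excerpt above, which looks like smartctl output. The names ROW, parse_selftest_log and all_passed are made up for this illustration and are not part of smartmontools.

```python
import re

# One self-test row, e.g.
# "# 1  Short offline       Completed without error       00%      8124         -"
# Columns are assumed to be separated by at least two spaces, as in the excerpt.
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>.+?)\s{2,}"
    r"(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\S+)$"
)

# The four rows quoted in this mail.
SAMPLE = """\
# 1  Short offline       Completed without error       00%      8124         -
# 2  Short offline       Completed without error       00%      8067         -
# 3  Short offline       Completed without error       00%      3402         -
# 4  Extended offline    Completed without error       00%      3374         -
"""


def parse_selftest_log(text):
    """Return one dict per self-test row; lines that are not rows are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "num": int(m["num"]),
                "test": m["test"],
                "status": m["status"],
                "remaining_pct": int(m["remaining"]),
                "lifetime_hours": int(m["hours"]),
                "lba_of_first_error": m["lba"],
            })
    return rows


def all_passed(rows):
    """True only if there is at least one row and every row finished cleanly."""
    return bool(rows) and all(
        r["status"] == "Completed without error" and r["lba_of_first_error"] == "-"
        for r in rows
    )


if __name__ == "__main__":
    print(all_passed(parse_selftest_log(SAMPLE)))
```

A clean self-test log only says the drive's own tests passed. It does not rule out a controller, cabling or interrupt problem.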



>
> > 2.6.24.2 and 2.6.24.3 both have the reiser4 patch added.
> >
> > So
> > 2.6.24.2 = bad
> > 2.6.24.3 = good

It was only good for a couple of boots.

> > 2.6.25-rc5 = bad
> >
> > Setting AHCI in bios still results in timeouts and harddisks not found.
>
> Does pci=nomsi help?

Oh yes! It does. I changed the 'Sata operation mode' setting from 'non raid'
to AHCI and booted with that option:

[   35.026629] Driver 'sd' needs updating - please use bus_type methods
[   35.026702] ahci 0000:00:0a.0: version 3.0
[   35.026877] ACPI: PCI Interrupt Link [LSA0] enabled at IRQ 23
[   35.026922] ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LSA0] -> GSI 23 (level, low) -> IRQ 23
[   36.029726] ahci 0000:00:0a.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
[   36.029777] ahci 0000:00:0a.0: flags: 64bit sntf led clo pmp pio
[   36.029817] PCI: Setting latency timer of device 0000:00:0a.0 to 64
[   36.030019] scsi0 : ahci
[   36.030114] scsi1 : ahci
[   36.030180] scsi2 : ahci
[   36.030245] scsi3 : ahci
[   36.030333] ata1: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc100 irq 23
[   36.030381] ata2: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc180 irq 23
[   36.030428] ata3: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc200 irq 23
[   36.030476] ata4: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc280 irq 23
[   36.659406] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   36.660019] ata1.00: ATA-7: WDC WD1600JS-00MHB1, 10.02E01, max UDMA/133
[   36.660062] ata1.00: 312581808 sectors, multi 16: LBA48
[   36.660680] ata1.00: configured for UDMA/133
[   37.291368] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   37.311928] ata2.00: ATA-8: SAMSUNG HD501LJ, CR100-12, max UDMA7
[   37.311968] ata2.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[   37.313968] ata2.00: configured for UDMA/133
[   37.630631] ata3: SATA link down (SStatus 0 SControl 300)
[   37.949937] ata4: SATA link down (SStatus 0 SControl 300)
[   37.950026] scsi 0:0:0:0: Direct-Access     ATA      WDC WD1600JS-00M 10.0 PQ: 0 ANSI: 5
[   37.950136] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[   37.950181] sd 0:0:0:0: [sda] Write Protect is off
[   37.950219] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   37.950227] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   37.950297] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[   37.950340] sd 0:0:0:0: [sda] Write Protect is off
[   37.950378] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   37.950385] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   37.950434]  sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
[   37.984324] sd 0:0:0:0: [sda] Attached SCSI disk
[   37.984459] scsi 1:0:0:0: Direct-Access     ATA      SAMSUNG HD501LJ  CR10 PQ: 0 ANSI: 5
[   37.984568] sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[   37.984612] sd 1:0:0:0: [sdb] Write Protect is off
[   37.984650] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   37.984659] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   37.984729] sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[   37.984772] sd 1:0:0:0: [sdb] Write Protect is off
[   37.984810] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   37.984817] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   37.984866]  sdb: sdb1 sdb2
[   37.994164] sd 1:0:0:0: [sdb] Attached SCSI disk
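
As an aside, the SStatus values in the "SATA link up/down" lines above can be decoded by hand or with a short script. The sketch below is only an illustration. It assumes the kernel prints SStatus in hexadecimal and that the field layout follows the Serial ATA specification as I understand it (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The names decode_sstatus and describe_link are hypothetical helpers, not kernel or libata functions.

```python
import re

# Field meanings as I understand them from the Serial ATA SStatus definition.
DET = {
    0: "no device detected, no PHY communication",
    1: "device detected, PHY communication not established",
    3: "device detected, PHY communication established",
    4: "PHY offline",
}
SPD = {
    0: "no speed negotiated",
    1: "Gen1 (1.5 Gbps)",
    2: "Gen2 (3.0 Gbps)",
    3: "Gen3 (6.0 Gbps)",
}
IPM = {
    0: "device not present or no communication",
    1: "active",
    2: "partial",
    6: "slumber",
}

# Matches e.g. "ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
LINK_LINE = re.compile(
    r"(ata\d+): SATA link (up|down).*?"
    r"\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)"
)


def decode_sstatus(value):
    """Split a raw SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


def describe_link(line):
    """Return (port, state, det, spd, ipm) for a link status line, else None."""
    m = LINK_LINE.search(line)
    if m is None:
        return None
    fields = decode_sstatus(int(m.group(3), 16))  # assumed to be hex
    return (
        m.group(1),
        m.group(2),
        DET.get(fields["det"], "reserved"),
        SPD.get(fields["spd"], "reserved"),
        IPM.get(fields["ipm"], "reserved"),
    )


if __name__ == "__main__":
    print(describe_link("[   36.659406] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"))
    print(describe_link("[   37.630631] ata3: SATA link down (SStatus 0 SControl 300)"))
```

Under those assumptions, SStatus 123 on ata1 and ata2 reads as an established Gen2 link in the active power state, which matches the "3.0 Gbps" the kernel reports, and SStatus 0 on ata3 and ata4 simply means nothing is plugged in.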

The most obvious change is the interrupt: IRQ 23 instead of 315 (non-RAID
mode, without pci=nomsi) or 218 (SystemRescueCd 1.0, AHCI setting, without
pci=nomsi).
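
To compare the interrupt assignment between two boots without reading both logs line by line, something like the following Python sketch could be used. It is only an illustration: port_irqs and changed_irqs are hypothetical helper names, and the pattern is modelled on the "ataN: SATA max ... irq N" lines quoted in this mail, so other kernel versions may print these lines differently.

```python
import re

# Matches e.g.
# "ata1: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc100 irq 23"
# "[^\[]*?" also spans a line break, so lines wrapped by a mail client
# (with "irq 23" pushed onto the next line) are still recognised, while the
# "[" of the next timestamp stops the match from running into another line.
PORT_IRQ = re.compile(r"(ata\d+): SATA max \S+ [^\[]*?\birq\s+(\d+)")

# Two lines from the log in this mail, one of them wrapped as it arrived.
SAMPLE = (
    "[   36.030333] ata1: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc100 \n"
    "irq 23\n"
    "[   36.030381] ata2: SATA max UDMA/133 abar m8192@...9dfc000 port 0xf9dfc180 irq 23\n"
)


def port_irqs(dmesg_text):
    """Map each ATA port name to the IRQ number printed for it."""
    return {port: int(irq) for port, irq in PORT_IRQ.findall(dmesg_text)}


def changed_irqs(before, after):
    """Return {port: (old_irq, new_irq)} for ports present in both boots
    whose IRQ number differs."""
    return {
        port: (before[port], after[port])
        for port in sorted(before.keys() & after.keys())
        if before[port] != after[port]
    }


if __name__ == "__main__":
    print(port_irqs(SAMPLE))
```

Feeding it the dmesg of a boot with and without the option, then passing both results to changed_irqs, lists only the ports whose interrupt actually moved.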

full dmesg is attached.

Thanks for looking into this mess.

Glück Auf,
Volker

View attachment "dmesg_2.6.24.3_nosmi" of type "text/plain" (25063 bytes)
