lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 09 Apr 2008 16:29:53 +0200
From:	Tilman Schmidt <tilman@...p.cc>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org,
	jgarzik@...ox.com
Subject: Re: 2.6.25-rc8-mm1: Intel SATA boot failure

This is taking longer than I hoped, so here's a little progress report.

On Thu, 3 Apr 2008 16:17:33 -0700, Andrew Morton wrote:
> On Fri, 04 Apr 2008 01:08:19 +0200
> Tilman Schmidt <tilman@...p.cc> wrote:
> 
>> This fails to come up on my development machine, apparently because it
>> has trouble accessing the SATA hard disks.
>> Hardware: Intel Pentium D940, Intel DQ965GF board, two SATA hard disks.
>> Some unusual things I noticed during the boot process:
>> 
>> - a message "doing fast boot" that looked unfamiliar; unfortunately
>>   it scrolled off too quickly to note its context

That message doesn't make it into dmesg. It's apparently a Suse thing,
and perhaps normal, so let's ignore that for now.

>> - for each of the two SATA ports in use, a message
>>   "SATA port is slow to respond, please be patient"
>>   accompanied by about 10 secs wait

These messages seem to be a separate issue. I also get them with
a .config that otherwise brings up the system successfully. That
allowed me to capture a dmesg, so here are some possibly interesting
hunks of the diff between a mainline kernel and a working 2.6.25-rc8-mm1
one:

--- dmesg-2.6.25-rc8-git.nots-reordered 2008-04-09 15:29:52.000000000 +0200
+++ dmesg-2.6.25-rc8-mm1.nots   2008-04-09 00:48:42.000000000 +0200
@@ -1,4 +1,4 @@
- Linux version 2.6.25-rc8-testing-00210-g51ac03f (ts@...on) (gcc version 4.2.1 (SUSE Linux)) #37 SMP PREEMPT Wed Apr 9 01:27:07 CEST 2008
+ Linux version 2.6.25-rc8-mm1-testing (ts@...on) (gcc version 4.2.1 (SUSE Linux)) #6 SMP PREEMPT Wed Apr 9 00:24:23 CEST 2008
   BIOS-provided physical RAM map:
    BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
    BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)

[...]

@@ -244,12 +277,10 @@
   CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
   CPU1: Thermal monitoring enabled
   CPU1: Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
- Total of 2 processors activated (12796.06 BogoMIPS).
+ Total of 2 processors activated (12796.87 BogoMIPS).
   ENABLING IO-APIC IRQs
   ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
- checking TSC synchronization [CPU#0 -> CPU#1]:
- Measured 560 cycles TSC warp between CPUs, turning off TSC clock.
- Marking TSC unstable due to: check_tsc_sync_source failed.
+ checking TSC synchronization [CPU#0 -> CPU#1]: passed.
   Brought up 2 CPUs
   CPU0 attaching sched-domain:
    domain 0: span 03

[Nice - at last a kernel that likes my TSC; not sure if it matters though.]

@@ -846,26 +880,36 @@
   PCI: Setting latency timer of device 0000:00:1f.2 to 64
   scsi0 : ahci
   PM: Adding info for No Bus:host0
+ PM: Adding info for No Bus:host0
   scsi1 : ahci
   PM: Adding info for No Bus:host1
+ PM: Adding info for No Bus:host1
   scsi2 : ahci
   PM: Adding info for No Bus:host2
+ PM: Adding info for No Bus:host2
   scsi3 : ahci
   PM: Adding info for No Bus:host3
+ PM: Adding info for No Bus:host3
   scsi4 : ahci
   PM: Adding info for No Bus:host4
+ PM: Adding info for No Bus:host4
   scsi5 : ahci
   PM: Adding info for No Bus:host5
+ PM: Adding info for No Bus:host5
   ata1: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25100 irq 217
   ata2: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25180 irq 217
   ata3: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25200 irq 217
   ata4: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25280 irq 217
   ata5: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25300 irq 217
   ata6: SATA max UDMA/133 abar m2048@...2c25000 port 0x92c25380 irq 217
+ ata1: port is slow to respond, please be patient (Status 0x80)
+ ata1: COMRESET failed (errno=-16)
   ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
   ata1.00: ATA-7: ST380811AS, 3.AAE, max UDMA/133
   ata1.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
   ata1.00: configured for UDMA/133
+ ata2: port is slow to respond, please be patient (Status 0x80)
+ ata2: COMRESET failed (errno=-16)
   ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
   ata2.00: ATA-7: ST380811AS, 3.AAE, max UDMA/133
   ata2.00: 156301488 sectors, multi 0: LBA48 NCQ (depth 31/32)
@@ -877,15 +921,22 @@
   PM: Adding info for No Bus:target0:0:0
   scsi 0:0:0:0: Direct-Access     ATA      ST380811AS       3.AA PQ: 0 ANSI: 5
   PM: Adding info for scsi:0:0:0:0
+ PM: Adding info for No Bus:0:0:0:0
+ PM: Adding info for No Bus:0:0:0:0
   PM: Adding info for No Bus:target1:0:0
   scsi 1:0:0:0: Direct-Access     ATA      ST380811AS       3.AA PQ: 0 ANSI: 5
   PM: Adding info for scsi:1:0:0:0
+ PM: Adding info for No Bus:1:0:0:0
+ PM: Adding info for No Bus:1:0:0:0
+ modprobe used greatest stack depth: 2200 bytes left
   ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 17 (level, low) -> IRQ 17
   PCI: Setting latency timer of device 0000:02:00.0 to 64
   scsi6 : pata_marvell
   PM: Adding info for No Bus:host6
+ PM: Adding info for No Bus:host6
   scsi7 : pata_marvell
   PM: Adding info for No Bus:host7
+ PM: Adding info for No Bus:host7
   ata7: PATA max UDMA/100 cmd 0x2018 ctl 0x2024 bmdma 0x2000 irq 17
   ata8: DUMMY
   BAR5:00:00 01:7F 02:22 03:CA 04:00 05:00 06:00 07:00 08:00 09:00 0A:00 0B:00 0C:01 0D:00 0E:00 0F:00

After that, it quickly diverges because of major differences in the
ordering of device startup. Already, to get the diffs above I had to
shift a few blocks of lines around.

>> Before I try booting that kernel again, any instructions on what to
>> watch out for? Is netconsole usable again?
> 
> Yes, netconsole is usable again ;)

Good to know. I'm now looking in my netconsole setup for the reason
why I don't receive anything from it.

Thanks,
Tilman

-- 
Tilman Schmidt                    E-Mail: tilman@...p.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)


Download attachment "signature.asc" of type "application/pgp-signature" (251 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ