linux-kernel - Re: Linux 2.6.29-rc1 MAJOR advisory

Date:	Sun, 11 Jan 2009 23:15:56 -0800
From:	"Justin P. Mattock" <justinmattock@...il.com>
To:	Torsten Kaiser <just.for.lkml@...glemail.com>
CC:	Gene Heskett <gene.heskett@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29-rc1 MAJOR advisory

Torsten Kaiser wrote:
> On Mon, Jan 12, 2009 at 3:50 AM, Gene Heskett <gene.heskett@...il.com> wrote:
>   
>> On Sunday 11 January 2009, Torsten Kaiser wrote:
>>     
>>> On Sun, Jan 11, 2009 at 9:10 PM, Gene Heskett <gene.heskett@...il.com> wrote:
>>>       
>>>> I don't believe it is.  MAJOR problem. I have an ASUS M2N-SLI Deluxe
>>>> motherboard I paid about $275 for in late Sept 2008, and one attempt to
>>>> boot the 2.6.29-rc1 I had built destroyed the MCP55 eth0 port, no power on
>>>> the port at all now, and I've rebooted to 2.6.28, still no eth0, so I have
>>>> now enabled in the bios and am using the 2nd & last eth1 port on this
>>>> mobo.
>>>>         
>>> I have also an ASUS MCP55 board, a KFN5-D.
>>>
>>> To save the crash I reported in the "[git pull] x86 fixes" thread, I
>>> had to boot the patch -rc1 a second time.
>>> After saving the Oops on my second pc I rebooted my test system (the
>>> one with the MCP55) into 2.6.28 and the boot process hung as it wanted
>>> to mount its NFS filesystems. Trying to connect from the second system
>>> failed, not even a ping reply.
>>> But: Just removing the ethernet cable and immediately reconnecting it
>>> seemed to have kicked my MCP55 ethernet port back in working order.
>>>
>>>       
>> I unplugged it and plugged it back in a couple of times.
>>     
>
> I just wanted to report my observations as that might help somebody to
> debug this problem.
> I assumed that you already tried unplugging the ethernet cable and
> would only suggest a complete powerdown instead of only rebooting, but
> as you wrote later in your mail, you already tried that. :-(
>
>   
>> Absolutely NO led
>> activity in the connector was observed, but since this board has 2 ethernet
>> ports, the other port lit up like the 4th of July when I stuck the cable
>> into it.  So I rebooted, and enabled that port in the bios, then booted 2.6.28,
>> copied /etc/sysconfig/network-scripts/ifcfg-eth0 to ifcfg-eth1, edited it to
>> call itself eth1 without even changing the mac address, did a
>> 'service network restart' which reported a failure downing eth0, then another
>> upping it, and success upping eth1.  Pinged yahoo, works.
>>
>> I will call my friend at the shop where I bought all this and see if he can
>> arrange a preship of another board since ASUS has a year's warranty.  But to
>> me, it's pretty fishy that it was working normally when I shut down 2.6.28,
>> failed on the boot to 2.6.29-rc1, twice, and was still dead when 2.6.28
>> was rebooted.  That points an awfully straight and strong finger at 2.6.29-rc1.
>>
>>     
>>> No fishy things in the syslog...
>>>       
>> As you can see in the dmesg I attached, I had problems from the gitgo.
>> But just for grins, I'll check messages too, for the first boot, hang on a sec.
>>
>> First was the usual your bios is crap, fixing it notice, then:
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI: RSDP 000F7D20, 0024 (r2 Nvidia)
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI: XSDT DFEE3100, 004C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI: FACP DFEEADC0, 00F4 (r3 Nvidia ASUSACPI 42302E31 AWRD 0)
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in Pm1aEventBlock: 32/8 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in Pm1aControlBlock: 16/8 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in PmTimerBlock: 32/8 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in Gpe0Block: 64/8 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in Gpe1Block: 128/8 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0412): Invalid length for Pm1aEventBlock: 8, using default 32 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0412): Invalid length for Pm1aControlBlock: 8, using default 16 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] ACPI Warning (tbfadt-0412): Invalid length for PmTimerBlock: 8, using default 32 [20081204]
>> Jan 11 14:15:13 coyote kernel: [    0.000000] FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)
>>
>> No idea what that means.
>>     
>
> Something is wrong with your ACPI tables.
> That might have several causes:
> a) a new check in 2.6.29-rc1 that now detects an error that was always there
> b) the cause for the dead eth0 is a corruption in your bios
>
> For me I see this with 2.6.29-rc1:
> [    0.000000] FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)
>
> 2.6.28 does not complain about that.
>
> It would probably be good to compare the boot messages from an older
> kernel when eth0 was still working with the boot messages from the
> same kernel now that the port is dead.
> Any differences might give a better clue than all the errors from 2.6.29...
>
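A comparison like the one suggested above is easier if the syslog prefix and the printk timestamp are stripped first, since those differ on every boot. What follows is only a rough sketch: the function names `normalize` and `diff_boot_logs` and the `keyword` filter are made up for illustration, and the prefix pattern assumes log lines shaped like the ones quoted in this thread.

```python
import difflib
import re

# Optional syslog prefix ("Jan 11 14:15:13 coyote kernel: ") followed by an
# optional printk timestamp ("[   18.231257] "). Assumed from the lines
# quoted in this thread; other syslog formats may need a different pattern.
_PREFIX = re.compile(
    r"^(?:[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+kernel:\s+)?"
    r"(?:\[\s*\d+\.\d+\]\s?)?"
)


def normalize(line):
    """Drop the syslog prefix and timestamp so only the message remains."""
    return _PREFIX.sub("", line.rstrip("\n"), count=1)


def diff_boot_logs(good_lines, bad_lines, keyword=None):
    """Return unified-diff lines between two boot logs.

    If keyword is given (for example "eth0"), only messages containing it
    are compared.
    """
    good = [normalize(l) for l in good_lines]
    bad = [normalize(l) for l in bad_lines]
    if keyword is not None:
        good = [l for l in good if keyword in l]
        bad = [l for l in bad if keyword in l]
    return list(
        difflib.unified_diff(good, bad, "good", "bad", lineterm="", n=0)
    )
```

Feeding it the saved dmesg output of a known good boot and of the current boot, with `keyword="eth0"` or `keyword="forcedeth"`, would reduce the output to the lines that actually changed.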
>   
>> Then:
>>     
> [snip]
>   
>> Jan 11 14:15:13 coyote kernel: [   18.231257] Oops: 0002 [#1] PREEMPT SMP
>>     
> [snip]
>   
>> Jan 11 14:15:13 coyote kernel: [   18.232006] Pid: 1724, comm: modprobe Tainted: G        W  (2.6.29-rc1 #1)
>>     
> [snip]
>   
>> Now the above claims I am 'tainted' but I am not.
>>     
>
> The taint is because this is the second problem that the kernel
> encountered; the first one, from the log you attached to your first post,
> was:
> [    0.000000] ------------[ cut here ]------------
> [    0.000000] WARNING: at arch/x86/kernel/cpu/mtrr/generic.c:398 generic_get_mtrr+0x11c/0x130()
> [    0.000000] Hardware name: System Product Name
> [    0.000000] mtrr: your BIOS has set up an incorrect mask, fixing it up.
> [    0.000000] Modules linked in:
> [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.29-rc1 #1
> [    0.000000] Call Trace:
> [    0.000000]  [<c0428939>] warn_slowpath+0x99/0xc0
> [    0.000000]  [<c043f561>] up+0x11/0x40
> [    0.000000]  [<c042906d>] release_console_sem+0x18d/0x1c0
> [    0.000000]  [<c0600020>] iret_exc+0x1dc/0x882
> [    0.000000]  [<c06ffac1>] dmi_string_nosave+0x51/0x70
> [    0.000000]  [<c042962b>] printk+0x1b/0x20
> [    0.000000]  [<c041ba8c>] pat_init+0x7c/0xa0
> [    0.000000]  [<c040fd5c>] generic_get_mtrr+0x11c/0x130
> [    0.000000]  [<c06ea68b>] mtrr_trim_uncached_memory+0x7b/0x360
> [    0.000000]  [<c06eaa41>] mtrr_bp_init+0xd1/0x700
> [    0.000000]  [<c042962b>] printk+0x1b/0x20
> [    0.000000]  [<c06e7125>] e820_end_pfn+0xc5/0xf0
> [    0.000000]  [<c06e5689>] setup_arch+0x409/0x980
> [    0.000000]  [<c06e864b>] reserve_early_overlap_ok+0x4b/0x60
> [    0.000000]  [<c06e1a0a>] start_kernel+0x6a/0x2e0
> [    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
>
> The W-Taint is there to notify that there already was some (other) problem.
>
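To make the "G        W" field in the Oops line above less mysterious, here is a small sketch that pulls the taint letters out of such a line and prints what they stand for. The helper name `explain_taint` is hypothetical, and the table is a partial summary of the kernel's taint flags written from memory, not the full list; letters it does not know are reported as "unknown".

```python
import re

# Partial table of taint letters (an assumption-laden summary, not complete).
TAINT_LETTERS = {
    "G": "no proprietary module loaded (not a problem by itself)",
    "P": "a proprietary module was loaded",
    "F": "a module was force-loaded",
    "R": "a module was force-unloaded",
    "M": "a machine check exception occurred",
    "B": "a bad page was found",
    "D": "the kernel has already oopsed or died once",
    "W": "a kernel warning (WARNING: at ...) was issued earlier",
}


def explain_taint(line):
    """Return (letter, meaning) pairs for the 'Tainted:' field of an Oops line.

    Returns an empty list for lines without that field, such as
    '... Not tainted 2.6.29-rc1 #1'.
    """
    match = re.search(r"Tainted:\s*([A-Z ]+)", line)
    if match is None:
        return []
    letters = [c for c in match.group(1) if c != " "]
    return [(c, TAINT_LETTERS.get(c, "unknown")) for c in letters]


if __name__ == "__main__":
    oops = "Pid: 1724, comm: modprobe Tainted: G        W  (2.6.29-rc1 #1)"
    for letter, meaning in explain_taint(oops):
        print(letter, "-", meaning)
```

So the "G" only says that no proprietary module is loaded, and the "W" is what marks the kernel as tainted here.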
>   
>> Then, just about 10 lines later:
>> Jan 11 14:20:57 coyote kernel: [   22.214383] eth0: no link during initialization.
>> Jan 11 14:20:57 coyote kernel: [   23.784997] eth0: link up.
>>
>> But it wasn't. ifconfig said it was, but no pings worked.  So I fixed the
>> ifcfg-eth1 script to run, rebooted, enabling the other port in the bios as I
>> did so, and here I am.  The one thing I haven't done is a full powerdown,
>> which is next.
>>
>> And I have now done that full powerdown reset, but the eth0 port is still dark
>> and powerless.
>>
>> And this is all that showed up in dmesg's output when I moved the cable back
>> for a few seconds:
>>
>> [  135.940984] eth1: link down.
>> [  145.266298] eth1: link up.
>>
>> No note that a cable had been plugged into eth0. But it was.
>>     
>
> Any other messages about eth0 in the syslog?
> It would probably be most helpful, if you could compare any messages
> about eth0/ the MCP55 from a known good boot with the current
> output...
>
>   
>> This could be a 'co-inky dance', but my almost 60 years in electronics
>> troubleshooting says there is a connection.
>>
>> I sure won't reboot to 2.6.29-rc1 again until I have a replacement
>> motherboard sitting next to me, I don't want to wreck the last port
>> cuz I have no slots left to stick a nic in this one.
>>     
>
> Torsten
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
>   
It's just a warning (or something like that).
Anyway, I have three machines:
Dell Inspiron 1200 (old, I know, but it works)
Dell x200 (200-something, slower than sh*t)
MacBook Pro (ATI chipset)
All three display the same message:

FADT: X_PM1a_EVT_BLK.bit_width etc...

maybe the table is too sensitive!

regards;

Justin P. Mattock 


