Message-ID: <48694B3B.3010600@goop.org>
Date: Mon, 30 Jun 2008 14:08:11 -0700
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: Mike Travis <travis@....com>, "H. Peter Anvin" <hpa@...or.com>,
Christoph Lameter <clameter@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [crash, bisected] Re: [PATCH 3/4] x86_64: Fold pda into per cpu area

Eric W. Biederman wrote:
> Mike Travis <travis@....com> writes:
>
>
>> H. Peter Anvin wrote:
>>
>>> Mike Travis wrote:
>>>
>>>> FYI, I did try this out and it caused the bootloader to scramble the
>>>> loaded data. The first corruption I found was the .x86cpuvendor.init
>>>> section contained all zeroes.
>>>>
>>>>
>>> Explain what you mean with "the bootloader" in this context.
>>>
>>> -hpa
>>>
>> After the code was loaded (the compressed code, it seems that my GRUB
>> doesn't support uncompressed loading), the above section contained
>> zeroes. I snapped it fairly early, around secondary_startup_64, and
>> then printed it in x86_64_start_kernel.
>>
>> The object file had the correct data (as displayed by objdump) so I'm
>> assuming that the bootloading process didn't load the section correctly.
>>
>> Below was the linker script I used:
>>
>> --- linux-2.6.tip.orig/include/asm-generic/vmlinux.lds.h
>> +++ linux-2.6.tip/include/asm-generic/vmlinux.lds.h
>> @@ -373,9 +373,13 @@
>>
>> #ifdef CONFIG_HAVE_ZERO_BASED_PER_CPU
>> #define PERCPU(align) \
>> - . = ALIGN(align); \
>> + .data.percpu.abs = .; \
>> percpu : { } :percpu \
>> - __per_cpu_load = .; \
>> + .data.percpu.rel : AT(.data.percpu.abs - LOAD_OFFSET) { \
>> + BYTE(0) \
>> + . = ALIGN(align); \
>> + __per_cpu_load = .; \
>> + } \
>> .data.percpu 0 : AT(__per_cpu_load - LOAD_OFFSET) { \
>> *(.data.percpu.first) \
>> *(.data.percpu.shared_aligned) \
>> @@ -383,8 +387,8 @@
>> *(.data.percpu.page_aligned) \
>> ____per_cpu_size = .; \
>> } \
>> - . = __per_cpu_load + ____per_cpu_size; \
>> - data : { } :data
>> + . = __per_cpu_load + ____per_cpu_size;
>> +
>> #else
>> #define PERCPU(align) \
>> . = ALIGN(align); \
>>
>> It showed all the correct address in the map and __per_cpu_load was a
>> relative symbol (which was the objective.)
>>
>> Btw, our simulator, which only loads uncompressed code, had the data correct,
>> so it *may* only be a result of the code being compressed.
>>
>
> Weird. Grub doesn't get involved in the decompression; the kernel does it
> all itself, so we should be able to track where things go bad.
>
> Last I looked, the compressed code was formed by essentially:
> objcopy vmlinux -O binary vmlinux.bin
> gzip vmlinux.bin
> And then we tack a magic header onto the gzip-compressed file.
>
> Are things only bad with the change above?

No, the original crash being discussed was a GP fault in head_64.S as it
tries to initialize the kernel segments. The cause was that the
prototype GDT is all zero, even though it's an initialized variable and
inspection of vmlinux shows that it has the right contents. So somehow
it's either 1) getting zeroed on load, or 2) being loaded to the wrong place.
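
One rough way to tell those two cases apart would be to look at the flat image that objcopy produces and see what actually sits where the GDT should be. The Python sketch below is only an illustration: it assumes that `objcopy -O binary` starts the image at the lowest load address (LMA) among the copied sections, and every name and address in it (`Section`, `flat_offset`, `is_all_zero`, `LAYOUT`) is hypothetical rather than taken from a real vmlinux.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    vma: int   # address the contents are linked to run at
    lma: int   # address the contents are loaded at
    size: int


# Hypothetical layout: a zero-based per-cpu section whose VMA is 0
# but whose load address follows the kernel text.
LAYOUT = [
    Section(".text", 0xFFFFFFFF80200000, 0x200000, 0x1000),
    Section(".data.percpu", 0x0, 0x201000, 0x100),
]


def flat_offset(sections, name, offset_in_section=0):
    """Offset of a byte of `name` inside a flat binary image.

    Assumes the image begins at the lowest LMA of all sections,
    so placement depends on the LMA and never on the VMA.
    """
    base = min(s.lma for s in sections)
    sec = next(s for s in sections if s.name == name)
    if not 0 <= offset_in_section < sec.size:
        raise ValueError("offset lies outside the section")
    return sec.lma - base + offset_in_section


def is_all_zero(image, offset, length):
    """True if image[offset:offset+length] contains only zero bytes."""
    chunk = image[offset:offset + length]
    if len(chunk) != length:
        raise ValueError("range extends past the end of the image")
    return not any(chunk)
```

If the bytes at the computed offset are non-zero in vmlinux.bin but zero in memory after boot, that would point at the load or decompression step; if they are already zero in the image, it would point at the link or objcopy step.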

The zero-based PDA mechanism requires the introduction of a new ELF
segment based at vaddr 0, which is sufficiently unusual that it wouldn't
surprise me if it's triggering some toolchain bug.
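
To see whether a given vmlinux really carries such a segment, the program headers can be inspected directly. The following is a minimal Python sketch that assumes a little-endian ELF64 file and uses the standard ELF64 header and program header layouts; the helper names (`Segment`, `load_segments`, `zero_based`) are made up for illustration.

```python
import struct
from typing import NamedTuple

PT_LOAD = 1
# ELF64 file header (64 bytes) and program header (56 bytes), little-endian.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
PHDR = struct.Struct("<IIQQQQQQ")


class Segment(NamedTuple):
    vaddr: int
    paddr: int
    offset: int
    filesz: int
    memsz: int


def load_segments(elf: bytes):
    """Return the PT_LOAD segments of a little-endian ELF64 image."""
    fields = EHDR.unpack_from(elf, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    phoff, phentsize, phnum = fields[5], fields[9], fields[10]
    segments = []
    for i in range(phnum):
        (p_type, _flags, p_offset, p_vaddr, p_paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(elf, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append(Segment(p_vaddr, p_paddr, p_offset, p_filesz, p_memsz))
    return segments


def zero_based(segments):
    """Segments linked at virtual address 0, such as a zero-based per-cpu area."""
    return [s for s in segments if s.vaddr == 0]
```

Reading the file with `open("vmlinux", "rb").read()` and passing the bytes to `load_segments` should give roughly the same information as the LOAD lines of `readelf -l vmlinux`, with the vaddr-0 segment singled out by `zero_based`.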

Mike: what would happen if the PDA were based at 4k rather than 0? The
stack canary would still be at its small offset (0x20?), but it doesn't
need to be initialized. I'm not sure if doing so would fix anything,
however.

    J