Message-ID: <487507A1.2020100@goop.org>
Date: Wed, 09 Jul 2008 11:46:57 -0700
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Mike Travis <travis@....com>
CC: Christoph Lameter <cl@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"H. Peter Anvin" <hpa@...or.com>, Jack Steiner <steiner@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC 00/15] x86_64: Optimize percpu accesses

Mike Travis wrote:
> Christoph Lameter wrote:
>
>> Mike Travis wrote:
>>
>>
>>> I think Jeremy's point is that by removing the pda struct entirely, the
>>> references to the fields can be the same for both x86_32 and x86_64.
>>>
>> That is going to be difficult. The GS register is tied up for the pda area
>> as long as you have it. And you cannot get rid of the pda because of the library
>> compatibility issues. We would break binary compatibility if we would get rid of the pda.
>>
>> If one attempts to remove one field after another then the converted
>> accesses will not be able to use GS relative accesses anymore. This
>> can lead to all sorts of complications.
>>
>> It will be possible to shrink the pda (as long as we maintain the
>> fields that glibc needs) after this patchset because the pda and the
>> per cpu area can both be reached with the GS register. So (apart from
>> undiscovered surprises) the generated code is the same.
>>
>
> Is there a comprehensive list of these library accesses to variables
> offset from %gs, or is it only the "stack_canary"?

It's just the stack canary. These aren't library accesses; it's the code
gcc generates:
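
foo:    subq    $152, %rsp
        movq    %gs:40, %rax            # load the canary from %gs:40
        movq    %rax, 136(%rsp)         # save a copy in the stack frame
        ...
        movq    136(%rsp), %rdx         # reload the saved copy
        xorq    %gs:40, %rdx            # zero iff it still matches the canary
        je      .L3                     # unchanged: return normally
        call    __stack_chk_fail        # changed: stack was overwritten
.L3:
        addq    $152, %rsp
        .p2align 4,,4
        ret
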
There are two irritating things here:

One is that the kernel supports -fstack-protector for x86-64, which
forces us into all these contortions in the first place. We don't
support stack-protector for 32-bit (though gcc does), and things are
much easier there.

The other, somewhat orthogonal, irritation is the fixed "40". If gcc
had generated %gs:__gcc_stack_canary instead, we could alias that to a
per-cpu variable like anything else and the whole problem would go
away, and we could support stack-protector on 32-bit with no problems.
Normal usermode could define __gcc_stack_canary to be a weak symbol
with value "40" (20 on 32-bit) for backwards compatibility.

I'm close to proposing that we run a post-processor over the generated
assembly to perform the %gs:40 -> %gs:__gcc_stack_canary transformation
and deal with it that way.

    J
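
For illustration only, the %gs:40 -> %gs:__gcc_stack_canary
transformation mentioned above could be sketched roughly as follows.
This is not code from the thread. The function names rewrite_line and
rewrite_asm are hypothetical, and the sketch assumes that the canary
reference always appears literally as "%gs:40" in the compiler output
and that nothing else legitimately uses that offset; a real tool would
need to verify both.

```python
import re

# Assumption: gcc always spells the x86-64 canary reference as "%gs:40".
# The lookahead keeps longer offsets such as "%gs:400" untouched.
CANARY_REF = re.compile(r"%gs:40(?![0-9A-Za-z_])")

# Symbol name taken from the proposal in the message above.
CANARY_SYMBOL = "__gcc_stack_canary"


def rewrite_line(line: str) -> str:
    """Replace the fixed canary offset in one line of assembly."""
    return CANARY_REF.sub("%gs:" + CANARY_SYMBOL, line)


def rewrite_asm(text: str) -> str:
    """Apply rewrite_line to every line, preserving line endings."""
    return "".join(rewrite_line(l) for l in text.splitlines(keepends=True))
```

Such a filter would sit between the compiler's assembly output and the
assembler. Note that it is purely textual: it cannot tell a canary
access from any other access that happens to use %gs:40.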