[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20060827172155.GA21724@rhlx01.fht-esslingen.de>
Date: Sun, 27 Aug 2006 19:21:55 +0200
From: Andreas Mohr <andi@...x01.fht-esslingen.de>
To: Jeremy Fitzhardinge <jeremy@...p.org>
Cc: linux-kernel@...r.kernel.org,
Chuck Ebbert <76306.1226@...puserve.com>,
Zachary Amsden <zach@...are.com>,
Jan Beulich <jbeulich@...ell.com>, Andi Kleen <ak@...e.de>,
Andrew Morton <akpm@...l.org>
Subject: Re: [PATCH RFC 0/6] Implement per-processor data areas for i386.
Hi,
On Sun, Aug 27, 2006 at 01:44:17AM -0700, Jeremy Fitzhardinge wrote:
> This patch implements per-processor data areas by using %gs as the
> base segment of the per-processor memory. This has two principle
> advantages:
>
> - It allows very simple direct access to per-processor data by
> effectively using an effective address of the form %gs:offset, where
> offset is the offset into struct i386_pda. These sequences are faster
> and smaller than the current mechanism using current_thread_info().
Yess!!
Something like that had to be done eventually about the inefficient
current_thread_info() mechanism, but I wasn't sure what exactly.
> I haven't measured performance yet, but when using the PDA for "current"
> and "smp_processor_id", I see a 5715 byte reduction in .text segment
> size for my kernel.
This is interesting, since even by doing a non-elegant
current->... --> struct task_struct *tsk = current;
replacement for excessive uses of current, I was able to gain almost 10kB
within a single file already!
I guess it's due to having tried that on an older installation with gcc 3.2,
which probably does less efficient opcode merging of current_thread_info()
requests compared to a current gcc version.
IOW, .text segment reduction could be quite a bit higher for older gcc:s.
> This uses the x86 segmentation stuff in a way similar to NPTL's way of
> implementing Thread-Local Storage. It relies on the fact that each CPU
> has its own Global Descriptor Table (GDT), which is basically an array
> of base-length pairs (with some extra stuff). When a segment register
> is loaded with a descriptor (approximately, an index in the GDT), and
> you use that segment register for memory access, the address has the
> base added to it, and the resulting address is used.
Not a problem for more daring user-space apps (i.e. Wine), I hope?
Andreas Mohr
--
No programming skills!? Why not help translate many Linux applications!
https://launchpad.ubuntu.com/rosetta
(or alternatively buy nicely packaged Linux distros/OSS software to help
support Linux developers creating shiny new things for you?)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists