Message-Id: <200611151227.04777.dada1@cosmosbay.com>
Date: Wed, 15 Nov 2006 12:27:04 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: akpm@...l.org
Cc: Arjan van de Ven <arjan@...radead.org>,
Jeremy Fitzhardinge <jeremy@...p.org>, ak@...e.de,
mingo@...e.hu, linux-kernel@...r.kernel.org
Subject: [PATCH] i386-pda UP optimization
Seeing %gs prefixes now used by the i386 port, I recalled seeing strange oprofile
results on Opteron machines.
I really think %gs prefixes can be expensive in some (most?) cases, even if
the Intel/AMD docs say they are free.
I wrote this trivial user program to benchmark vfs_read()/vfs_write(), which
happen to use 'current' many times.
    #include <unistd.h>
    #include <errno.h>

    int main(void)
    {
        int i, fd[2];
        char c = 0;

        if (pipe(fd) < 0)   // bail out if the pipe cannot be created
            return 1;
        for (i = 0; i < 10000000; i++) {
            errno = 0;      // glibc also uses %gs
            write(fd[1], &c, 1);
            read(fd[0], &c, 1);
        }
        return 0;
    }
The best elapsed time I got for this program over 10 runs was 12.811 s
(Intel(R) Pentium(R) M processor 1.60GHz)
With the attached patch, I got 12.212 s, and a kernel text size reduction of
3400 bytes.
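The "best of 10 runs" figure above could also be collected inside the program
rather than by timing the whole process. What follows is only a sketch under
that assumption: time_pipe_loop() and best_of_runs() are hypothetical helper
names, and the use of clock_gettime(CLOCK_MONOTONIC) is my choice, not
necessarily how the numbers in this mail were measured.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: time `iterations` one-byte write/read round trips
 * through a pipe. Returns elapsed wall-clock seconds, or -1.0 on error. */
double time_pipe_loop(long iterations)
{
    int fd[2];
    char c = 0;
    struct timespec start, end;
    long i;
    double elapsed = -1.0;

    assert(iterations >= 0);
    if (pipe(fd) < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        /* Stop early if either syscall fails or transfers nothing. */
        if (write(fd[1], &c, 1) != 1 || read(fd[0], &c, 1) != 1)
            break;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (i == iterations)
        elapsed = (double)(end.tv_sec - start.tv_sec)
                + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    close(fd[0]);
    close(fd[1]);
    return elapsed;
}

/* Hypothetical helper: smallest elapsed time over `runs` runs.
 * Returns -1.0 if runs <= 0 or if any run failed. */
double best_of_runs(int runs, long iterations)
{
    double best = -1.0;
    int r;

    for (r = 0; r < runs; r++) {
        double t = time_pipe_loop(iterations);
        if (t < 0.0)
            return -1.0;
        if (best < 0.0 || t < best)
            best = t;
    }
    return best;
}
```

Calling best_of_runs(10, 10000000L) would mirror the loop count and run count
used above. For the figures quoted, 12.811 s versus 12.212 s is a difference
of 0.599 s, roughly 4.7%.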
I wish Jeremy would give us patches for UP machines so that %gs can be left untouched
in entry.S (syscall entry/exit). A lot of ia32 machines are still using one
CPU.
Note: I don't have an x86_64 machine here, but I suspect a similar patch could
be done for x86_64 too.
Thank you
[PATCH] i386-pda UP optimization
On a !CONFIG_SMP machine, there is only one PDA (one CPU).
We can avoid %gs prefixes when reading/writing fields in PDA.
This reduces kernel text size and also gives better performance.
Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
View attachment "i386-pda-up.patch" of type "text/plain" (2183 bytes)
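The attachment is not reproduced in this message, so the following is only a
user-space sketch of the idea in the changelog, not the patch itself. All
names (struct pda_sketch, read_pda_sketch, write_pda_sketch, SKETCH_SMP,
the_pda) are hypothetical, and a plain pointer stands in for the %gs-based
addressing that the kernel uses on SMP.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU data area; the field names are
 * illustrative and not taken from the attached patch. */
struct pda_sketch {
    int cpu_number;
    void *pcurrent;     /* stand-in for the 'current' task pointer */
};

#ifdef SKETCH_SMP
/* SMP-like case: one PDA per CPU, reached through a base pointer.
 * In the kernel the access goes through the %gs segment; here a plain
 * pointer models that extra indirection. */
#define SKETCH_NR_CPUS 4
struct pda_sketch pda_table[SKETCH_NR_CPUS];
struct pda_sketch *pda_base = &pda_table[0];
#define read_pda_sketch(field)       (pda_base->field)
#define write_pda_sketch(field, val) (pda_base->field = (val))
#else
/* UP case: only one PDA exists, so its fields can be addressed directly
 * as members of a single global, with no segment prefix or base pointer. */
struct pda_sketch the_pda;
#define read_pda_sketch(field)       (the_pda.field)
#define write_pda_sketch(field, val) (the_pda.field = (val))
#endif

/* Both variants expose the same accessors, so callers do not change. */
int sketch_current_cpu(void)
{
    return read_pda_sketch(cpu_number);
}

void sketch_set_cpu(int cpu)
{
    assert(cpu >= 0);
    write_pda_sketch(cpu_number, cpu);
}
```

Built without SKETCH_SMP defined, every accessor compiles to a direct
reference to the_pda, which is the user-space analogue of dropping the %gs
prefix on a UP kernel.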