Message-Id: <200611151227.04777.dada1@cosmosbay.com>
Date:	Wed, 15 Nov 2006 12:27:04 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	akpm@...l.org
Cc:	Arjan van de Ven <arjan@...radead.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>, ak@...e.de,
	mingo@...e.hu, linux-kernel@...r.kernel.org
Subject: [PATCH] i386-pda UP optimization

Seeing that the i386 port now uses %gs prefixes, I recalled some strange
oprofile results on Opteron machines.

I really think %gs prefixes can be expensive in some (most?) cases, even if
the Intel/AMD docs say they are free.

I wrote this trivial user-space program to benchmark vfs_read()/vfs_write(),
which happen to use 'current' many times.

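#include <unistd.h>
#include <errno.h>

int main(void)
{
        int i, fd[2];
        char c = 0;

        if (pipe(fd) != 0)
                return 1;
        for (i = 0; i < 10000000; i++) {
                errno = 0; /* glibc also uses %gs (errno is thread-local) */
                write(fd[1], &c, 1); /* one byte in: exercises vfs_write() */
                read(fd[0], &c, 1);  /* one byte out: exercises vfs_read() */
        }
        return 0;
}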

The best elapsed time I got for this program over 10 runs was 12.811 s
(Intel(R) Pentium(R) M processor 1.60GHz).

With the attached patch, I got 12.212 s, and a kernel text size reduction of 
3400 bytes.

I wish Jeremy would give us patches for UP machines so that %gs can be left
untouched in entry.S (syscall entry/exit). A lot of ia32 machines still have
only one CPU.

Note: I don't have an x86_64 machine here, but I suspect a similar patch could
be done for x86_64 too.

Thank you

[PATCH] i386-pda UP optimization

On a !CONFIG_SMP machine, there is only one PDA (one CPU).
We can avoid %gs prefixes when reading/writing fields in the PDA.
This reduces kernel text size and also gives better performance.

Signed-off-by: Eric Dumazet <dada1@...mosbay.com>

View attachment "i386-pda-up.patch" of type "text/plain" (2183 bytes)
