[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20061204.135625.48528445.davem@davemloft.net>
Date: Mon, 04 Dec 2006 13:56:25 -0800 (PST)
From: David Miller <davem@...emloft.net>
To: dada1@...mosbay.com
Cc: akpm@...l.org, clameter@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] SLAB : use a multiply instead of a divide in
obj_to_index()
From: Eric Dumazet <dada1@...mosbay.com>
Date: Mon, 04 Dec 2006 22:34:29 +0100
> On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1 cycle
> for a multiply.
For UltraSPARC I and II (which is what this 200mhz guy probably is),
it's 4 cycle latency for a multiply (32-bit or 64-bit) and 68 cycles
for a 64-bit divide (32-bit divide is 37 cycles).
UltraSPARC-III and IV are worse, 6 cycles for multiply and 40/71
cycles (32/64-bit) for integer divides.
Niagara is even worse :-) 11 cycle integer multiply and a 72 cycle
integer divide (regardless of 32-bit or 64-bit).
(more details in gcc/config/sparc/sparc.c:{ultrasparc,ultrasparc3,niagara}_cost).
So this change has tons of merit for sparc64 chips at least :-)
Also, the multiply can parallelize with other operations but it
seems that integer divide stalls the pipe for most of the duration
of the calculation. So this makes the divide even worse.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists