Message-Id: <201008060558.59019.vda.linux@googlemail.com>
Date: Fri, 6 Aug 2010 05:58:58 +0200
From: Denys Vlasenko <vda.linux@...glemail.com>
To: Michal Nazarewicz <mina86@...a86.com>
Cc: linux-kernel@...r.kernel.org, m.nazarewicz@...sung.com,
"Douglas W. Jones" <jones@...uiowa.edu>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/3] lib: vsprintf: optimised put_dec_trunc() and put_dec_full()
On Friday 06 August 2010 00:38, Michal Nazarewicz wrote:
> The put_dec_trunc() and put_dec_full() functions were based on
> a code optimised for processors with 8-bit ALU but even then
> they failed to satisfy the same constraints
"Failed"? Interesting wording. Yes, the code won't map easily
onto an 8-bit ALU, for the simple reason that the Linux kernel
does not support any 8-bit CPUs, and by going to wider registers
I was able to process 5 decimal digits at once, not 4.
It was done deliberately. It is not a "failure".
Your code isn't 8-bit ALU optimized either.
Do you think a bit of smearing of the previous code
would help yours to be accepted?
> and in fact
> required at least 16-bit ALU (because at least one number they
> operate in can take 9 bits).
Yes, as explained above.
> This version of those functions proposed by this patch goes
> further and uses the full capacity of a 32-bit ALU and instead
> of splitting the number into nibbles and operating on them it
> performs the obvious algorithm for base conversion expect it
> uses optimised code for dividing by ten (ie. no division is
> actually performed).
(1) "expect" is a typo for "except".
(2) No, _this_ patch does not eliminate division. The next one does.
Move this part of the changelog to the next patch, where it belongs.
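
For reference, the "obvious algorithm for base conversion" mentioned in
the changelog can be sketched as below. This is only an illustration
with a real division still in place; put_dec_sketch() is a hypothetical
name, not a function from the patch, and emitting the least significant
digit first is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the patch's code: emit the decimal digits
 * of num into buf, least significant digit first, and return a pointer
 * just past the last digit written. The caller must provide room for
 * at least 10 characters (the maximum for a 32-bit value).
 */
static char *put_dec_sketch(char *buf, uint32_t num)
{
	assert(buf);
	do {
		uint32_t q = num / 10;	/* the division a later patch would replace */

		*buf++ = (char)('0' + (num - q * 10));	/* remainder without '%' */
		num = q;
	} while (num);
	return buf;
}
```

The caller has to reverse the digits, or fill its output from the end,
to get the usual most-significant-first order.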
> + * Decimal conversion is by far the most typical, and is used for
> + * /proc and /sys data. This directly impacts e.g. top performance
> + * with many processes running.
> + *
> + * We optimize it for speed using ideas described at
> + * <http://www.cs.uiowa.edu/~jones/bcd/divide.html>.
Do you have the author's permission to do it?
Document it in the comment please.
> + * '(num * 0xcccd) >> 19' is an approximation of 'num / 10' that gives
> + * correct results for num < 81920. Because of this, we check at the
> + * beginning if we are dealing with a number that may cause trouble
> + * and if so, we make it smaller.
This comment needs to be moved to the code line where the operation
is performed.
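
To make the quoted bound concrete, here is a small standalone check of
the approximation, assuming 32-bit unsigned arithmetic. The names
div10_approx() and first_mismatch() are hypothetical helpers for this
illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: approximate num / 10 with a multiply and shift.
 * 0xcccd is 52429, and 52429 * 10 = 2^19 + 2, so the multiplier is
 * slightly larger than 2^19 / 10.
 */
static uint32_t div10_approx(uint32_t num)
{
	assert(num < 81920);	/* 81920 * 0xcccd no longer fits in 32 bits */
	return (num * 0xcccd) >> 19;
}

/*
 * Hypothetical helper: return the first num in [0, limit) for which the
 * multiply-and-shift result differs from num / 10, or limit if there is
 * no such num.
 */
static uint32_t first_mismatch(uint32_t limit)
{
	uint32_t num;

	for (num = 0; num < limit; num++)
		if (((num * 0xcccd) >> 19) != num / 10)
			return num;
	return limit;
}
```

In this sketch, first_mismatch(81920) finds no mismatch, while extending
the range by one hits num = 81920, where the 32-bit product wraps around.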
> + * (As a minor note, all operands are always 16 bit so this function
> + * should work well on hardware that cannot multiply 32 bit numbers).
> + *
> + * (Previous a code based on
English is a bit broken in the line above.
--
vda