Message-ID: <20120924112712.29480.qmail@science.horizon.com>
Date: 24 Sep 2012 07:27:12 -0400
From: "George Spelvin" <linux@...izon.com>
To: linux@...izon.com, vda.linux@...glemail.com
Cc: hughd@...gle.com, linux-kernel@...r.kernel.org, mina86@...a86.com
Subject: Re: [PATCH 1/4] lib: vsprintf: Optimize division by 10 for small integers.
>> +/* See comment in put_dec_full9 for choice of constants */
>>  static noinline_for_stack
>>  char *put_dec_full4(char *buf, unsigned q)
>>  {
>>  	unsigned r;
>> -	r = (q * 0xcccd) >> 19;
>> +	r = (q * 0xccd) >> 15;
>>  	*buf++ = (q - 10 * r) + '0';
>> -	q = (r * 0x199a) >> 16;
>> +	q = (r * 0xcd) >> 11;
> I would use 16-bit shifts instead of smaller ones.
> There may be CPUs on which "get upper half of 32-bit reg"
> operation is cheaper or smaller than a shift.

Good point, but wouldn't those CPUs *also* have a multi-cycle multiply,
or have to synthesize it out of shift-and-add, in which case smaller
constants would save even more cycles?

I'm thinking of the original MC68010 here, and I'm not sure that's even
relevant any more.  ColdFire has single-cycle shifts.

Can you think of a processor where that would actually be
an improvement?
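
For reference, both claims are easy to check by brute force. The
following is only a sketch, not part of the patch: div10_ok(),
set_bits() and check_patch_constants() are hypothetical helper names,
and it assumes a 32-bit unsigned. It checks that each multiply-and-shift
pair matches x / 10 over the range put_dec_full4 needs, and counts the
set bits in each constant as a rough stand-in for the cost of a
shift-and-add multiply.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns 1 if (x * mul) >> shift == x / 10 for
 * every x in [0, limit), else 0.  Assumes 32-bit unsigned; the caller
 * must keep (limit - 1) * mul below 2^32 so the product cannot wrap.
 */
static int div10_ok(unsigned mul, unsigned shift, unsigned limit)
{
	unsigned x;

	for (x = 0; x < limit; x++)
		if (((x * mul) >> shift) != x / 10)
			return 0;
	return 1;
}

/*
 * Hypothetical helper: number of 1 bits in v.  A shift-and-add
 * multiply needs roughly one add per set bit of the constant.
 */
static unsigned set_bits(unsigned v)
{
	unsigned n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Check the old and new constants over the ranges used in the patch. */
static void check_patch_constants(void)
{
	/* First step: q is at most 9999. */
	assert(div10_ok(0xcccd, 19, 10000));
	assert(div10_ok(0xccd, 15, 10000));
	/* Second step: r is at most 999. */
	assert(div10_ok(0x199a, 16, 1000));
	assert(div10_ok(0xcd, 11, 1000));
}
```

Under those assumptions the new constants are exact over the ranges
needed, and they have 7 and 5 set bits against 9 and 7 for the old
ones.  Note that 0xcd >> 11 stops being exact at x = 1029, so it is
only safe because r never exceeds 999 here.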