[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1158166a0708091301q1eca67ct2ef3fa81e3878510@mail.gmail.com>
Date: Thu, 9 Aug 2007 21:01:45 +0100
From: "Denis Vlasenko" <vda.linux@...glemail.com>
To: "Arjan van de Ven" <arjan@...radead.org>
Cc: "Roman Zippel" <zippel@...ux-m68k.org>,
"Jonathan Corbet" <corbet@....net>, linux-kernel@...r.kernel.org,
"Thomas Gleixner" <tglx@...utronix.de>, akpm@...ux-foundation.org,
"Ingo Molnar" <mingo@...e.hu>
Subject: Re: [PATCH] msleep() with hrtimers
On 8/9/07, Denis Vlasenko <vda.linux@...glemail.com> wrote:
> On 8/8/07, Arjan van de Ven <arjan@...radead.org> wrote:
> > You keep claiming that hrtimers are so incredibly expensive; but for
> > msleep()... which is mostly called during driver init ... I really don't
> > buy that it's really expensive. We're not doing this a gazilion times
> > per second obviously...
>
> Yes. Optimizing delay or sleep functions for speed is a contradiction
> of terms. IIRC we still optimize udelay for speed, not code size...
> Read it again folks:
>
> We optimize udelay for speed
>
> How fast your udelay do you want to be today?
Just checked. i386 and x86-64 seems to be sane - udelay() and ndelay()
are not inlined.
Several arches are still frantically try to make udelay faster. Many have
the same comment:
/*
* Use only for very small delays ( < 1 msec). Should probably use a
* lookup table, really, as the multiplications take much too long with
* short delays. This is a "reasonable" implementation, though (and the
* first constant multiplications gets optimized away if the delay is
* a constant)
*/
and thus seem to be a cut-n-paste code.
BTW, almost all arched have __const_udelay(N) which obviously
does not delay for N usecs:
#define udelay(n) (__builtin_constant_p(n) ? \
((n) > 20000 ? __bad_udelay() : __const_udelay((n) * 0x10c7ul)) : \
__udelay(n))
Bad name.
Are patches which de-inline udelay and do s/__const_udelay/__const_delay/g
be accepted?
Arches with udelay's still inlined are below. mips is especially big.
frv has totally bogus ndelay().
include/asm-ppc/delay.h
extern __inline__ void __udelay(unsigned int x)
{
unsigned int loops;
__asm__("mulhwu %0,%1,%2" : "=r" (loops) :
"r" (x), "r" (loops_per_jiffy * 226));
__delay(loops);
}
include/asm-parisc/delay.h
static __inline__ void __udelay(unsigned long usecs) {
__cr16_delay(usecs * ((unsigned long)boot_cpu_data.cpu_hz / 1000000UL));
}
include/asm-mips/delay.h
static inline void __udelay(unsigned long usecs, unsigned long lpj)
{
unsigned long lo;
/*
* The rates of 128 is rounded wrongly by the catchall case
* for 64-bit. Excessive precission? Probably ...
*/
#if defined(CONFIG_64BIT) && (HZ == 128)
usecs *= 0x0008637bd05af6c7UL; /* 2**64 / (1000000 / HZ) */
#elif defined(CONFIG_64BIT)
usecs *= (0x8000000000000000UL / (500000 / HZ));
#else /* 32-bit junk follows here */
usecs *= (unsigned long) (((0x8000000000000000ULL / (500000 / HZ)) +
0x80000000ULL) >> 32);
#endif
if (sizeof(long) == 4)
__asm__("multu\t%2, %3"
: "=h" (usecs), "=l" (lo)
: "r" (usecs), "r" (lpj)
: GCC_REG_ACCUM);
else if (sizeof(long) == 8)
__asm__("dmultu\t%2, %3"
: "=h" (usecs), "=l" (lo)
: "r" (usecs), "r" (lpj)
: GCC_REG_ACCUM);
__delay(usecs);
}
include/asm-m68k/delay.h
static inline void __udelay(unsigned long usecs)
{
__const_udelay(usecs * 4295); /* 2**32 / 1000000 */
}
include/asm-h8300/delay.h
static inline void udelay(unsigned long usecs)
{
usecs *= 4295; /* 2**32 / 1000000 */
usecs /= (loops_per_jiffy*HZ);
if (usecs)
__delay(usecs);
}
include/asm-frv/delay.h
static inline void udelay(unsigned long usecs)
{
__delay(usecs * __delay_loops_MHz);
}
#define ndelay(n) udelay((n) * 5)
include/asm-xtensa/delay.h
static __inline__ void udelay (unsigned long usecs)
{
unsigned long start = xtensa_get_ccount();
unsigned long cycles = usecs * (loops_per_jiffy / (1000000UL / HZ));
/* Note: all variables are unsigned (can wrap around)! */
while (((unsigned long)xtensa_get_ccount()) - start < cycles)
;
}
include/asm-v850/delay.h
static inline void udelay(unsigned long usecs)
{
register unsigned long full_loops, part_loops;
full_loops = ((usecs * HZ) / 1000000) * loops_per_jiffy;
usecs %= (1000000 / HZ);
part_loops = (usecs * HZ * loops_per_jiffy) / 1000000;
__delay(full_loops + part_loops);
}
include/asm-cris/delay.h
static inline void udelay(unsigned long usecs)
{
__delay(usecs * loops_per_usec);
}
include/asm-blackfin/delay.h
static inline void udelay(unsigned long usecs)
{
extern unsigned long loops_per_jiffy;
__delay(usecs * loops_per_jiffy / (1000000 / HZ));
}
--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists