[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <201109300858.28633.kernel@kolivas.org>
Date: Fri, 30 Sep 2011 08:58:28 +1000
From: Con Kolivas <kernel@...ivas.org>
To: Willy Tarreau <w@....eu>
Cc: Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org
Subject: Re: BFS cpu scheduler and skip list implementation
On Fri, 30 Sep 2011 06:45:05 Willy Tarreau wrote:
> Hi Con,
>
> On Sat, Sep 24, 2011 at 06:38:06PM +1000, Con Kolivas wrote:
> > That's great then. I'm sure we'd explode in other weird and wonderful
> > ways before the CPU load ever got to 64k. Plus all that would happen is
> > that it would start degenerating from O(log n) insertion to O(n) as the
> > number way surpassed 64k. The number 16 for levels was simply chosen as
> > the one originally used by William Pugh in his sample code, but seems to
> > be ample for this application.
>
> If you're interested, during the early CFS benchmarks a few years ago, I
> reworked my old binary tree code to make it kernel-compatible. By this, I
> mean that it never needs to allocate memory, it's used just like rbtrees
> or kernel lists, by having a node in your structure and inserting it from
> the root of a tree. It offers the following features :
> - O(log(N)) insertion/lookup
> - O(1) removal
> - O(1) next/prev walk
> - 20 bytes per node on 32-bit pointers, 36-bytes on 64-bit pointers, plus
> the key
> - supports unique or multiple occurrences of the same key (walked in
> insertion order and classed in trees)
> - supports 32/64 bit signed/unsigned integers, strings and memory blocks
> - supports prefixes (eg. to insert IP addresses with masks)
> - supports lookup of greater than or equal to, less than or equal to.
>
> I replaced the rbtree that was used in haproxy's scheduler with this new
> code and measured a noticeable performance improvement, since haproxy
> does insert/next/remove a lot of times a second, and not having to balance
> a tree saves a huge number of cycles.
>
> I remember having conducted some tests on CFS a log time ago with it, but
> the only cases where I observed a gain was when running insane amounts of
> tasks at insane context switching rates, which was biased and irrealistic.
> So I stopped trying to put it into the kernel at that time.
>
> Maybe for your usage it might bring some value. Take a look at the code
> here if you want, it's not too much documented but still enough to start
> with it. There are a few examples that can help get it right. I know that
> a few people use it for their own projects and they did not ask for help
> :-)
>
> http://git.1wt.eu/web?p=ebtree.git;a=summary
Thanks Willy that's most interesting.
In fact after I fixed a few bugs in the original implementation I posted here,
and cleaned up the code to not do dynamic memory allocation in the hot
scheduler path and so on... I found it (my skip list implementation) to be
universally slower than the existing BFS approach of simple linked lists and
walking the list for lookup. Any performance advantage in the original version
was coincidence due to the way it performed on one workload and was bad in
others. Rather disappointing but you can read my thoughts here
http://ck-hack.blogspot.com/
Anyway I'll make sure to read your code and see if it has anything to offer
me. Thanks for the pointer.
Regards,
Con
--
-ck
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists