Message-Id: <20070531124129.31c14ddd.dada1@cosmosbay.com>
Date: Thu, 31 May 2007 12:41:29 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Davide Libenzi <davidel@...ilserver.org>,
Ulrich Drepper <drepper@...hat.com>,
Jeff Garzik <jeff@...zik.org>,
Zach Brown <zach.brown@...cle.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@....com.au>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Evgeniy Polyakov <johnpol@....mipt.ru>,
"David S. Miller" <davem@...emloft.net>,
Suparna Bhattacharya <suparna@...ibm.com>,
Jens Axboe <jens.axboe@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Syslets, Threadlets, generic AIO support, v6
On Thu, 31 May 2007 11:02:52 +0200
Ingo Molnar <mingo@...e.hu> wrote:
>
> * Ingo Molnar <mingo@...e.hu> wrote:
>
> > it's both a flexibility and a speedup thing:
> >
> > flexibility: the need for libraries to be able to open files and keep
> > them open comes up regularly. For example, glibc is currently quite
> > wasteful in a number of common networking-related functions (Ulrich,
> > please correct me if i'm wrong), which could be optimized if glibc
> > could just keep a netlink channel fd open, poll() it for changes, and
> > cache the results if there are none (or something like that).
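
[ For illustration, a minimal sketch of the pattern described above.
  The NETLINK_ROUTE socket, the multicast groups and the zero-timeout
  poll() are assumptions of this sketch; glibc's actual internals may
  well differ: ]

#include <poll.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

static int nl_fd = -1;

/* returns nonzero if cached network-config results should be refreshed */
static int cache_is_stale(void)
{
        struct pollfd pfd = { .events = POLLIN };

        if (nl_fd < 0) {
                struct sockaddr_nl sa = {
                        .nl_family = AF_NETLINK,
                        .nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR,
                };
                nl_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
                bind(nl_fd, (struct sockaddr *)&sa, sizeof(sa));
                return 1;       /* nothing cached yet */
        }
        pfd.fd = nl_fd;
        /* zero timeout: a pending message means the config changed
           (a real implementation would also drain and parse it) */
        return poll(&pfd, 1, 0) > 0;
}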
> >
> > speedup: i suggested O_ANY 6 years ago as a speedup to Apache -
> > non-linear fds are cheaper to allocate/map:
> >
> > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg23820.html
> >
> > (i definitely remember having written code for that too, but i cannot
> > find that in the archives. hm.) In theory we could avoid _all_
> > fd-bitmap overhead as well and use a per-process list/pool of struct
> > file buffers plus a maximum-fd field as the 'non-linear fd allocator'
> > (at the price of only deallocating them at process exit time).
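
[ Purely to make that idea concrete -- every name below is hypothetical,
  this is not proposed kernel code: ]

/* per-process pool of struct file slots: slots get reused, but are
   only deallocated at process exit time, so no fd bitmap is needed */
struct nonlinear_fd_pool {
        struct file **slots;        /* slot i backs non-linear fd i */
        int           max_fd;       /* high-water mark of the pool  */
        int          *recycled;     /* stack of freed slot indexes  */
        int           recycled_top; /* -1 when the stack is empty   */
};

/* allocation: pop a recycled slot if one exists, else take max_fd++ --
   an O(1) operation with no lowest-fd continuity guarantee */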
>
> to measure this i've written fd-scale-bench.c:
>
> http://redhat.com/~mingo/fd-scale-patches/fd-scale-bench.c
>
> which tests the (cache-hot or cache-cold) cost of open()-ing two fds
> while there are N other fds already open: one from the 'middle' of the
> range, one from the end of it.
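
[ A rough reconstruction of what such a timing loop can look like; the
  real fd-scale-bench.c is at the URL above, this is only a sketch of
  the idea: ]

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

static double now_us(void)
{
        struct timeval tv;

        gettimeofday(&tv, NULL);
        return tv.tv_sec * 1e6 + tv.tv_usec;
}

int main(int argc, char **argv)
{
        int n = argc > 1 ? atoi(argv[1]) : 1000;
        int i, fd;
        double t0;

        /* note: 'ulimit -n' must be raised above n for this to work */
        for (i = 0; i < n; i++)
                if (open("/dev/null", O_RDONLY) < 0) {
                        perror("open");
                        return 1;
                }

        t0 = now_us();
        fd = open("/dev/null", O_RDONLY);       /* lands at the end of the range */
        printf("num_fds: %d, open() cost: %.2f us\n", n, now_us() - t0);
        close(fd);
        return 0;
}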
>
> Let's check our current 'extreme high end' performance with 1 million
> fds (which is not realistic right now, but there certainly are systems
> with over a hundred thousand open fds). Results are from a fast CPU
> with 2MB of cache:
>
> cache-hot:
>
> # ./fd-scale-bench 1000000 0
> checking the cache-hot performance of open()-ing 1000000 fds.
> num_fds: 1, best cost: 1.40 us, worst cost: 2.00 us
> num_fds: 2, best cost: 1.40 us, worst cost: 1.40 us
> num_fds: 3, best cost: 1.40 us, worst cost: 2.00 us
> num_fds: 4, best cost: 1.40 us, worst cost: 1.40 us
> ...
> num_fds: 77117, best cost: 1.60 us, worst cost: 2.00 us
> num_fds: 96397, best cost: 2.00 us, worst cost: 2.20 us
> num_fds: 120497, best cost: 2.20 us, worst cost: 2.40 us
> num_fds: 150622, best cost: 2.20 us, worst cost: 3.00 us
> num_fds: 188278, best cost: 2.60 us, worst cost: 3.00 us
> num_fds: 235348, best cost: 2.80 us, worst cost: 3.80 us
> num_fds: 294186, best cost: 3.40 us, worst cost: 4.20 us
> num_fds: 367733, best cost: 4.00 us, worst cost: 5.00 us
> num_fds: 459667, best cost: 4.60 us, worst cost: 6.00 us
> num_fds: 574584, best cost: 5.60 us, worst cost: 8.20 us
> num_fds: 718231, best cost: 6.40 us, worst cost: 10.00 us
> num_fds: 897789, best cost: 7.60 us, worst cost: 11.80 us
> num_fds: 1000000, best cost: 8.20 us, worst cost: 9.60 us
>
> cache-cold:
>
> # ./fd-scale-bench 1000000 1
> checking the performance of open()-ing 1000000 fds.
> num_fds: 1, best cost: 4.60 us, worst cost: 7.00 us
> num_fds: 2, best cost: 5.00 us, worst cost: 6.60 us
> ...
> num_fds: 77117, best cost: 5.60 us, worst cost: 7.40 us
> num_fds: 96397, best cost: 5.60 us, worst cost: 7.40 us
> num_fds: 120497, best cost: 6.20 us, worst cost: 6.80 us
> num_fds: 150622, best cost: 6.40 us, worst cost: 7.60 us
> num_fds: 188278, best cost: 6.80 us, worst cost: 9.20 us
> num_fds: 235348, best cost: 7.20 us, worst cost: 8.80 us
> num_fds: 294186, best cost: 8.00 us, worst cost: 9.40 us
> num_fds: 367733, best cost: 8.80 us, worst cost: 11.60 us
> num_fds: 459667, best cost: 9.20 us, worst cost: 12.20 us
> num_fds: 574584, best cost: 10.00 us, worst cost: 12.40 us
> num_fds: 718231, best cost: 11.00 us, worst cost: 13.40 us
> num_fds: 897789, best cost: 12.80 us, worst cost: 15.80 us
> num_fds: 1000000, best cost: 13.60 us, worst cost: 15.40 us
>
> we are pretty good at the moment: the open() cost starts to increase
> at around 100K open fds, both in the cache-cold and the cache-hot case
> (that roughly corresponds to the fd bitmap falling out of the 32K L1
> cache). At 1 million fds open in a single process, the fd bitmap has a
> size of 128K.
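
[ The arithmetic behind that 128K figure: the bitmap needs one bit per
  fd, so 1,000,000 fds take 1000000 / 8 = 125000 bytes (~122 KiB),
  which is the quoted 128K after rounding up to a power of two -- four
  times the 32K L1 cache mentioned above. ]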
>
> so while it's certainly not 'urgent' to improve this, private fds are
> an easier target for optimizations in this area: they don't have the
> continuity requirement anymore, so the fd bitmap is not a 'forced'
> property of them.
Your numbers do not match mine (mine were more than two years old, so I
redid a test before replying).
I tried your bench and found two problems:
- You scan only half of the bitmap.
- You incorrectly divide best_delta and worst_delta by LOOPS (5).
Try closing not a 'middle' fd but a really low one (10, for example),
and the latency doubles.
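
The underlying constraint is the POSIX rule that open() must return the
lowest-numbered free descriptor, which is what forces the kernel to
walk the (possibly cache-cold) fd bitmap in the first place. A small
stand-alone demo of that rule (not part of the bench):

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        int i, fd;

        for (i = 0; i < 1000; i++)
                open("/dev/null", O_RDONLY);
        close(10);                      /* punch a hole near the start */
        fd = open("/dev/null", O_RDONLY);
        assert(fd == 10);               /* lowest-free-fd rule: hole reused */
        return 0;
}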
With a corrected bench, cache-cold numbers are > 100 us on this Intel
Pentium-M:
num_fds: 1000000, best cost: 120.00 us, worst cost: 131.00 us
On an Opteron x86_64 machine, the results are better :)
num_fds: 1000000, best cost: 28.00 us, worst cost: 106.00 us