Message-ID: <2c0942db0803050755u7e17118h923328fb79ee206b@mail.gmail.com>
Date: Wed, 5 Mar 2008 07:55:53 -0800
From: "Ray Lee" <ray-lk@...rabbit.org>
To: "Nick Piggin" <nickpiggin@...oo.com.au>
Cc: "Eric Dumazet" <dada1@...mosbay.com>,
"David Miller" <davem@...emloft.net>, dmantipov@...dex.ru,
linux-kernel@...r.kernel.org
Subject: Re: Are Linux pipes slower than the FreeBSD ones ?
On Wed, Mar 5, 2008 at 7:38 AM, Nick Piggin <nickpiggin@...oo.com.au> wrote:
>
> On Thursday 06 March 2008 01:55, Eric Dumazet wrote:
> > > Nick Piggin wrote:
> > > On Wednesday 05 March 2008 20:47, Eric Dumazet wrote:
> > >> David Miller wrote:
> > >>> From: Antipov Dmitry <dmantipov@...dex.ru>
> > >>> Date: Wed, 05 Mar 2008 10:46:57 +0300
> > >>>
> > >>>> Despite this obvious fact, I recently tried to compare pipe
> > >>>> performance on Linux and FreeBSD systems. Unfortunately, the Linux
> > >>>> results are poor - ~2x slower than FreeBSD. A detailed description
> > >>>> of the test case, preparation, environment and results is located
> > >>>> at http://213.148.29.37/PipeBench, and everyone is welcome to look
> > >>>> at it, reproduce it, criticize, etc.
> > >>>
> > >>> FreeBSD does page flipping into the pipe receiver, so rerun your test
> > >>> case but have either the sender or the receiver make changes to
> > >>> their memory buffer in between the read/write calls.
> > >>>
> > >>> FreeBSD's scheme is only good for benchmarks, rather than real life.
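A quick way to test that, sketched from David's description rather than
taken from the posted benchmark (BUF_SIZE and the loop shape here are
placeholders, not the actual test code):

    /* Variant of the benchmark's writer loop: dirty each page of the
     * buffer before every write(), so a page-flipping receiver can no
     * longer loan the pages out for free. */
    #include <unistd.h>

    #define BUF_SIZE 65536

    static char buf[BUF_SIZE];

    static void writer_loop(int fd, long iterations)
    {
        long i;
        size_t off;

        for (i = 0; i < iterations; i++) {
            for (off = 0; off < BUF_SIZE; off += 4096)
                buf[off]++;     /* touch every page */
            if (write(fd, buf, BUF_SIZE) < 0)
                break;
        }
    }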
> > >>
> > >> Page flipping might explain differences for big transfers, but note the
> > >> difference with small buffers (64, 128, 256, 512 bytes).
> > >>
> > >> I tried the 'pipe' program on a fresh linux-2.6.24.2, on a dual Xeon 5120
> > >> machine, and we can see that four CPUs are used (though only two threads
> > >> are running in this benchmark).
> > >
> > > One thing to try is pinning both processes on the same CPU. This
> > > may be what the FreeBSD scheduler is preferring to do, and it ends
> > > up being really a tradeoff that helps some workloads and hurts
> > > others. With a very unscientific test with an old kernel, the
> > > pipe.c test gets anywhere from about 1.5 to 3 times faster when
> > > running it as taskset 1 ./pipe
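For reference, the pinning can also be done from inside the test
itself; a minimal sketch using sched_setaffinity() (Linux-specific, and
not part of the posted benchmark):

    /* Pin the calling process to CPU 0, equivalent to launching the
     * test under "taskset 1". The affinity mask is inherited across
     * fork(), so call this before spawning the other end of the pipe. */
    #define _GNU_SOURCE
    #include <sched.h>

    static int pin_to_cpu0(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(0, &set);
        return sched_setaffinity(0 /* self */, sizeof(set), &set);
    }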
> > >
> > >> # opreport -l /boot/vmlinux-2.6.24.2 |head -n 30
> > >> CPU: Core 2, speed 1866.8 MHz (estimated)
> > >> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
> > >> unit mask of 0x00 (Unhalted core cycles) count 100000
> > >> samples % symbol name
> > >> 52137 9.3521 kunmap_atomic
> > >
> > > I wonder whether FreeBSD allocates its pipe buffers from
> > > kernel-addressable memory. We could do this to eliminate the cost
> > > completely on highmem systems (whether it is a good idea I don't know;
> > > normally you'd actually do a bit of work between reading from or
> > > writing to a pipe...)
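To make the tradeoff concrete, here is a rough sketch (not the actual
fs/pipe.c code) of the copy into a highmem pipe page versus a
kernel-addressable one:

    #include <linux/highmem.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    /* Roughly today's scheme: pipe pages are GFP_HIGHUSER, so on a
     * highmem config every copy needs a temporary atomic mapping.
     * That's the kmap_atomic/kunmap_atomic pair in the profile above. */
    static void copy_via_highmem(struct page *page, const void *src, size_t len)
    {
        void *dst = kmap_atomic(page, KM_USER0);
        memcpy(dst, src, len);
        kunmap_atomic(dst, KM_USER0);
    }

    /* The alternative: a GFP_KERNEL page is permanently mapped, so
     * page_address() works directly and the map/unmap cost vanishes,
     * at the price of consuming lowmem on 32-bit highmem machines. */
    static void copy_via_lowmem(struct page *page, const void *src, size_t len)
    {
        memcpy(page_address(page), src, len);
    }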
> > >
> > >> 50983 9.1451 mwait_idle_with_hints
> > >> 50448 9.0492 system_call
> > >> 49727 8.9198 task_rq_lock
> > >> 24531 4.4003 pipe_read
> > >> 19820 3.5552 pipe_write
> > >> 16176 2.9016 dnotify_parent
> > >
> > > Just say no to dnotify.
> > >
> > >> 15455 2.7723 file_update_time
> > >
> > > Dumb question: does anyone know why pipe.c calls this?
> >
> > Because the pipe writer's write() syscall ends up calling
> > file_update_time() in the kernel, while the pipe reader's read()
> > syscall ends up calling touch_atime().
>
> Yeah, but why does the pipe inode need to have its times updated?
> I guess there is some reason... hopefully not C&P related.
In principle, so that the reader or writer can find out the last time
the other end did any processing on the pipe. And yeah, for POSIX
compliance: "Upon successful completion, pipe() will mark for update
the st_atime, st_ctime and st_mtime fields of the pipe." But it'd be
nice if there were a way to avoid touching them more than once a second
(note the 'will mark for update' language), or to skip the updates when
the pipe is a physical FIFO on a noatime filesystem.
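For what it's worth, the updates are easy to watch from userspace; a
trivial standalone demo (mine, not from the benchmark):

    /* Every write() marks st_mtime/st_ctime for update on the pipe
     * inode and every read() marks st_atime, per the POSIX wording
     * quoted above. This is where the file_update_time() samples in
     * the profile come from. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/stat.h>

    int main(void)
    {
        int fds[2];
        char c = 'x';
        struct stat st;

        if (pipe(fds) < 0)
            return 1;

        write(fds[1], &c, 1);
        fstat(fds[1], &st);
        printf("after write: mtime=%ld\n", (long)st.st_mtime);

        sleep(2);

        read(fds[0], &c, 1);
        fstat(fds[0], &st);
        printf("after read:  atime=%ld\n", (long)st.st_atime);
        return 0;
    }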