linux-kernel - Re: Are Linux pipes slower than the FreeBSD ones ?


Message-ID: <2c0942db0803050755u7e17118h923328fb79ee206b@mail.gmail.com>
Date:	Wed, 5 Mar 2008 07:55:53 -0800
From:	"Ray Lee" <ray-lk@...rabbit.org>
To:	"Nick Piggin" <nickpiggin@...oo.com.au>
Cc:	"Eric Dumazet" <dada1@...mosbay.com>,
	"David Miller" <davem@...emloft.net>, dmantipov@...dex.ru,
	linux-kernel@...r.kernel.org
Subject: Re: Are Linux pipes slower than the FreeBSD ones ?

On Wed, Mar 5, 2008 at 7:38 AM, Nick Piggin <nickpiggin@...oo.com.au> wrote:
>
> On Thursday 06 March 2008 01:55, Eric Dumazet wrote:
>  > Nick Piggin wrote:
>  > > On Wednesday 05 March 2008 20:47, Eric Dumazet wrote:
>  > >> David Miller wrote:
>  > >>> From: Antipov Dmitry <dmantipov@...dex.ru>
>  > >>> Date: Wed, 05 Mar 2008 10:46:57 +0300
>  > >>>
>  > >>>> Despite this obvious fact, I recently tried to compare pipe
>  > >>>> performance on Linux and FreeBSD systems. Unfortunately, the Linux
>  > >>>> results are poor: ~2x slower than FreeBSD. A detailed description
>  > >>>> of the test case, preparation, environment and results is located
>  > >>>> at http://213.148.29.37/PipeBench, and everyone is welcome to look
>  > >>>> at it, reproduce it, criticize it, etc.
>  > >>>
>  > >>> FreeBSD does page flipping into the pipe receiver, so rerun your test
>  > >>> case but have either the sender or the receiver make changes to
>  > >>> their memory buffer in between the read/write calls.
>  > >>>
>  > >>> FreeBSD's scheme is only good for benchmarks, rather than real life.
>  > >>
>  > >> Page flipping might explain the differences for big transfers, but note
>  > >> the difference with small buffers (64, 128, 256, 512 bytes).
>  > >>
>  > >> I tried the 'pipe' prog on a fresh linux-2.6.24.2, on a dual Xeon 5120
>  > >> machine, and we can notice that four cpus are used (but only two threads
>  > >> are running on this benchmark)
>  > >
>  > > One thing to try is pinning both processes on the same CPU. This
>  > > may be what the FreeBSD scheduler is preferring to do, and it ends
>  > > up being really a tradeoff that helps some workloads and hurts
>  > > others. With a very unscientific test with an old kernel, the
>  > > pipe.c test gets anywhere from about 1.5 to 3 times faster when
>  > > running it as taskset 1 ./pipe
>  > >
>  > >> # opreport -l /boot/vmlinux-2.6.24.2 |head -n 30
>  > >> CPU: Core 2, speed 1866.8 MHz (estimated)
>  > >> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
>  > >> unit mask of 0x00 (Unhalted core cycles) count 100000
>  > >> samples  %        symbol name
>  > >> 52137     9.3521  kunmap_atomic
>  > >
>  > > I wonder if FreeBSD doesn't allocate their pipe buffers from kernel
>  > > addressable memory. We could do this to eliminate the cost completely
>  > > on highmem systems (whether it is a good idea I don't know, normally
>  > > you'd actually do a bit of work between reading or writing from a
>  > > pipe...)
>  > >
>  > >> 50983     9.1451  mwait_idle_with_hints
>  > >> 50448     9.0492  system_call
>  > >> 49727     8.9198  task_rq_lock
>  > >> 24531     4.4003  pipe_read
>  > >> 19820     3.5552  pipe_write
>  > >> 16176     2.9016  dnotify_parent
>  > >
>  > > Just say no to dnotify.
>  > >
>  > >> 15455     2.7723  file_update_time
>  > >
>  > > Dumb question: anyone know why pipe.c calls this?
>  >
>  > Because pipe writer calls write() syscall -> file_update_time() in kernel
>  > while pipe reader calls read() syscall -> touch_atime() in kernel
>
>  Yeah, but why does the pipe inode need to have its times updated?
>  I guess there is some reason... hopefully not C&P related.

In principle, so that the reader or writer can find out the last time
the other end did any processing of the pipe. And yeah, for POSIX
compliance: "Upon successful completion, pipe() will mark for update
the st_atime, st_ctime and st_mtime fields of the pipe." But it'd be
nice if there were a way to avoid touching it more than once a second
(note the 'will mark for update' language). Or to skip the update
entirely if the pipe is a physical FIFO on a noatime filesystem?
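
To make the timestamp discussion concrete, here is a minimal sketch of how
one end of a pipe could observe the effect of file_update_time() from user
space, using fstat() on the pipe descriptor. The function name
pipe_mtime_after_write() is hypothetical and not part of the thread or of
any kernel API. Whether st_mtime visibly advances depends on the kernel and
on timestamp granularity, so the sketch only reports the two values and
makes no claim about them.

```c
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create a pipe, record its
 * st_mtime, write one byte, and record st_mtime again.
 * Returns 0 on success, -1 on any error.
 */
int pipe_mtime_after_write(time_t *before, time_t *after)
{
    int fds[2];
    struct stat st;
    int rc = -1;

    if (pipe(fds) != 0)
        return -1;

    /* Timestamp as set when the pipe was created. */
    if (fstat(fds[1], &st) != 0)
        goto out;
    *before = st.st_mtime;

    /* Wait so that a change is visible at one-second resolution. */
    sleep(1);

    /* A successful write() is where the kernel may update the time. */
    if (write(fds[1], "x", 1) != 1)
        goto out;

    if (fstat(fds[1], &st) != 0)
        goto out;
    *after = st.st_mtime;
    rc = 0;

out:
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

A caller would pass two time_t variables and compare them after the call.
A difference between the two values means the write updated the pipe inode's
modification time, which is the per-write cost that shows up as
file_update_time in the profile above.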