linux-kernel - Re: splice/vmsplice performance test results


Message-ID: <20061123112429.GN4999@kernel.dk>
Date:	Thu, 23 Nov 2006 12:24:29 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Jim Schutt <jaschut@...dia.gov>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: splice/vmsplice performance test results

On Wed, Nov 22 2006, Jim Schutt wrote:
> 
> On Wed, 2006-11-22 at 09:57 +0100, Jens Axboe wrote:
> > On Tue, Nov 21 2006, Jim Schutt wrote:
> [snip]
> > > 
> > > Hmmm.  Is it worth me trying to do some sort of kernel 
> > > profiling to see if there is anything unexpected with 
> > > my setup?  If so, do you have a preference as to what 
> > > I would use?  
> > 
> > Not sure that profiling would be that interesting, as the problem
> > probably lies in where we are _not_ spending the time. But it certainly
> > can't hurt. Try to oprofile the kernel for a 10-20 sec interval while
> > the test is running. Do 3 such runs for the two test cases
> > (write-to-file, vmsplice/splice-to-file).
> > 
> 
> OK, I've attached results for 20 second profiles of three
> runs of each test: read-from-socket + write-to-file, and
> read-from-socket + vmsplice/splice-to-file.
> 
> The test case and throughput is in the name: e.g. rvs-1-306MBps
> is trial 1 of read/vmsplice/splice case, which ran at 306 MB/s.
> 
> Let me know if I can help with more testing, and thanks
> again for looking into this.

As I suspected, nothing sticks out in these logs, since the problem here
is not due to a maxed out system. The traces look fairly identical, the
main difference being less time spent in copy_user with the splice
approach.

Comparing the generic_file_buffered_write() and splice-to-file paths,
there really isn't a whole lot of difference. It would be interesting to
try and eliminate some of the differences between the two approaches.
Could you try changing the vmsplice to a write-to-pipe instead, and
add SPLICE_F_NONBLOCK to the splice-to-file as well? Basically I'm
interested in something that only really tests splice-to-file vs
write-to-file. It's perhaps easier if you just run fio for that, so I'm
inlining a job file that tests it specifically.

; -- start job file

[global]
bs=64k
rw=write
overwrite=0
size=16g
end_fsync=1
direct=0
unlink

[write]
ioengine=sync

[splice]
stonewall
ioengine=splice

; -- end job file

You can grab a fio snapshot here:

http://brick.kernel.dk/snaps/fio-git-20061123122325.tar.gz

You probably want to run that a few times to see how stable the results
are. Buffered io is always a little problematic from a consistency point
of view in benchmark results.
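
For the hand-rolled variant suggested above (write-to-pipe in place of
vmsplice, plus SPLICE_F_NONBLOCK on the splice-to-file side), a rough
sketch could look like the following. The helper name, the fill pattern
and the error handling are made up for illustration and are not taken
from the actual test program. The pipe's write end is set O_NONBLOCK
here only so that a block larger than the pipe capacity results in a
short write rather than a stall.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for splice() and SPLICE_F_* */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send `total` bytes of a fill pattern to out_fd by
 * write()ing each block into a pipe and then splice()ing the pipe into
 * the file. Returns the number of bytes spliced, or -1 on error.
 * out_fd must not be opened with O_APPEND.
 */
static ssize_t write_pipe_splice_file(int out_fd, size_t total, size_t bs)
{
    int pfd[2];
    char *buf;
    size_t done = 0;
    int flags;

    assert(bs > 0);
    if (pipe(pfd) < 0)
        return -1;

    /* Non-blocking write end: oversized blocks become short writes. */
    flags = fcntl(pfd[1], F_GETFL);
    if (flags < 0 || fcntl(pfd[1], F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail_pipe;

    buf = malloc(bs);
    if (!buf)
        goto fail_pipe;
    memset(buf, 0x5a, bs);

    while (done < total) {
        size_t want = total - done < bs ? total - done : bs;

        /* Step 1: plain write() into the pipe (this replaces vmsplice). */
        ssize_t in_pipe = write(pfd[1], buf, want);
        if (in_pipe < 0) {
            if (errno == EINTR)
                continue;
            goto fail_buf;
        }

        /* Step 2: drain the pipe into the file with SPLICE_F_NONBLOCK. */
        while (in_pipe > 0) {
            ssize_t n = splice(pfd[0], NULL, out_fd, NULL,
                               (size_t)in_pipe, SPLICE_F_NONBLOCK);
            if (n < 0 && (errno == EAGAIN || errno == EINTR))
                continue;
            if (n <= 0)
                goto fail_buf;
            in_pipe -= n;
            done += (size_t)n;
        }
    }

    free(buf);
    close(pfd[0]);
    close(pfd[1]);
    return (ssize_t)done;

fail_buf:
    free(buf);
fail_pipe:
    close(pfd[0]);
    close(pfd[1]);
    return -1;
}
```

Calling it with an fd from open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)
and bs set to 64k would roughly match the block size in the job file
above. Note that SPLICE_F_NONBLOCK only makes the pipe side non-blocking;
the splice can still block on the file itself.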

-- 
Jens Axboe
