linux-kernel - Re: [RFC PATCH] fuse: support splice() reading from fuse device

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100520175844.GW25951@kernel.dk>
Date:	Thu, 20 May 2010 19:58:44 +0200
From:	Jens Axboe <axboe@...nel.dk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Miklos Szeredi <miklos@...redi.hu>, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org
Subject: Re: [RFC PATCH] fuse: support splice() reading from fuse device

On Thu, May 20 2010, Linus Torvalds wrote:
> 
> 
> On Thu, 20 May 2010, Miklos Szeredi wrote:
> > 
> > With Jens' pipe growing patch and additional fuse patches it was
> > possible to achieve a 20GBytes/s write throghput on my laptop in a
> > "null" filesystem (no page cache, data goes to /dev/null).
> 
> Btw, I don't think that is a very interesting benchmark.
> 
> The reason I say that is that many man years ago I played with doing 
> zero-copy pipe read/write system calls (no splice, just automatic "follow 
> the page tables, mark things read-only etc" things). It was considered 
> sexy to do things like that during the mid-90's - there were all the crazy 
> ukernel people with Mach etc doing magic things with moving pages around.
> 
> It got me a couple of gigabytes per second back then (when memcpy() speeds 
> were in the tens of megabytes) on benchmarks like lmbench that just wrote 
> the same buffer over and over again without ever touching the data.
> 
> It was totally worthless on _any_ real load. In fact, it made things 
> worse. I never found a single case where it helped.
> 
> So please don't ever benchmark things that don't make sense, and then use 
> the numbers as any kind of reason to do anything. It's worse than 
> worthless. It actually adds negative value to show "look ma, no hands" for 
> things that nobody does. It makes people think it's a good idea, and 
> optimizes the wrong thing entirely.
> 
> Are there actual real loads that get improved? I don't care if it means 
> that the improvement goes from three orders of magnitude to just a couple 
> of percent. The "couple of percent on actual loads" is a lot more 
> important than "many orders of magnitude on a made-up benchmark".

I agree on the basis that these types of benchmarks are fine to validate
the "are there stupid problems in the new code?" question, but not so
much as a comparison for anything.

I can easily run some pure IO benchmarks and send some numbers comparing
64KB vs 1MB pipes on splice. It's been a while since I did that, if I
recall correctly then the biggest issue I ran into back then was beating
on the inode mutex a lot and not so much the actual syscall entry/exit
count being a multiple of comparison tests with read/write using a
larger buffer.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/