Message-ID: <9624.1489260500@warthog.procyon.org.uk>
Date: Sat, 11 Mar 2017 19:28:20 +0000
From: David Howells <dhowells@...hat.com>
To: Eric Biggers <ebiggers3@...il.com>
Cc: dhowells@...hat.com, linux-fsdevel@...r.kernel.org,
Al Viro <viro@...iv.linux.org.uk>,
linux-kernel@...r.kernel.org, Eric Biggers <ebiggers@...gle.com>
Subject: Re: [PATCH] statx: optimize copy of struct statx to userspace

Eric Biggers <ebiggers3@...il.com> wrote:
> From: Eric Biggers <ebiggers@...gle.com>
>
> I found that statx() was significantly slower than stat(). As a
> microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> file to the same with statx() passed a NULL path:
>
> $ time ./stat_benchmark
>
> real 0m1.464s
> user 0m0.275s
> sys 0m1.187s
>
> $ time ./statx_benchmark
>
> real 0m5.530s
> user 0m0.281s
> sys 0m5.247s
>
> statx is expected to be a little slower than stat because struct statx
> is larger than struct stat, but not by *that* much. It turns out that
> most of the overhead was in copying struct statx to userspace,
> apparently mostly in all the stac/clac instructions that got generated
> for each __put_user() call. (This was on x86_64, but some other
> architectures, e.g. arm64, have something similar now too.)
>
> stat() instead initializes its struct on the stack and copies it to
> userspace with a single call to copy_to_user(). This turns out to be
> much faster, and changing statx to do this makes it almost as fast as
> stat:
>
> $ time ./statx_benchmark
>
> real 0m1.573s
> user 0m0.229s
> sys 0m1.344s
>
> Signed-off-by: Eric Biggers <ebiggers@...gle.com>

Acked-by: David Howells <dhowells@...hat.com>
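
For anyone wanting to reproduce the comparison quoted above, the following is a minimal sketch of such a microbenchmark. It is not the stat_benchmark or statx_benchmark source, which is not included in this message. The helper names bench_fstat() and bench_statx() are hypothetical, and it assumes a glibc that provides the statx() wrapper (2.28 or later). It also passes an empty path with AT_EMPTY_PATH rather than the NULL path mentioned in the patch description, since that is the form documented in the statx(2) man page.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed(const struct timespec *a, const struct timespec *b)
{
	return (double)(b->tv_sec - a->tv_sec) +
	       (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/*
 * Hypothetical helper: call fstat() on fd 'iters' times.
 * Returns the elapsed wall-clock seconds, or -1.0 if a call fails.
 */
static double bench_fstat(int fd, long iters)
{
	struct stat st;
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++)
		if (fstat(fd, &st) != 0)
			return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return elapsed(&t0, &t1);
}

/*
 * Hypothetical helper: same loop using statx() on the fd itself.
 * Assumption: empty path plus AT_EMPTY_PATH, requesting the basic
 * stats that correspond to what fstat() returns.
 * Returns the elapsed wall-clock seconds, or -1.0 if a call fails
 * (for example on a kernel without statx support).
 */
static double bench_statx(int fd, long iters)
{
	struct statx stx;
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++)
		if (statx(fd, "", AT_EMPTY_PATH, STATX_BASIC_STATS, &stx) != 0)
			return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return elapsed(&t0, &t1);
}
```

To approximate the quoted runs, open a file on a tmpfs mount, call each helper with 10000000 iterations, and compare the returned times. The numbers will depend on the kernel version and on whether the CPU has SMAP enabled, so they should not be expected to match the figures above.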