Message-ID: <f3878dfd-67f7-9a01-8dcf-7202bf5f3918@kernel.dk>
Date: Fri, 20 May 2022 09:32:34 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Al Viro <viro@...iv.linux.org.uk>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: gregkh@...uxfoundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] char/mem: only use {read,write}_iter, not the old
{read,write} functions
On 5/20/22 9:11 AM, Jens Axboe wrote:
> On 5/20/22 9:09 AM, Al Viro wrote:
>> On Fri, May 20, 2022 at 03:50:30PM +0200, Jason A. Donenfeld wrote:
>>> Currently mem.c implements both the {read,write}_iter functions and the
>>> {read,write} functions. But with {read,write} going away at some point
>>> in the future,
>>
>> Not likely to happen, unfortunately.
>>
>>> and most kernel code made to prefer {read,write}_iter,
>>> there's no point in keeping around the old code.
>>
>> Profile and you'll see ;-/
>
> Weren't you working on bits to get us to performance parity there?
> What's the status of that?
Totally unscientific test on the current kernel, running:
dd if=/dev/zero of=/dev/null bs=4k status=progress
With the current tree, I get 8.8GB/sec, and if I drop fops->read() for
/dev/zero, then I get 8.6GB/sec. That's about 2%, which isn't nothing, but
it's also not a huge loss for moving us in the right direction.
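
For anyone who wants to repeat the measurement without dd, here is a minimal userspace sketch of the same loop, assuming a Linux box with /dev/zero and /dev/null. The names copy_blocks(), gb_per_sec() and struct copy_result are made up for illustration and exist neither in dd nor in the kernel tree. Each read(2) goes through vfs_read(), which calls fops->read() when the driver provides it and otherwise falls back to new_sync_read() on top of ->read_iter(), so running this on both kernels should exercise the two paths being compared.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Outcome of one run: bytes copied and elapsed wall-clock seconds. */
struct copy_result {
	uint64_t bytes;
	double seconds;
};

/*
 * Copy up to `count` blocks of `block_size` bytes from `src` to `dst` with
 * plain read(2)/write(2). Returns 0 on success, -1 on any error.
 * Hypothetical helper for illustration, not part of dd or the kernel.
 */
int copy_blocks(const char *src, const char *dst, size_t block_size,
		uint64_t count, struct copy_result *res)
{
	struct timespec start, end;
	char *buf;
	int in = -1, out = -1, ret = -1;

	assert(block_size > 0 && res != NULL);
	res->bytes = 0;
	res->seconds = 0.0;

	buf = malloc(block_size);
	if (!buf)
		return -1;
	in = open(src, O_RDONLY);
	out = open(dst, O_WRONLY);
	if (in < 0 || out < 0)
		goto done;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (uint64_t i = 0; i < count; i++) {
		ssize_t n = read(in, buf, block_size);

		if (n < 0)
			goto done;
		if (n == 0)	/* EOF, which /dev/zero never reports */
			break;
		if (write(out, buf, (size_t)n) != n)
			goto done;
		res->bytes += (uint64_t)n;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	res->seconds = (double)(end.tv_sec - start.tv_sec) +
		       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
	ret = 0;
done:
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	free(buf);
	return ret;
}

/* Throughput in GB/sec (10^9 bytes), or 0 if the run was too short to time. */
double gb_per_sec(const struct copy_result *res)
{
	return res->seconds > 0.0 ?
		(double)res->bytes / res->seconds / 1e9 : 0.0;
}
```

Calling copy_blocks("/dev/zero", "/dev/null", 4096, 2000000, &res) and printing gb_per_sec(&res) on each kernel gives a number comparable to what dd reports with bs=4k. It is just as unscientific as the dd run, so repeat it a few times before reading anything into a small difference.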
Looking at a perf diff, it's mostly:
+0.34% [kernel.kallsyms] [k] new_sync_read
+0.33% [kernel.kallsyms] [k] init_sync_kiocb
+0.07% [kernel.kallsyms] [k] iov_iter_init
+0.80% [kernel.kallsyms] [k] iov_iter_zero
with these gone after the switch to ->read_iter():
0.63% [kernel.kallsyms] [k] read_zero
0.13% [kernel.kallsyms] [k] __clear_user
Didn't look closer, but I'm assuming this is _mostly_ tied to needing to
init 48 bytes of kiocb for each one. There might be ways to embed a
sync_kiocb inside the kiocb for the bits we need there, at least that
could get us down to 32 bytes.
> It really is an unfortunate situation we're currently in with two
> methods for either read or write, with one being greatly preferred as we
> can pass in non-file associated state (like IOCB_NOWAIT, etc) but the
> older variant being a bit faster. It leaves us in a bad place, imho.
And splice etc, for example...
--
Jens Axboe