Message-ID: <494f31b8e37b44d1a24e28885188f16e@AcuMS.aculab.com>
Date: Tue, 4 May 2021 08:07:33 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Jens Axboe' <axboe@...nel.dk>,
Matthew Wilcox <willy@...radead.org>
CC: "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] eventfd: convert to using ->write_iter()

From: Jens Axboe
> Sent: 03 May 2021 19:05
>
> On 5/3/21 12:02 PM, Matthew Wilcox wrote:
> > On Mon, May 03, 2021 at 11:57:08AM -0600, Jens Axboe wrote:
> >> On 5/3/21 10:12 AM, David Laight wrote:
> >>> From: Jens Axboe
> >>>> Sent: 03 May 2021 15:58
> >>>>
> >>>> Had a report on writing to eventfd with io_uring is slower than it
> >>>> should be, and it's the usual case of if a file type doesn't support
> >>>> ->write_iter(), then io_uring cannot rely on IOCB_NOWAIT being honored
> >>>> alongside O_NONBLOCK for whether or not this is a non-blocking write
> >>>> attempt. That means io_uring will punt the operation to an io thread,
> >>>> which will slow us down unnecessarily.
> >>>>
> >>>> Convert eventfd to using fops->write_iter() instead of fops->write().
> >>>
> >>> Won't this have a measurable performance degradation on normal
> >>> code that does write(event_fd, &one, 8);
> >>
> >> If ->write_iter() or ->read_iter() is much slower than the non-iov
> >> versions, then I think we have generic issues that should be solved.
> >
> > We do!
> >
> > https://lore.kernel.org/linux-fsdevel/20210107151125.GB5270@casper.infradead.org/
> > is one thread on it. There have been others.
>
> But then we really must get that fixed, imho ->read() and ->write()
> should go away, and if the iter variants are 10% slower, then that should
> get fixed up.

I think there are two separate issues.
(Although I've not looked in detail into the really bad cases.)

1) I suspect some of the fs code is using entirely different paths for the
   'single fragment' and 'iter' variants.

2) For trivial drivers the cost of setting up the iov_iter[] and then
   iterating it becomes significant (or at least measurable).

I haven't tried to undo the morass of #defines in the iter code.
But I suspect they could be optimised for the common case of
copying an entire single-fragment to/from userspace in one call.
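
For reference, a minimal userspace sketch of the "trivial driver" case
discussed in this thread: one 8-byte counter update sent to an eventfd,
once with plain write() and once as a single-fragment writev(). The helper
names (signal_plain, signal_vectored, drain_counter) are made up for
illustration and are not from the patch. Which kernel path each call ends
up on depends on the kernel version, so this only shows the two call
shapes, not their relative cost.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/uio.h>
#include <unistd.h>

/* eventfd counters are 8 bytes; shorter writes fail with EINVAL. */
static_assert(sizeof(uint64_t) == 8, "eventfd expects an 8-byte value");

/* Hypothetical helper: add `value` to the counter with plain write(). */
static int signal_plain(int efd, uint64_t value)
{
    ssize_t n = write(efd, &value, sizeof(value));
    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Hypothetical helper: the same update as a one-element writev(). */
static int signal_vectored(int efd, uint64_t value)
{
    struct iovec iov = { .iov_base = &value, .iov_len = sizeof(value) };
    ssize_t n = writev(efd, &iov, 1);
    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Hypothetical helper: read the accumulated counter (this resets it). */
static int drain_counter(int efd, uint64_t *total)
{
    ssize_t n = read(efd, total, sizeof(*total));
    return n == (ssize_t)sizeof(*total) ? 0 : -1;
}
```

On a fresh eventfd(0, 0), calling signal_plain(efd, 1) and then
signal_vectored(efd, 2) should leave a counter of 3 for drain_counter()
to read back.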

Not related to this code path, but I've some patches that give a
few % speedup for writev() to /dev/null.
That is all about copying the iov[] from user - it doesn't get 'iterated'.
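
A rough sketch of the kind of call being measured there, assuming a caller
that already holds a descriptor for /dev/null. The helper name
writev_to_null and the 16-fragment cap are arbitrary choices for
illustration; this does not reproduce the patches mentioned above. It only
shows a writev() where the kernel has to copy in the iov[] array even
though the data itself is discarded.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FRAGMENTS 16 /* arbitrary cap for this sketch */

/*
 * Hypothetical helper: submit `count` fragments, all pointing at the same
 * buffer, in a single writev() call. Returns the byte count reported by
 * the kernel, or -1 on a bad `count` or a failed call.
 */
static ssize_t writev_to_null(int fd, char *buf, size_t len, int count)
{
    struct iovec iov[MAX_FRAGMENTS];

    if (count < 1 || count > MAX_FRAGMENTS)
        return -1;

    for (int i = 0; i < count; i++) {
        iov[i].iov_base = buf;
        iov[i].iov_len = len;
    }
    return writev(fd, iov, count);
}
```

With fd = open("/dev/null", O_WRONLY) and a 64-byte buffer,
writev_to_null(fd, buf, 64, 4) should report 256 bytes written. Timing a
loop of such calls would be one way to see the per-call setup cost.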

David