Message-ID: <c5ea733d-b766-041b-30b9-a9a9b5167462@scylladb.com>
Date: Wed, 19 Feb 2020 12:43:51 +0200
From: Avi Kivity <avi@...lladb.com>
To: Stefan Hajnoczi <stefanha@...il.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
linux-fsdevel@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org,
Davide Libenzi <davidel@...ilserver.org>,
Alexander Viro <viro@...iv.linux.org.uk>
Subject: Re: [RFC] eventfd: add EFD_AUTORESET flag
On 19/02/2020 12.37, Stefan Hajnoczi wrote:
> On Wed, Feb 12, 2020 at 12:54:30PM +0200, Avi Kivity wrote:
>> On 12/02/2020 12.47, Paolo Bonzini wrote:
>>> On 12/02/20 11:29, Stefan Hajnoczi wrote:
>>>> On Wed, Feb 12, 2020 at 09:31:32AM +0100, Paolo Bonzini wrote:
>>>>> On 29/01/20 18:20, Stefan Hajnoczi wrote:
>>>>>> + /* Semaphore semantics don't make sense when autoreset is enabled */
>>>>>> + if ((flags & EFD_SEMAPHORE) && (flags & EFD_AUTORESET))
>>>>>> + return -EINVAL;
>>>>>> +
>>>>> I think they do, you just want to subtract 1 instead of setting the
>>>>> count to 0. This way, writing 1 would be the post operation on the
>>>>> semaphore, while poll() would be the wait operation.
>>>> True! Then EFD_AUTORESET is not a fitting name. EFD_AUTOREAD or
>>>> EFD_POLL_READS?
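To make the post/wait mapping above concrete, here is a minimal sketch
that uses only the existing EFD_SEMAPHORE flag. It does not assume the
flag proposed in this RFC exists, so the wait side still needs an
explicit read() after poll(). The helper names (sem_open_efd,
sem_post_efd, sem_wait_efd) are made up for illustration.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd with semaphore semantics. */
static int sem_open_efd(unsigned int initial)
{
    return eventfd(initial, EFD_SEMAPHORE | EFD_NONBLOCK);
}

/* Post: add 1 to the counter. Returns 0 on success, -1 on error. */
static int sem_post_efd(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Wait: poll() until the counter is non-zero, then read() to take one
 * unit. With EFD_SEMAPHORE, read() decrements the counter by 1 instead
 * of resetting it to 0.
 * Returns 1 if a unit was taken, 0 on timeout, -1 on error.
 */
static int sem_wait_efd(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t val;
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    if (read(efd, &val, sizeof(val)) != (ssize_t)sizeof(val))
        return -1;
    return 1;
}
```

Two posts followed by three waits with a zero timeout should give two
successful waits and then a timeout. The idea discussed above would
fold the read() into the poll step.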
>>> Avi's suggestion also makes sense. Switching the event loop from poll()
>>> to IORING_OP_POLL_ADD would be good on its own, and then you could make
>>> it use IORING_OP_READV for eventfds.
>>>
>>> In QEMU parlance, perhaps you need a different abstraction than
>>> EventNotifier (let's call it WakeupNotifier) which would also use
>>> eventfd but it would provide a smaller API. Thanks to the smaller API,
>>> it would not need EFD_NONBLOCK, unlike the regular EventNotifier, and it
>>> could either set up a poll() handler calling read(), or use
>>> IORING_OP_READV when io_uring is in use.
>>>
>> Just to be clear, for best performance don't use IORING_OP_POLL_ADD, just
>> IORING_OP_READ. That's what you say in the second paragraph but the first
>> can be misleading.
Actually, it turns out that the current io_uring IORING_OP_READ
implementation punts the work to a workqueue. Jens is fixing that now.
> Thanks, that's a nice idea! I already have experimental io_uring fd
> monitoring code written for QEMU and will extend it to use IORING_OP_READ.
Note that linux-aio can also do this with IOCB_CMD_POLL, starting with
Linux 4.19.
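For reference, a rough sketch of what that looks like with raw
syscalls (no libaio). The function name aio_poll_in is made up, and
the whole thing assumes kernel headers that define IOCB_CMD_POLL. It
only reports readiness; the caller still has to read() the fd.

```c
#include <linux/aio_abi.h>
#include <poll.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Submit one IOCB_CMD_POLL request for POLLIN on fd and block until it
 * completes. Returns the ready event mask, or -1 if any syscall fails
 * (for example on a kernel without IOCB_CMD_POLL).
 */
static int aio_poll_in(int fd)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    int ret = -1;

    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        return -1;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_POLL;
    cb.aio_buf = POLLIN;        /* requested events go in aio_buf */

    if (syscall(SYS_io_submit, ctx, 1, list) == 1 &&
        syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) == 1)
        ret = (int)ev.res;      /* resulting poll mask */

    syscall(SYS_io_destroy, ctx);
    return ret;
}
```

The request is one-shot, so an event loop would resubmit it after each
completion. A real loop would also keep one context around rather than
setting one up per call as this sketch does.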