Message-ID: <4a649974-39db-83e9-8070-d1b7f3b2a03f@kernel.dk>
Date:   Sat, 10 Dec 2022 19:20:07 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     netdev <netdev@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: [GIT PULL] Add support for epoll min wait time

On 12/10/22 6:58 PM, Jens Axboe wrote:
> On 12/10/22 11:51 AM, Linus Torvalds wrote:
>> On Sat, Dec 10, 2022 at 7:36 AM Jens Axboe <axboe@...nel.dk> wrote:
>>>
>>> This adds an epoll_ctl method for setting the minimum wait time for
>>> retrieving events.
>>
>> So this is something very close to what the TTY layer has had forever,
>> and is useful (well... *was* useful) for pretty much the same reason.
>>
>> However, let's learn from successful past interfaces: the tty layer
>> doesn't have just VTIME, it has VMIN too.
>>
>> And I think they very much go hand in hand: you want to wait for at
>> least VMIN events or for at most VTIME after the last event.
> 
> It has been suggested before too. A more modern example is how IRQ
> coalescing works on e.g. NVMe devices or NICs. Those generally are of
> the nature of "wait for X time, or until Y events are available". We
> can certainly do something like that here too, it's just adding a
> minevents and passing them in together.
> 
> I'll add that, really should be trivial, and resend later in the merge
> window once we're happy with that.

Took a quick look, and it's not that trivial. The problem is that you
have to wake the task to reap events anyway; the event count cannot be
checked at wakeup time. And then you lose the nice benefit of reducing
the context switch rate, which was a good chunk of the win here...

This can obviously very easily be done with io_uring, since that's
how it already works in terms of waiting. The min-wait part was done
separately there, though it hasn't been posted or included upstream yet.

So now we're a bit stuck...

-- 
Jens Axboe