Message-ID: <a2a88f4f-d104-f565-4d6e-1dddc7f79a05@kernel.dk>
Date:   Fri, 31 May 2019 08:48:55 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Roman Penyaev <rpenyaev@...e.de>
Cc:     Azat Khuzhin <azat@...event.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 00/13] epoll: support pollable epoll from userspace

On 5/16/19 2:57 AM, Roman Penyaev wrote:
> Hi all,
> 
> This is v3 which introduces pollable epoll from userspace.
> 
> v3:
>   - Measurements made, represented below.
> 
>   - Fix alignment for epoll_uitem structure on all 64-bit archs except
>     x86-64. epoll_uitem should always be 16 bytes, proper BUILD_BUG_ON
>     is added. (Linus)
> 
>   - Check pollflags explicitly for 0 inside the work callback, and do
>     nothing if 0.
> 
> v2:
>   - No reallocations, the max number of items (thus size of the user ring)
>     is specified by the caller.
> 
>   - Interface is simplified: -ENOSPC is returned on an attempt to add a
>     new epoll item once the number of items has reached the max, nothing
>     more.
> 
>   - Alloced pages are accounted using user->locked_vm and limited to
>     RLIMIT_MEMLOCK value.
> 
>   - EPOLLONESHOT is handled.
> 
> This series introduces pollable epoll from userspace, i.e. the user
> creates an epfd with a new EPOLL_USERPOLL flag, mmaps the epoll
> descriptor, gets header and ring pointers and then consumes ready events
> from the ring, avoiding the epoll_wait() call.  When the ring is empty,
> the user has to call epoll_wait() in order to wait for new events.
> epoll_wait() returns -ESTALE if the user ring still has events in it
> (an indication that the user has to consume events from the user ring
> first; I could not invent anything better than returning -ESTALE).
> 
> For user header and user ring allocation I used vmalloc_user().  I found
> that it is much easier to reuse remap_vmalloc_range_partial() instead of
> dealing with page cache (like aio.c does).  What is also nice is that
> the virtual address is properly aligned on SHMLBA, thus there should not
> be any d-cache aliasing problems on archs with vivt or vipt caches.
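
[Editorial sketch, not part of the series.] The consume-then-wait flow
described in the quoted cover letter can be illustrated with a plain
single-producer/single-consumer ring. Everything below is hypothetical:
the struct layouts and the sketch_* names are invented for illustration
and are not the header or ring format from the patches. In the proposed
design the producer would be the kernel and the ring would be obtained by
mmap()ing the epfd; here both sides are ordinary functions so the logic
can be followed in isolation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event slot; NOT the epoll_uitem layout from the series. */
struct sketch_event {
	uint32_t ready_events;
	uint64_t data;
};

/* Hypothetical ring header shared by one producer and one consumer. */
struct sketch_ring {
	_Atomic uint32_t head;      /* advanced by the consumer (userspace) */
	_Atomic uint32_t tail;      /* advanced by the producer */
	uint32_t mask;              /* capacity - 1, capacity is a power of two */
	struct sketch_event *slots;
};

void sketch_ring_init(struct sketch_ring *r, struct sketch_event *slots,
		      uint32_t capacity)
{
	/* Power-of-two capacity lets indices wrap with a simple mask. */
	assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
	r->mask = capacity - 1;
	r->slots = slots;
}

/* Producer side: returns 0 on success, -1 if the ring is full. */
int sketch_ring_push(struct sketch_ring *r, struct sketch_event ev)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head > r->mask)
		return -1;
	r->slots[tail & r->mask] = ev;
	/* Publish the slot contents before making the new tail visible. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 0;
}

/*
 * Consumer side: copy up to max ready events into out[] and return the
 * count. A return of 0 means the ring is empty, which is the point at
 * which a caller would fall back to a blocking wait.
 */
size_t sketch_ring_drain(struct sketch_ring *r, struct sketch_event *out,
			 size_t max)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	size_t n = 0;

	while (head != tail && n < max) {
		out[n++] = r->slots[head & r->mask];
		head++;
	}
	/* Hand the consumed slots back to the producer. */
	atomic_store_explicit(&r->head, head, memory_order_release);
	return n;
}
```

A caller would loop on sketch_ring_drain() and only block once it
returns 0, which mirrors the "consume from the ring first, then call
epoll_wait()" rule above.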

Why aren't we just adding support to io_uring for this instead? Then we
don't need yet another entirely new ring that is just a little
different from what we have.

I haven't looked into the details of your implementation, just curious
if there's anything that makes using io_uring a non-starter for this
purpose?

-- 
Jens Axboe
