lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 24 Nov 2022 15:02:45 -0800
From:   Eric Biggers <>
To:     Paolo Abeni <>
        Soheil Hassas Yeganeh <>,
        Al Viro <>,
        Davidlohr Bueso <>,
        Jason Baron <>,
        Roman Penyaev <>,,
        Carlos Maiolino <>
Subject: Re: [PATCH v2] epoll: use refcount to reduce ep_mutex contention

On Thu, Nov 24, 2022 at 06:57:41PM +0100, Paolo Abeni wrote:
> To reduce the contention this patch introduces explicit reference counting
> for the eventpoll struct. Each registered event acquires a reference,
> and references are released at ep_remove() time. ep_free() doesn't touch
> anymore the event RB tree, it just unregisters the existing callbacks
> and drops a reference to the ep struct. The struct itself is freed when
> the reference count reaches 0. The reference count updates are protected
> by the mtx mutex so no additional atomic operations are needed.

So, the behavior before this patch is that closing an epoll file frees all
resources associated with it.  This behavior is documented in the man page
epoll_create(2): "When all file descriptors referring to an epoll instance have
been closed, the kernel destroys the instance and releases the associated
resources for reuse."

The behavior after this patch is that the resources aren't freed until the epoll
file *and* all files that were added to it have been closed.

Is that okay?  I suppose in most cases it is, since the usual use case for epoll
is to have a long-lived epoll instance and shorter lived file descriptors that
are polled using that long-lived epoll instance.

But probably some users do things the other way around.  I.e., they have a
long-lived file descriptor that is repeatedly polled using different epoll
instances that have a shorter lifetime.

In that case, the number of 'struct eventpoll' and 'struct epitem' in kernel
memory will keep growing until 'max_user_watches' is hit, at which point
EPOLL_CTL_ADD will start failing with ENOSPC.

Are you sure that is fine?

I'll also note that there is a comment at the top of fs/eventpoll.c that
describes the locking scheme, which this patch forgets to update.

- Eric

Powered by blists - more mailing lists