Message-ID: <6cb1fc30-d4a1-f483-48b7-9fa594d9e46f@akamai.com>
Date: Fri, 1 May 2020 15:27:17 -0400
From: Jason Baron <jbaron@...mai.com>
To: Roman Penyaev <rpenyaev@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Khazhismel Kumykov <khazhy@...gle.com>,
Alexander Viro <viro@...iv.linux.org.uk>, Heiher <r@....cc>,
stable@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] epoll: atomically remove wait entry on wake up
On 4/30/20 9:03 AM, Roman Penyaev wrote:
> This patch does two things:
>
> 1. fixes lost wakeup introduced by:
> 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
>
> 2. improves performance for events delivery.
>
> The problem is the following: if N (>1) threads are waiting on
> ep->wq for new events and M (>1) events arrive, it is quite likely
> that more than one wakeup hits the same wait queue entry, because
> there is quite a big window between the __add_wait_queue_exclusive()
> and the following __remove_wait_queue() calls in ep_poll(). This can
> lead to lost wakeups, because the thread that was woken up may not
> handle all the events on ->rdllist. (The problem is described in more
> detail here: https://lkml.org/lkml/2019/10/7/905)
>
> The idea of this patch is to use init_wait() instead of
> init_waitqueue_entry(). Internally init_wait() sets
> autoremove_wake_function as the callback, which atomically removes
> the wait entry from the list (under the wq lock), so the next
> incoming wakeup hits the next wait entry in the wait queue,
> preventing lost wakeups.
>
> The problem is readily reproduced by the epoll60 test case [1].
>
> Removing the wait entry on wakeup also has performance benefits,
> because there is no need to take ep->lock and remove the wait entry
> from the queue after a successful wakeup. Here is the timing output
> of the epoll60 test case:
>
> With explicit wakeup from ep_scan_ready_list() (the state of the
> code prior to 339ddb53d373):
>
> real 0m6.970s
> user 0m49.786s
> sys 0m0.113s
>
> After this patch:
>
> real 0m5.220s
> user 0m36.879s
> sys 0m0.019s
>
> The other test case is stress-epoll [2], where one thread consumes
> all the events while other threads produce many events:
>
> With explicit wakeup from ep_scan_ready_list() (the state of the
> code prior to 339ddb53d373):
>
>  threads  events/ms  run-time ms
>        8       5427         1474
>       16       6163         2596
>       32       6824         4689
>       64       7060         9064
>      128       6991        18309
>
> After this patch:
>
>  threads  events/ms  run-time ms
>        8       5598         1429
>       16       7073         2262
>       32       7502         4265
>       64       7640         8376
>      128       7634        16767
>
> ("events/ms" represents event bandwidth, so higher is better;
> "run-time ms" represents overall time spent running the benchmark,
> so lower is better)
>
> [1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
> [2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c
>
> Signed-off-by: Roman Penyaev <rpenyaev@...e.de>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Khazhismel Kumykov <khazhy@...gle.com>
> Cc: Alexander Viro <viro@...iv.linux.org.uk>
> Cc: Heiher <r@....cc>
> Cc: Jason Baron <jbaron@...mai.com>
> Cc: stable@...r.kernel.org
> Cc: linux-fsdevel@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> ---
> fs/eventpoll.c | 43 ++++++++++++++++++++++++-------------------
> 1 file changed, 24 insertions(+), 19 deletions(-)
>
Looks good to me and nice speedups.
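
For anyone skimming the thread, here is a rough sketch of the wait
pattern in ep_poll() that this change results in. It is simplified and
not the literal diff: the timeout, signal and busy-poll handling are
elided, and the local names (to, slack, wait) are just illustrative.

    /* Sketch only; assumes ep->lock and ep->wq as in fs/eventpoll.c. */
    struct wait_queue_entry wait;

    /* init_wait() sets autoremove_wake_function as the callback. */
    init_wait(&wait);

    write_lock_irq(&ep->lock);
    __set_current_state(TASK_INTERRUPTIBLE);
    /* Add ourselves only if there is nothing to consume yet. */
    if (!ep_events_available(ep))
            __add_wait_queue_exclusive(&ep->wq, &wait);
    write_unlock_irq(&ep->lock);

    schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS);

    __set_current_state(TASK_RUNNING);

    /*
     * On wakeup, autoremove_wake_function() already unlinked the
     * entry under the wq lock, so the next wakeup targets the next
     * exclusive waiter.  Only remove it here if we were never woken
     * (timeout/signal) and the entry is still queued.
     */
    if (!list_empty_careful(&wait.entry)) {
            write_lock_irq(&ep->lock);
            __remove_wait_queue(&ep->wq, &wait);
            write_unlock_irq(&ep->lock);
    }

The key point is that the common (woken-up) path no longer needs to
take ep->lock just to unlink the wait entry.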
Reviewed-by: Jason Baron <jbaron@...mai.com>
Thanks,
-Jason