lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zfat19i7.fsf@yellow.woof>
Date: Wed, 17 Sep 2025 15:41:52 +0200
From: Nam Cao <namcao@...utronix.de>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner
 <brauner@...nel.org>, Jan Kara <jack@...e.cz>, Shuah Khan
 <shuah@...nel.org>, Davidlohr Bueso <dave@...olabs.net>, Soheil Hassas
 Yeganeh <soheil@...gle.com>, Khazhismel Kumykov <khazhy@...gle.com>,
 Willem de Bruijn <willemb@...gle.com>, Eric Dumazet <edumazet@...gle.com>,
 Jens Axboe <axboe@...nel.dk>, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH 2/2] eventpoll: Fix epoll_wait() report false negative

Mateusz Guzik <mjguzik@...il.com> writes:
> I'll say upfront I'm not an epoll person, just looked here out of
> curiosity.

Feedbacks always welcomed.

> I can agree there is a bug. The event is generated before any of the
> threads even exist and they only poll for it, none of them consume it.
>
> However, the commit message fails to explain why the change fixes
> anything and I think your patch de facto reverts e59d3c64cba6 ("epoll:
> eliminate unnecessary lock for zero timeout"). Looking at that diff
> the point was to avoid the expensive lock trip if timeout == 0 and there
> are no events.
>
> As for the bug is, from my reading the ep_start_scan()/ep_done_scan()
> pair transiently disturbs the state checked by ep_events_available(),
> resulting in false-negatives. Then the locked check works because by the
> time you acquire it, the damage is undone.

Exactly so. I can add this into the description.

> Given the commits referenced in Fixes:, I suspect the real fix would be
> to stop destroying that state of course.
>
> But if that's not feasible, I would check if a sequence counter around
> this would do the trick -- then the racy ep_events_available(ep) upfront
> would become safe with smaller overhead than with your proposal for the
> no-event case, but with higher overhead when there is something.
>
> My proposal is trivial to implement, I have no idea if it will get a
> buy-in though.

My question is whether the performance of epoll_wait() with zero
timeout is really that important that we have to complicate
things. If epoll_wait() with zero timeout is called repeatedly in a loop
but there is no event, I'm sure there will be measurabled performance
drop. But sane user would just use timeout in that case.

epoll's data is protected by a lock. Therefore I think the most
straightforward solution is just taking the lock before reading the
data.

Lockless is hard to get right and may cause hard-to-debug problems. So
unless this performance drop somehow bothers someone, I would prefer
"keep it simple, stupid".

Best regards,
Nam

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ