Message-ID: <CA+8F9hhy=WPMJLQ3Ya_w4O6xyWk7KsXi=YJofmyC577_UJTutA@mail.gmail.com>
Date:   Sat, 28 Mar 2020 11:10:28 -0700
From:   Omar Kilani <omar.kilani@...il.com>
To:     linux-kernel@...r.kernel.org
Subject: Weird issue with epoll and kernel >= 5.0

Hi there,

I've observed an issue with epoll and kernels 5.0 and above when a
system is generating a lot of epoll events.

I see this issue with nginx and JVM/Netty-based apps (using the JVM's
native epoll support as well as Netty's own optimized epoll support),
but *not* with haproxy (?).

I'm not really sure what the actual problem is (nginx complains about
epoll_wait with a generic error), but it doesn't happen on 4.19.x and
lower.

I thought it was a netty problem at first and opened this ticket:

https://github.com/netty/netty/issues/8999

But then I saw the same issue in nginx.

I haven't debugged a kernel issue in something like 20 years so I'm
not really sure where to start myself.

I'd be more than happy to provide my test case that has a very quick
repro to anyone who needs it.

Also happy to provide a VM/machine with enough CPUs to trigger it
easily (it seems to happen quicker with more CPUs present) to test
with.

Thanks!

Regards,
Omar
