Message-ID: <20250528080432.Qke-VMIY@linutronix.de>
Date: Wed, 28 May 2025 10:04:32 +0200
From: Nam Cao <namcao@...utronix.de>
To: Holger Hoffstätte <holger@...lied-asynchrony.com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
	Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	John Ogness <john.ogness@...utronix.de>,
	Clark Williams <clrkwllms@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
	linux-rt-users@...r.kernel.org, Joe Damato <jdamato@...tly.com>,
	Martin Karsten <mkarsten@...terloo.ca>,
	Jens Axboe <axboe@...nel.dk>,
	Frederic Weisbecker <frederic@...nel.org>,
	Valentin Schneider <vschneid@...hat.com>
Subject: Re: [PATCH v2] eventpoll: Fix priority inversion problem

On Wed, May 28, 2025 at 08:12:58AM +0200, Nam Cao wrote:
> On Wed, May 28, 2025 at 07:57:26AM +0200, Holger Hoffstätte wrote:
> > I have been running with v2 on 6.15.0 without any issues so far, but just
> > found this in my server's kern.log:
> 
> Thanks for testing!
> 
> > It seems the condition (!n) in __ep_remove is not always true and the WARN_ON triggers.
> > This is the first and only time I've seen this. Currently rebuilding with v3.
> 
> Yeah, this means __ep_remove() thinks the item is in epoll's rdllist and
> attempts to remove it, but then can't actually find the item in the list.
> 
> __ep_remove() relies on the 'ready' flag, and this flag is quite
> complicated. And as my colleague pointed out off-list, I got the memory
> ordering wrong for this flag. Therefore it is likely that you hit a
> bug with this flag.
> 
> I got rid of this flag in v3, so hopefully the problem goes away.
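
For readers following the thread, here is a rough userspace sketch of the kind of 'ready' flag ordering discussed above. It is not the patch's code: struct item, struct ready_list, mark_ready() and remove_if_ready() are hypothetical names, C11 atomics stand in for the kernel's own primitives, and it assumes list updates are handed off between threads only through the flag. The idea is that the list update must be published before the flag (release) and the flag must be read before the list (acquire); otherwise a remover could see the flag set but not find the item, which is the symptom described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item; NOT the kernel's struct epitem. */
struct item {
    struct item *next;
    atomic_bool ready;      /* true once the item is linked into the list */
};

/* Hypothetical singly linked ready list. */
struct ready_list {
    struct item *head;
};

/*
 * Link the item first, then set the flag with release ordering, so that
 * whoever observes ready == true (with acquire) also sees the list update.
 */
static void mark_ready(struct ready_list *l, struct item *it)
{
    assert(!atomic_load_explicit(&it->ready, memory_order_relaxed));
    it->next = l->head;
    l->head = it;
    atomic_store_explicit(&it->ready, true, memory_order_release);
}

/*
 * Returns 0 if the item was not marked ready, 1 if it was unlinked, and
 * -1 if the flag was set but the item was not in the list (the situation
 * a warning would flag).
 */
static int remove_if_ready(struct ready_list *l, struct item *it)
{
    struct item **pp;

    /* Acquire pairs with the release store in mark_ready(). */
    if (!atomic_load_explicit(&it->ready, memory_order_acquire))
        return 0;

    for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == it) {
            *pp = it->next;
            it->next = NULL;
            atomic_store_explicit(&it->ready, false, memory_order_relaxed);
            return 1;
        }
    }
    return -1;
}
```

With relaxed ordering on both sides of the flag, nothing would guarantee that the remover sees the linked item after seeing the flag. Dropping the flag altogether, as v3 is said to do, avoids having to get that pairing right.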

Sorry, I have been staring at this but still have no clue why. None of my
stress tests can reproduce the issue.

Let me know if testing for v3 goes well.

Best regards,
Nam
