Message-ID: <3475f3f1-4109-b6ac-6ea6-dadcdec8db1f@applied-asynchrony.com>
Date: Wed, 28 May 2025 07:57:26 +0200
From: Holger Hoffstätte <holger@...lied-asynchrony.com>
To: Nam Cao <namcao@...utronix.de>, Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
John Ogness <john.ogness@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev, linux-rt-users@...r.kernel.org,
Joe Damato <jdamato@...tly.com>, Martin Karsten <mkarsten@...terloo.ca>,
Jens Axboe <axboe@...nel.dk>
Cc: Frederic Weisbecker <frederic@...nel.org>,
Valentin Schneider <vschneid@...hat.com>
Subject: Re: [PATCH v2] eventpoll: Fix priority inversion problem
Hello,
I have been running with v2 on 6.15.0 without any issues so far, but just
found this in my server's kern.log:
May 27 22:02:12 tux kernel: ------------[ cut here ]------------
May 27 22:02:12 tux kernel: WARNING: CPU: 2 PID: 3011 at fs/eventpoll.c:850 __ep_remove+0x137/0x250
May 27 22:02:12 tux kernel: Modules linked in: loop nfsd auth_rpcgss oid_registry lockd grace sunrpc sch_fq_codel btrfs nct6775 blake2b_generic nct6775_core xor lzo_compress hwmon_vid i915 raid6_pq zstd_compress x86_pkg_temp_thermal drivetemp lzo_decompress coretemp i2c_algo_bit sha512_ssse3 drm_buddy sha512_generic intel_gtt sha256_ssse3 drm_client_lib sha256_generic libsha256 sha1_ssse3 drm_display_helper sha1_generic wmi_bmof drm_kms_helper aesni_intel mq_deadline ttm usbhid gf128mul libaes drm crypto_simd cryptd i2c_i801 video atlantic i2c_smbus drm_panel_orientation_quirks zlib_deflate i2c_core wmi backlight
May 27 22:02:12 tux kernel: CPU: 2 UID: 996 PID: 3011 Comm: chrony_exporter Not tainted 6.15.0 #1 PREEMPTLAZY
May 27 22:02:12 tux kernel: Hardware name: System manufacturer System Product Name/P8Z68-V LX, BIOS 4105 07/01/2013
May 27 22:02:12 tux kernel: RIP: 0010:__ep_remove+0x137/0x250
May 27 22:02:12 tux kernel: Code: 48 89 c7 48 85 c0 74 22 48 8d 54 24 08 48 89 fe e8 3e 1c 24 00 48 89 df e8 56 1c 24 00 48 89 c7 4c 39 e8 74 07 48 85 ff 75 de <0f> 0b 4d 85 f6 74 10 48 8b 7c 24 08 48 89 da 4c 89 f6 e8 12 1c 24
May 27 22:02:12 tux kernel: RSP: 0018:ffffc90002a4be40 EFLAGS: 00010246
May 27 22:02:12 tux kernel: RAX: 0000000000000000 RBX: ffff888104361710 RCX: ffff8881100f2d00
May 27 22:02:12 tux kernel: RDX: 0000000000000000 RSI: ffff888100e04800 RDI: 0000000000000000
May 27 22:02:12 tux kernel: RBP: ffff888367929080 R08: ffff888104361718 R09: ffffffff81575c7b
May 27 22:02:12 tux kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881043616c0
May 27 22:02:12 tux kernel: R13: ffff8883679290a0 R14: 0000000000000000 R15: 0000000000000002
May 27 22:02:12 tux kernel: FS: 00007fee87df5740(0000) GS:ffff88887c9c4000(0000) knlGS:0000000000000000
May 27 22:02:12 tux kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 27 22:02:12 tux kernel: CR2: 000000c002a33000 CR3: 00000001076f1003 CR4: 00000000000606f0
May 27 22:02:12 tux kernel: Call Trace:
May 27 22:02:12 tux kernel: <TASK>
May 27 22:02:12 tux kernel: do_epoll_ctl+0x6ee/0xcf0
May 27 22:02:12 tux kernel: ? kmem_cache_free+0x2c5/0x3b0
May 27 22:02:12 tux kernel: __x64_sys_epoll_ctl+0x53/0x70
May 27 22:02:12 tux kernel: do_syscall_64+0x47/0x100
May 27 22:02:12 tux kernel: entry_SYSCALL_64_after_hwframe+0x4b/0x53
May 27 22:02:12 tux kernel: RIP: 0033:0x55a289d4952e
May 27 22:02:12 tux kernel: Code: 24 28 44 8b 44 24 2c e9 70 ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
May 27 22:02:12 tux kernel: RSP: 002b:000000c0000584d0 EFLAGS: 00000246 ORIG_RAX: 00000000000000e9
May 27 22:02:12 tux kernel: RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 000055a289d4952e
May 27 22:02:12 tux kernel: RDX: 0000000000000008 RSI: 0000000000000002 RDI: 0000000000000004
May 27 22:02:12 tux kernel: RBP: 000000c000058528 R08: 0000000000000000 R09: 0000000000000000
May 27 22:02:12 tux kernel: R10: 000000c000058514 R11: 0000000000000246 R12: 000000c000058578
May 27 22:02:12 tux kernel: R13: 000000c00015e000 R14: 000000c000005a40 R15: 0000000000000000
May 27 22:02:12 tux kernel: </TASK>
May 27 22:02:12 tux kernel: ---[ end trace 0000000000000000 ]---
It seems the (!n) condition in __ep_remove can in fact be hit, so the WARN_ON triggers.
This is the first and only time I've seen this. Currently rebuilding with v3.
cheers
Holger