Message-ID: <Z0nX2olCQtSciY7-@jlelli-thinkpadt14gen4.remote.csb>
Date: Fri, 29 Nov 2024 16:03:54 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org,
André Almeida <andrealmeid@...lia.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>
Subject: Re: [RFC PATCH v3 0/9] futex: Add support task local hash maps.
Hi Sebastian,
On 15/11/24 17:58, Sebastian Andrzej Siewior wrote:
> Hi,
>
> this is a follow up on
> https://lore.kernel.org/ZwVOMgBMxrw7BU9A@jlelli-thinkpadt14gen4.remote.csb
>
> and adds support for task local futex_hash_bucket. It can be created via
> prctl().
>
> This version supports resize at runtime. This fun part is limited is to
> FUTEX_LOCK_PI which means any other waiter will break.
>
> I posted performance numbers of "perf bench futex hash"
> https://lore.kernel.org/all/20241101110810.R3AnEqdu@linutronix.de/
Performance looks generally good on our side as well. However, while
manually testing the set with a debug-enabled config (attached), I hit
the following BUG (decoded) during boot.
---
BUG: unable to handle page fault for address: ffffad7a4e08f480
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] PREEMPT_RT SMP NOPTI
Hardware name: Dell Inc. PowerEdge R740/04FC42, BIOS 2.10.2 02/24/2021
RIP: 0010:futex_hash_priv_put (./arch/x86/include/asm/atomic.h:79 ./include/linux/atomic/atomic-arch-fallback.h:2378 ./include/linux/atomic/atomic-instrumented.h:1458 ./include/linux/rcuref.h:87 ./include/linux/rcuref.h:150 kernel/futex/core.c:164)
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 fd bf 01 00 00 00 53 e8 cc 22 f1 ff e8 47 70 cb 00 85 c0 75 2a <f0> 83 45 00 ff 0f 98 c3 78 7e bf 01 00 00 00 e8 ff 00 f1 ff 65 8b
All code
========
0: 90 nop
1: 90 nop
2: 90 nop
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 90 nop
a: 90 nop
b: 90 nop
c: 90 nop
d: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
12: 55 push %rbp
13: 48 89 fd mov %rdi,%rbp
16: bf 01 00 00 00 mov $0x1,%edi
1b: 53 push %rbx
1c: e8 cc 22 f1 ff callq 0xfffffffffff122ed
21: e8 47 70 cb 00 callq 0xcb706d
26: 85 c0 test %eax,%eax
28: 75 2a jne 0x54
2a:* f0 83 45 00 ff lock addl $0xffffffff,0x0(%rbp) <-- trapping instruction
2f: 0f 98 c3 sets %bl
32: 78 7e js 0xb2
34: bf 01 00 00 00 mov $0x1,%edi
39: e8 ff 00 f1 ff callq 0xfffffffffff1013d
3e: 65 gs
3f: 8b .byte 0x8b
Code starting with the faulting instruction
===========================================
0: f0 83 45 00 ff lock addl $0xffffffff,0x0(%rbp)
5: 0f 98 c3 sets %bl
8: 78 7e js 0x88
a: bf 01 00 00 00 mov $0x1,%edi
f: e8 ff 00 f1 ff callq 0xfffffffffff10113
14: 65 gs
15: 8b .byte 0x8b
RSP: 0018:ffffae3a4fab3dc0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffa181e4ca RDI: ffffffffa18b55f6
RBP: ffffad7a4e08f480 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffae3a4e08f340
R13: 0000000000000081 R14: 000000007fffffff R15: 0000000000000000
FS: 00007fde38938b40(0000) GS:ffff8f6e96e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffad7a4e08f480 CR3: 00000019e1c98004 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
? page_fault_oops (arch/x86/mm/fault.c:715)
? exc_page_fault (arch/x86/mm/fault.c:1479 arch/x86/mm/fault.c:1539)
? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
? futex_hash_priv_put (./arch/x86/include/asm/atomic.h:79 ./include/linux/atomic/atomic-arch-fallback.h:2378 ./include/linux/atomic/atomic-instrumented.h:1458 ./include/linux/rcuref.h:87 ./include/linux/rcuref.h:150 kernel/futex/core.c:164)
futex_wake (kernel/futex/waitwake.c:180)
do_futex (kernel/futex/syscalls.c:131)
__x64_sys_futex (kernel/futex/syscalls.c:179 kernel/futex/syscalls.c:160 kernel/futex/syscalls.c:160)
? __lock_release.isra.0 (kernel/locking/lockdep.c:339 kernel/locking/lockdep.c:352 kernel/locking/lockdep.c:5507)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
? exc_page_fault (arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7fde39642fc0
Code: ff d3 31 f6 4c 89 e7 e8 ae 73 ff ff 45 31 d2 ba ff ff ff 7f be 81 00 00 00 c7 45 00 02 00 00 00 48 89 ef b8 ca 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 86 15 ff ff ff 83 c0 16 83 e0 f7 0f 85 66 ff
All code
========
0: ff d3 callq *%rbx
2: 31 f6 xor %esi,%esi
4: 4c 89 e7 mov %r12,%rdi
7: e8 ae 73 ff ff callq 0xffffffffffff73ba
c: 45 31 d2 xor %r10d,%r10d
f: ba ff ff ff 7f mov $0x7fffffff,%edx
14: be 81 00 00 00 mov $0x81,%esi
19: c7 45 00 02 00 00 00 movl $0x2,0x0(%rbp)
20: 48 89 ef mov %rbp,%rdi
23: b8 ca 00 00 00 mov $0xca,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 0f 86 15 ff ff ff jbe 0xffffffffffffff4b
36: 83 c0 16 add $0x16,%eax
39: 83 e0 f7 and $0xfffffff7,%eax
3c: 0f .byte 0xf
3d: 85 66 ff test %esp,-0x1(%rsi)
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 0f 86 15 ff ff ff jbe 0xffffffffffffff21
c: 83 c0 16 add $0x16,%eax
f: 83 e0 f7 and $0xfffffff7,%eax
12: 0f .byte 0xf
13: 85 66 ff test %esp,-0x1(%rsi)
RSP: 002b:00007ffcbdf96830 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00007fde3892ce10 RCX: 00007fde39642fc0
RDX: 000000007fffffff RSI: 0000000000000081 RDI: 00007fde38937250
RBP: 00007fde38937250 R08: 00007fde397afae0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcbdf96848
R13: 00007ffcbdf96910 R14: 00007fde38937250 R15: 00007fde3892ce10
</TASK>
Modules linked in: fuse
CR2: ffffad7a4e08f480
---[ end trace 0000000000000000 ]---
I cannot seem to reproduce this with the non-debug/production config,
though.
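In case it helps to narrow things down: the user-space registers in the
trace (ORIG_RAX = 0xca, RSI = 0x81, RDX = 0x7fffffff) decode to a
FUTEX_WAKE on a process-private futex with an INT_MAX wake count. Below is
a minimal sketch of an equivalent call. wake_all_private() is a
hypothetical helper name used only for illustration, and I have not
confirmed that this call on its own triggers the fault.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* RSI in the trace: FUTEX_WAKE (1) | FUTEX_PRIVATE_FLAG (128) == 0x81 */
static_assert((FUTEX_WAKE | FUTEX_PRIVATE_FLAG) == 0x81,
	      "op must match RSI from the oops");

/*
 * Wake every waiter on a process-private futex word.
 *
 * Mirrors the syscall arguments seen in the trace:
 *   RAX = SYS_futex (0xca on x86-64)
 *   RSI = FUTEX_WAKE | FUTEX_PRIVATE_FLAG
 *   RDX = INT_MAX (0x7fffffff), i.e. "wake all"
 *
 * Hypothetical helper, not part of any library. Returns the number of
 * waiters woken, or -1 with errno set on failure.
 */
long wake_all_private(uint32_t *uaddr)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
		       INT_MAX, NULL, NULL, 0);
}
```

With no waiters queued on the word, the call is expected to return 0 on
a working kernel.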
Best,
Juri