Message-ID: <6fee0d55-30a7-4811-9d82-f9613f857a5e@roeck-us.net>
Date: Fri, 15 Mar 2024 10:02:15 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Linus Torvalds <torvalds@...uxfoundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, LKML
 <linux-kernel@...r.kernel.org>, x86@...nel.org,
 Uros Bizjak <ubizjak@...il.com>, linux-sparse@...r.kernel.org,
 lkp@...el.com, oe-kbuild-all@...ts.linux.dev
Subject: Re: [patch 5/9] x86: Cure per CPU madness on UP

On 3/15/24 09:42, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@...ck-us.net> wrote:
>>
>> [    3.291087] RIP: 0010:rapl_cpu_online+0xf2/0x110
>> [    3.291087] Code: 05 ff 8e 07 03 40 42 0f 00 48 89 43 60 e8 56 5f 12 00 8b 15 b4 84 61 02 48 8b 05 01 8f 07 03 48 c7 83 90 00 00 00 e0 84 80 b6 <48> 89 9c d0 38 01 00 00 e9 2b ff ff ff b8 f4 ff ff ff e9 47 ff ff
> 
> The code is
> 
>    mov    %rax,0x60(%rbx)
>    call   0x125f5f
>    mov    0x26184b4(%rip),%edx
>    mov    0x3078f01(%rip),%rax
>    movq   $0xffffffffb68084e0,0x90(%rbx)
>    mov    %rbx,0x138(%rax,%rdx,8)                <-- trapping instruction
>    jmp    <backwards>
> 
> with %rdx being some index having the value 0xffffffed (-19).
> 
> That's ENODEV.
> 
> Without line numbers (if you have debug info for that kernel, it's
> good to run "scripts/decode_stacktrace.sh" on stack traces) it's hard

Sorry, I should have done that.

> to really know what's up, but I strongly suspect that it's this:
> 
>          rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;

Correct:

[    2.632164] RIP: 0010:rapl_cpu_online (arch/x86/events/rapl.c:581)

which does point to that line. Here is a complete decoded backtrace:

[    2.632164] Call Trace:
[    2.632164]  <TASK>
[    2.632164] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[    2.632164] ? page_fault_oops (arch/x86/mm/fault.c:713)
[    2.632164] ? search_exception_tables (kernel/extable.c:59)
[    2.632164] ? fixup_exception (arch/x86/mm/extable.c:328)
[    2.632164] ? exc_page_fault (arch/x86/mm/fault.c:1503 arch/x86/mm/fault.c:1563)
[    2.632164] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[    2.632164] ? __pfx_rapl_cpu_online (arch/x86/events/rapl.c:566)
[    2.632164] ? rapl_cpu_online (arch/x86/events/rapl.c:581)
[    2.632164] cpuhp_invoke_callback.constprop.0 (kernel/cpu.c:195)
[    2.632164] __cpuhp_setup_state_cpuslocked (kernel/cpu.c:2541)
[    2.632164] ? __pfx_rapl_cpu_online (arch/x86/events/rapl.c:566)
[    2.632164] rapl_pmu_init (./include/linux/cpuhotplug.h:274 arch/x86/events/rapl.c:843)
[    2.632164] ? __pfx_rapl_pmu_init (arch/x86/events/rapl.c:816)
[    2.632164] do_one_initcall (init/main.c:1241)
[    2.632164] kernel_init_freeable (init/main.c:1302 init/main.c:1319 init/main.c:1338 init/main.c:1551)
[    2.632164] ? __pfx_kernel_init (init/main.c:1432)
[    2.632164] kernel_init (init/main.c:1442)
[    2.632164] ret_from_fork (arch/x86/kernel/process.c:153)
[    2.632164] ? __pfx_kernel_init (init/main.c:1432)
[    2.632164] ret_from_fork_asm (arch/x86/entry/entry_64.S:256)

Guenter

