Message-ID: <alpine.DEB.2.20.1712301916020.1899@nanos>
Date:   Sat, 30 Dec 2017 19:20:04 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Dominik Brodowski <linux@...inikbrodowski.net>
cc:     Andy Lutomirski <luto@...nel.org>, dave.hansen@...ux.intel.com,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: x86/pti: smp_processor_id() called while preemptible in
 resume-from-sleep

On Sat, 30 Dec 2017, Dominik Brodowski wrote:

> On Sat, Dec 30, 2017 at 04:03:07PM +0100, Thomas Gleixner wrote:
> > On Sat, 30 Dec 2017, Dominik Brodowski wrote:
> > > resume-from-sleep (mem/S3) on v4.15-rc5-149-g5aa90a845892 triggers the
> > > following bug. If I boot with "pti=off", the kernel does not show this
> > > issue, and neither did kernels before pti was merged:
> > > 
> > > [   39.951703] ACPI: Low-level resume complete
> > > [   39.951832] ACPI: EC: EC started
> > > [   39.951840] PM: Restoring platform NVS memory
> > > [   39.954648] Enabling non-boot CPUs ...
> > > [   39.954792] x86: Booting SMP configuration:
> > > [   39.954800] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > > [   39.954834] BUG: using smp_processor_id() in preemptible [00000000] code: sh/465
> > > [   39.954841] caller is native_cpu_up+0x2f0/0xa30
> > 
> > I can't reproduce at the moment and I can't find a possible reason for this
> > by code inspection.
> 
> Thanks for taking a look at it!
> 
> > Can you please provide your .config file
> 
> See attached.
> 
> > and perhaps decode the two offending call sites with
> > 
> >   scripts/faddr2line vmlinux native_cpu_up+0x2f0/0xa30 native_cpu_up+0x447/0xa30
> 
> native_cpu_up+0x2f0/0xa30:
> invalidate_user_asid at arch/x86/include/asm/tlbflush.h:343

Ah, that makes sense. Missed that in the maze.

What makes less sense is the TLB flush itself. I'm surely missing something
subtle, but at first glance that TLB flush is pointless.

>  (inlined by) __native_flush_tlb at arch/x86/include/asm/tlbflush.h:351
>  (inlined by) smpboot_setup_warm_reset_vector at arch/x86/kernel/smpboot.c:129
>  (inlined by) do_boot_cpu at arch/x86/kernel/smpboot.c:950
>  (inlined by) native_cpu_up at arch/x86/kernel/smpboot.c:1070
> 
> native_cpu_up+0x447/0xa30:
> kern_pcid at arch/x86/include/asm/tlbflush.h:105
>  (inlined by) invalidate_user_asid at arch/x86/include/asm/tlbflush.h:342
>  (inlined by) __native_flush_tlb at arch/x86/include/asm/tlbflush.h:351
>  (inlined by) smpboot_restore_warm_reset_vector at arch/x86/kernel/smpboot.c:146

This one even more so, as the stale comment suggests that there was some
page table fiddling at some point in the past.

>  (inlined by) do_boot_cpu at arch/x86/kernel/smpboot.c:1022
>  (inlined by) native_cpu_up at arch/x86/kernel/smpboot.c:1070

Let me think about it and do some archaeological research.
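
For illustration only, here is a rough user-space model of why the check
fires in the decoded call chain above: a flush helper looks up the current
CPU while nothing prevents the task from being migrated. All model_* names
are hypothetical stand-ins and not kernel code. The model tracks only a
preempt count, whereas the real debug check looks at more state than that.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins; NOT the kernel's implementation. */
static int model_preempt_count;      /* 0 means "preemptible" */
static int model_bug_reports;        /* number of warnings emitted */
static int model_current_cpu = 1;    /* pretend CPU number */

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Warn when the caller could be migrated to another CPU mid-use. */
static int model_smp_processor_id(const char *caller)
{
	if (model_preempt_count == 0) {
		model_bug_reports++;
		fprintf(stderr,
			"BUG: using smp_processor_id() in preemptible code, caller is %s\n",
			caller);
	}
	return model_current_cpu;
}

/* Models a flush helper that needs per-CPU state. */
static void model_flush_tlb(void)
{
	(void)model_smp_processor_id("model_flush_tlb");
}

/* Calls the helper once preemptible and once not; returns the report count. */
static int model_demo(void)
{
	model_bug_reports = 0;

	model_flush_tlb();           /* preemptible: the check reports */

	model_preempt_disable();
	model_flush_tlb();           /* preemption off: the check stays quiet */
	model_preempt_enable();

	return model_bug_reports;
}
```

In this model, only the first call is reported. Whether the right fix in
the kernel is to disable preemption around the flush or to drop the flush
altogether is exactly the open question above.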

Thanks,

	tglx
