Message-ID: <20180208175433.p2a3g2q7tctfpk7c@lakrids.cambridge.arm.com>
Date:   Thu, 8 Feb 2018 17:54:33 +0000
From:   Mark Rutland <mark.rutland@....com>
To:     Mark Salter <msalter@...hat.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf: arm_pmu_acpi: Fix armpmu_alloc call from invalid
 context

Hi Mark,

On Thu, Feb 08, 2018 at 12:45:04PM -0500, Mark Salter wrote:
> When booting an arm64 debug kernel with ACPI, I see:
> 
>    BUG: sleeping function called from invalid context at mm/slab.h:420
>    in_atomic(): 0, irqs_disabled(): 128, pid: 12, name: cpuhp/0
>    1 lock held by cpuhp/0/12:
>     #0:  (cpuhp_state-up){+.+.}, at: [<0000000057aa0dae>] cpuhp_thread_fun+0x13c/0x258
>    irq event stamp: 28
>    hardirqs last  enabled at (27): [<000000000b861658>] _raw_spin_unlock_irq+0x38/0x58
>    hardirqs last disabled at (28): [<000000006231cfb1>] cpuhp_thread_fun+0xd0/0x258
>    softirqs last  enabled at (0): [<0000000054d9737a>] copy_process.isra.32.part.33+0x450/0x1480
>    softirqs last disabled at (0): [<          (null)>]           (null)
>    CPU: 0 PID: 12 Comm: cpuhp/0 Not tainted 4.15.0+ #18
>    Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.25 Oct 17 2016
>    Call trace:
>     dump_backtrace+0x0/0x188
>     show_stack+0x24/0x2c
>     dump_stack+0xa4/0xe0
>     ___might_sleep+0x208/0x234
>     __might_sleep+0x58/0x8c
>     kmem_cache_alloc_trace+0x248/0x3e0
>     armpmu_alloc+0x38/0x1a8
>     arm_pmu_acpi_cpu_starting+0x11c/0x15c
>     cpuhp_invoke_callback+0x120/0x100c
>     cpuhp_thread_fun+0xe8/0x258
>     smpboot_thread_fn+0x170/0x268
>     kthread+0x110/0x13c
>     ret_from_fork+0x10/0x18

I have patches to address this:

http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/557838.html
https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/acpi-pmu-lockdep

> With commit 7d88eb695a1f ("arm/perf: Convert to hotplug state machine"),
> arm_pmu uses the cpuhotplug framework to initialize the PMU driver when
> using ACPI. However, the arm_pmu_acpi_cpu_starting() callback comes
> before CPUHP_AP_ONLINE is reached, which means it runs with interrupts
> disabled and tries to allocate memory with GFP_KERNEL, which may
> sleep.
> 
> Move CPUHP_AP_PERF_ARM_ACPI_STARTING to come after CPUHP_AP_ONLINE so
> that the arm_pmu initialization runs with interrupts enabled as it
> does when booting with device tree.
> 
> Fixes: 7d88eb695a1f ("arm/perf: Convert to hotplug state machine")
> Signed-off-by: Mark Salter <msalter@...hat.com>
> ---
>  include/linux/cpuhotplug.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 5172ad0..e07b2da 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -114,7 +114,6 @@ enum cpuhp_state {
>  	CPUHP_AP_ARM_VFP_STARTING,
>  	CPUHP_AP_ARM64_DEBUG_MONITORS_STARTING,
>  	CPUHP_AP_PERF_ARM_HW_BREAKPOINT_STARTING,
> -	CPUHP_AP_PERF_ARM_ACPI_STARTING,
>  	CPUHP_AP_PERF_ARM_STARTING,
>  	CPUHP_AP_ARM_L2X0_STARTING,
>  	CPUHP_AP_ARM_ARCH_TIMER_STARTING,
> @@ -146,6 +145,7 @@ enum cpuhp_state {
>  	CPUHP_AP_SMPBOOT_THREADS,
>  	CPUHP_AP_X86_VDSO_VMA_ONLINE,
>  	CPUHP_AP_IRQ_AFFINITY_ONLINE,
> +	CPUHP_AP_PERF_ARM_ACPI_STARTING,

We need CPUHP_AP_PERF_ARM_ACPI_STARTING to happen before
CPUHP_AP_PERF_ARM_STARTING, and I think this re-ordering prevents us
from correctly resetting the PMU and enabling percpu interrupts, at
least in heterogeneous configurations (e.g. big.LITTLE systems like
Juno).

I'm not sure whether we could safely move both callbacks this late.
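
To make the trade-off concrete, here is a minimal userspace sketch of the
two orderings discussed above. It is not kernel code: the step names, the
result struct and simulate_bringup() are hypothetical, and the only rules
it models are the ones stated in this thread (callbacks before
CPUHP_AP_ONLINE run with interrupts disabled, and the ACPI callback should
run before CPUHP_AP_PERF_ARM_STARTING).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for enum cpuhp_state. */
enum step {
	STEP_ACPI_STARTING,       /* models CPUHP_AP_PERF_ARM_ACPI_STARTING */
	STEP_PERF_STARTING,       /* models CPUHP_AP_PERF_ARM_STARTING */
	STEP_AP_ONLINE,           /* models CPUHP_AP_ONLINE */
	STEP_IRQ_AFFINITY_ONLINE  /* models CPUHP_AP_IRQ_AFFINITY_ONLINE */
};

struct bringup_result {
	bool acpi_ran_with_irqs_off; /* sleeping allocation would be invalid */
	bool acpi_before_perf;       /* ordering requested in the reply */
};

/* Ordering before the patch: ACPI callback sits among the STARTING states. */
static const enum step original_order[] = {
	STEP_ACPI_STARTING, STEP_PERF_STARTING,
	STEP_AP_ONLINE, STEP_IRQ_AFFINITY_ONLINE,
};

/* Ordering after the patch: ACPI callback moved past the ONLINE states. */
static const enum step patched_order[] = {
	STEP_PERF_STARTING, STEP_AP_ONLINE,
	STEP_IRQ_AFFINITY_ONLINE, STEP_ACPI_STARTING,
};

/* Walk the steps in order and record which of the two rules hold. */
struct bringup_result simulate_bringup(const enum step *order, size_t n)
{
	struct bringup_result r = { false, false };
	bool irqs_enabled = false; /* assumed off until STEP_AP_ONLINE */
	bool acpi_done = false;

	assert(order != NULL);
	for (size_t i = 0; i < n; i++) {
		switch (order[i]) {
		case STEP_AP_ONLINE:
			irqs_enabled = true;
			break;
		case STEP_ACPI_STARTING:
			r.acpi_ran_with_irqs_off = !irqs_enabled;
			acpi_done = true;
			break;
		case STEP_PERF_STARTING:
			r.acpi_before_perf = acpi_done;
			break;
		case STEP_IRQ_AFFINITY_ONLINE:
			break;
		}
	}
	return r;
}
```

Running simulate_bringup() on original_order reports the ACPI callback
running with interrupts off but in the required order; on patched_order it
reports the opposite. Under this model the proposed move trades one problem
for the other.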

Thanks,
Mark.
