linux-kernel - Re: [PATCH] coresight: etm3x: convert struct etm_drvdata's spinlock to raw

Message-ID: <06877064-51df-d162-0da2-aaa710e0fefe@arm.com>
Date:   Wed, 12 Jul 2023 14:22:26 +0100
From:   Suzuki K Poulose <suzuki.poulose@....com>
To:     James Clark <james.clark@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        quanyang.wang@...driver.com
Cc:     Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Tingwei Zhang <tingwei@...eaurora.org>,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        Kim Phillips <kim.phillips@....com>,
        Sebastian Siewior <bigeasy@...utronix.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-rt-users@...r.kernel.org, coresight@...ts.linaro.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] coresight: etm3x: convert struct etm_drvdata's spinlock
 to raw_spinlock

On 11/07/2023 16:45, James Clark wrote:
> 
> 
> On 11/07/2023 15:05, Greg Kroah-Hartman wrote:
>> On Tue, Jul 11, 2023 at 03:05:36PM +0800, quanyang.wang@...driver.com wrote:
>>> From: Quanyang Wang <quanyang.wang@...driver.com>
>>>
>>> On a PREEMPT_RT kernel, spinlock_t locks become sleepable. The functions
>>> etm_dying_cpu and etm_starting_cpu, which call spin_lock/unlock, run in
>>> an irq-disabled context, which triggers the following call trace:
>>>
>>>      BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
>>>      in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 25, name: migration/1
>>>      preempt_count: 1, expected: 0
>>>      RCU nest depth: 0, expected: 0
>>>      1 lock held by migration/1/25:
>>>       #0: 82a7587c (&drvdata->spinlock){....}-{2:2}, at: etm_dying_cpu+0x28/0x54
>>>      Preemption disabled at:
>>>      [<801ec760>] cpu_stopper_thread+0x94/0x120
>>>      CPU: 1 PID: 25 Comm: migration/1 Not tainted 6.1.35-rt10-yocto-preempt-rt #30
>>>      Hardware name: Xilinx Zynq Platform
>>>      Stopper: multi_cpu_stop+0x0/0x174 <- __stop_cpus.constprop.0+0x48/0x88
>>>       unwind_backtrace from show_stack+0x18/0x1c
>>>       show_stack from dump_stack_lvl+0x58/0x70
>>>       dump_stack_lvl from __might_resched+0x14c/0x1c0
>>>       __might_resched from rt_spin_lock+0x4c/0x84
>>>       rt_spin_lock from etm_dying_cpu+0x28/0x54
>>>       etm_dying_cpu from cpuhp_invoke_callback+0x140/0x33c
>>>       cpuhp_invoke_callback from __cpuhp_invoke_callback_range+0xa4/0x104
>>>       __cpuhp_invoke_callback_range from take_cpu_down+0x7c/0xa8
>>>       take_cpu_down from multi_cpu_stop+0x15c/0x174
>>>       multi_cpu_stop from cpu_stopper_thread+0x9c/0x120
>>>       cpu_stopper_thread from smpboot_thread_fn+0x31c/0x360
>>>       smpboot_thread_fn from kthread+0x100/0x124
>>>       kthread from ret_from_fork+0x14/0x2c
>>>
>>> Convert struct etm_drvdata's spinlock to raw_spinlock to fix it.
>>
>> Wait, why will a raw_spinlock fix this?  Why not fix the root problem
>> here, that of calling these locks improperly in irq context?
>>
>> How is changing to a raw_spinlock going to fix the above splat?
>>
>> thanks,
>>
>> greg k-h
>>
> 
> If it's just etm_starting_cpu() and etm_dying_cpu() as mentioned in the
> commit message then can those spinlocks be removed?
> 
> Surely there can't be any concurrent access to the per-cpu data when the
> hotplug callbacks are called?

Accessing the per-cpu data is not the problem. The spinlocks are there to
protect accesses to drvdata->mode. etm_starting_cpu() tries to enable the
ETM (i.e., start tracing) if the mode is not DISABLED. In SYSFS mode
especially, the mode can be changed from a different CPU. I think we may
still be able to avoid this lock by allowing the mode to be modified only
via enable_hw/disable_hw, run on the CPU the ETM is bound to. That way,
there cannot be concurrent modifications to the mode for a given ETM.

Suzuki
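
The idea in the last paragraph can be modelled in plain userspace C. This is
only a rough sketch of one reading of the suggestion, not driver code: every
model_* name is hypothetical, model_current_cpu stands in for
smp_processor_id(), and the "hop" to the owning CPU is simulated by hand
rather than by a real cross-CPU call. The point it tries to show is that if
the mode is only ever written while running on the CPU the ETM is bound to,
the hotplug callbacks (which also run on that CPU) can read it without a lock.

```c
/* Userspace model only, NOT kernel code; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum model_mode { MODEL_DISABLED, MODEL_SYSFS, MODEL_PERF };

struct model_etm {
	int cpu;              /* CPU this ETM is bound to */
	enum model_mode mode; /* only ever written while running on 'cpu' */
	bool hw_enabled;      /* whether the (modelled) trace unit is on */
};

/* Stand-in for smp_processor_id(); the model sets it by hand. */
static int model_current_cpu;

/* Must run on the ETM's own CPU: the only place the mode gets set. */
static void model_enable_hw(struct model_etm *d, enum model_mode mode)
{
	assert(model_current_cpu == d->cpu);
	d->mode = mode;
	d->hw_enabled = true;
}

/* Must run on the ETM's own CPU: the only place the mode gets cleared. */
static void model_disable_hw(struct model_etm *d)
{
	assert(model_current_cpu == d->cpu);
	d->hw_enabled = false;
	d->mode = MODEL_DISABLED;
}

/* Request from any CPU (e.g. a sysfs write): hop to the owning CPU first. */
static void model_request_enable(struct model_etm *d, enum model_mode mode)
{
	int caller = model_current_cpu;

	model_current_cpu = d->cpu; /* models running the callback on d->cpu */
	model_enable_hw(d, mode);
	model_current_cpu = caller;
}

/* Hotplug "dying" callback: runs on d->cpu, leaves the mode untouched. */
static void model_dying_cpu(struct model_etm *d)
{
	assert(model_current_cpu == d->cpu);
	d->hw_enabled = false;
}

/* Hotplug "starting" callback: runs on d->cpu, reads the mode lock-free. */
static void model_starting_cpu(struct model_etm *d)
{
	assert(model_current_cpu == d->cpu);
	if (d->mode != MODEL_DISABLED)
		d->hw_enabled = true;
}
```

In this model there is no second writer to race with, because both the mode
updates and the hotplug callbacks are serialised by running on the same CPU.
Whether the real driver can guarantee that for every path is exactly the open
question in the thread.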