Message-Id: <e4ce3ce4-7353-4bbe-8b74-fceab8a62093@bytedance.com>
Date: Fri, 6 Feb 2026 16:43:48 +0800
From: "Chuyi Zhou" <zhouchuyi@...edance.com>
To: "Peter Zijlstra" <peterz@...radead.org>
Cc: <tglx@...utronix.de>, <mingo@...hat.com>, <luto@...nel.org>,
<paulmck@...nel.org>, <muchun.song@...ux.dev>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 05/11] smp: Enable preemption early in smp_call_function_many_cond
Hi Peter,
On 2026/2/5 22:59, Peter Zijlstra wrote:
> On Thu, Feb 05, 2026 at 10:29:51PM +0800, Chuyi Zhou wrote:
>> Hi Peter,
>>
>> On 2026/2/5 18:57, Peter Zijlstra wrote:
>>> On Thu, Feb 05, 2026 at 10:52:36AM +0100, Peter Zijlstra wrote:
>>>> On Tue, Feb 03, 2026 at 07:23:55PM +0800, Chuyi Zhou wrote:
>>>>
>>>>> + /*
>>>>> + * Prevent the current CPU from going offline.
>>>>> + * Being migrated to another CPU and calling csd_lock_wait() may cause
>>>>> + * UAF due to smpcfd_dead_cpu() during the current CPU offline process.
>>>>> + */
>>>>> + migrate_disable();
>>>>
>>>> This is horrible crap. migrate_disable() is *NOT* supposed to be used to
>>>> serialize cpu hotplug.
>>>
>>> This was too complicated or something?
>>>
>>
>> Now most callers of smp_call*() explicitly use preempt_disable(). IIUC,
>> if we want to use cpus_read_lock(), we first need to clean up all these
>> preempt_disable() calls.
>>
>> Maybe a stupid question: Why can't migrate_disable prevent CPU removal?
>
> It can, but migrate_disable() is horrible, it should not be used if at
> all possible.
As you pointed out, using cpus_read_lock() is the simplest approach, and
indeed, that was the first solution we considered.
However, 99% of callers already run with preemption disabled, and some
even call into smp_call*() while holding spinlocks (for example, a TLB
flush may be triggered while holding a pte spinlock).
It is difficult for us to eliminate all of these preempt_disable()
calls, especially for callers that disable preemption for other
purposes, which makes using cpus_read_lock() (which may sleep) almost
impossible.
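To make the constraint concrete, here is a simplified, hypothetical
sketch (example_flush_under_ptl() is made up for illustration; the real
call chains vary by architecture):

	/* Illustration only: smp_call*() reached from atomic context. */
	static void example_flush_under_ptl(struct mm_struct *mm,
					    spinlock_t *ptl)
	{
		spin_lock(ptl);		/* atomic context from here on */
		/*
		 * cpus_read_lock() would be illegal here: it may sleep
		 * (percpu_down_read()), and we hold a spinlock.
		 * preempt_disable(), by contrast, nests fine.
		 */
		flush_tlb_mm(mm);	/* -> smp_call_function_many_cond() */
		spin_unlock(ptl);
	}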
In our production environment, we observed that the time spent in
csd_lock_wait() can reach several milliseconds and, in extreme cases,
exceed 10ms. Generally speaking, the time spent in csd_lock_wait() far
exceeds the cost of sending the IPI itself.
Disabling preemption for the entire duration would clearly hurt the
preemption latency of high-priority tasks, which is unacceptable. This
optimization primarily targets PREEMPT, although PREEMPT_RT can also
benefit from it.
Compared to the cost of keeping preemption disabled for the whole wait,
using migrate_disable() here seems to be an acceptable trade-off.
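Roughly, the shape we have in mind (a simplified sketch of the idea,
not the exact patch; the real code is in smp_call_function_many_cond()):

	preempt_disable();
	/* ... pick target CPUs, queue csd entries, send the IPIs ... */

	/*
	 * Keep the task pinned to this CPU so it cannot be migrated
	 * away while the CPU goes offline (which could lead to a UAF
	 * via smpcfd_dead_cpu()), but allow preemption during the
	 * potentially long wait.
	 */
	migrate_disable();
	preempt_enable();

	csd_lock_wait(csd);	/* the slow part: may take milliseconds */

	migrate_enable();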