linux-kernel - Re: [PATCH RFC] powerpc: Panic on jump label code patching failure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b4cc6d5-5f5b-4b39-8fdf-ac02c94cd5e2@csgroup.eu>
Date: Sat, 6 Sep 2025 08:42:07 +0200
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: "Ritesh Harjani (IBM)" <riteshh@...ux.ibm.com>,
 Andrew Donnellan <ajd@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org,
 linux-kernel@...r.kernel.org, maddy@...ux.ibm.com, mpe@...erman.id.au,
 peterz@...radead.org, jpoimboe@...nel.org, jbaron@...mai.com
Cc: npiggin@...il.com, rostedt@...dmis.org, ardb@...nel.org,
 Erhard Furtner <erhard_f@...lbox.org>
Subject: Re: [PATCH RFC] powerpc: Panic on jump label code patching failure



Le 06/09/2025 à 05:52, Ritesh Harjani a écrit :
> [Vous ne recevez pas souvent de courriers de riteshh@...ux.ibm.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> 
> Andrew Donnellan <ajd@...ux.ibm.com> writes:
> 
>> If patch_branch() or patch_instruction() fails while updating a jump
>> label, we presently fail silently, leading to unpredictable behaviour
>> later on.
>>
>> Change arch_jump_label_transform() to panic on a code patching failure,
>> matching the existing behaviour of arch_static_call_transform().
>>
>> Reported-by: Erhard Furtner <erhard_f@...lbox.org>
>> Signed-off-by: Andrew Donnellan <ajd@...ux.ibm.com>
>>
>> ---
>>
>> Ran into this while debugging an issue that Erhard reported to me about my
>> PAGE_TABLE_CHECK series on a G4, where updating a static key failed
>> silently, but only for one call site, leading to an incorrect reference
>> count later on. This looks to be due to the issue fixed in [0]. A loud
>> failure would have saved us all considerable debugging time.
>>
>> Should I change the return type of arch_jump_label_transform() and handle
>> this in an arch-independent way? Are there other users of code patching
>> in powerpc that ought to be hardened?
>>
>> Or is this excessive?
>>
>> [0] https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4b5e6eb281d7b1ea77619bee17095f905a125168.1757003584.git.christophe.leroy@csgroup.eu/
>> ---
>>   arch/powerpc/kernel/jump_label.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/jump_label.c b/arch/powerpc/kernel/jump_label.c
>> index 2659e1ac8604..80d41ed7ac50 100644
>> --- a/arch/powerpc/kernel/jump_label.c
>> +++ b/arch/powerpc/kernel/jump_label.c
>> @@ -12,9 +12,14 @@ void arch_jump_label_transform(struct jump_entry *entry,
>>                               enum jump_label_type type)
>>   {
>>        u32 *addr = (u32 *)jump_entry_code(entry);
>> +     int err;
>>
>>        if (type == JUMP_LABEL_JMP)
>> -             patch_branch(addr, jump_entry_target(entry), 0);
>> +             err = patch_branch(addr, jump_entry_target(entry), 0);
>>        else
>> -             patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>> +             err = patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>> +
>> +     if (err)
>> +             panic("%s: patching failed, err %d, type %d, addr %pS, target %pS\n",
>> +                   __func__, err, type, addr, (void *)jump_entry_target(entry));
>>   }
> 
> arch_jump_label_transform() is mainly getting called from
> __jump_level_update() and it's used for enabling or updating static keys / branch.
> 
> But static keys can also be used by drivers / module subsystem whose
> initialization happens late. Although I understand that if the above
> fails, it might fail much before, from the arch setup code itself, but
> panic() still feels like a big hammer.

But not being able to patch the kernel as required means that you get a 
kernel behaving differently from what is expected.

Imagine a kernel running on a board that is controlling a saw. There is 
a patch_instruction() to activate the safety feature which detects when 
your hands are too close to the blade. Do you want the kernel to 
continue running seamlessly when that patch_instruction() fails ? I'm 
sure you don't !

> 
> Would pr_err() print with WARN_ON_ONCE(1) would suffice in case of an
> err?

No, that's not enough, you can't rely on a kernel that will no behave as 
expected.

> 
> Also you said you ran into a problem at just one call site where above
> was silently failing. With the above change are you able to hit the
> panic() now? Because from what I see in patch_instruction(), it mainly
> will boil down to calling __patch_mem() which always returns 0.

As far as I can see, __patch_mem() returns -EPERM when 
__put_kernel_nofault() fails:

static int __patch_mem(void *exec_addr, unsigned long val, void 
*patch_addr, bool is_dword)
{
	if (!IS_ENABLED(CONFIG_PPC64) || likely(!is_dword)) {
		/* For big endian correctness: plain address would use the wrong half */
		u32 val32 = val;

		__put_kernel_nofault(patch_addr, &val32, u32, failed);
	} else {
		__put_kernel_nofault(patch_addr, &val, u64, failed);
	}

	asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (patch_addr),
							    "r" (exec_addr));

	return 0;

failed:
	mb();  /* sync */
	return -EPERM;
}


> Although there are other places where there can be an error returned,
> so I was wondering if that is what you were hitting or something else?

Andrew was hitting the -EPERM because the memory area was read-only.

Christophe