Message-ID: <06b33393-8af5-9faa-6faa-acb5111865f6@huawei.com>
Date:   Mon, 16 May 2022 15:58:37 +0800
From:   Xu Kuohai <xukuohai@...wei.com>
To:     Mark Rutland <mark.rutland@....com>
CC:     <bpf@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
        <linux-kselftest@...r.kernel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Zi Shen Lim <zlim.lnx@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        "David S . Miller" <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, <x86@...nel.org>,
        <hpa@...or.com>, Shuah Khan <shuah@...nel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        Daniel Kiss <daniel.kiss@....com>,
        Steven Price <steven.price@....com>,
        Sudeep Holla <sudeep.holla@....com>,
        Marc Zyngier <maz@...nel.org>,
        Peter Collingbourne <pcc@...gle.com>,
        Mark Brown <broonie@...nel.org>,
        Delyan Kratunov <delyank@...com>,
        Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [PATCH bpf-next v3 4/7] bpf, arm64: Implement
 bpf_arch_text_poke() for arm64

On 5/16/2022 3:18 PM, Mark Rutland wrote:
> On Mon, May 16, 2022 at 02:55:46PM +0800, Xu Kuohai wrote:
>> On 5/13/2022 10:59 PM, Mark Rutland wrote:
>>> On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
>>>> Implement bpf_arch_text_poke() for arm64, so bpf trampoline code can use
>>>> it to replace nop with jump, or replace jump with nop.
>>>>
>>>> Signed-off-by: Xu Kuohai <xukuohai@...wei.com>
>>>> Acked-by: Song Liu <songliubraving@...com>
>>>> ---
>>>>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 63 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>>>> index 8ab4035dea27..3f9bdfec54c4 100644
>>>> --- a/arch/arm64/net/bpf_jit_comp.c
>>>> +++ b/arch/arm64/net/bpf_jit_comp.c
>>>> @@ -9,6 +9,7 @@
>>>>  
>>>>  #include <linux/bitfield.h>
>>>>  #include <linux/bpf.h>
>>>> +#include <linux/memory.h>
>>>>  #include <linux/filter.h>
>>>>  #include <linux/printk.h>
>>>>  #include <linux/slab.h>
>>>> @@ -18,6 +19,7 @@
>>>>  #include <asm/cacheflush.h>
>>>>  #include <asm/debug-monitors.h>
>>>>  #include <asm/insn.h>
>>>> +#include <asm/patching.h>
>>>>  #include <asm/set_memory.h>
>>>>  
>>>>  #include "bpf_jit.h"
>>>> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>>>>  {
>>>>  	return vfree(addr);
>>>>  }
>>>> +
>>>> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
>>>> +			     void *addr, u32 *insn)
>>>> +{
>>>> +	if (!addr)
>>>> +		*insn = aarch64_insn_gen_nop();
>>>> +	else
>>>> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
>>>> +						    (unsigned long)addr,
>>>> +						    type);
>>>> +
>>>> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
>>>> +}
>>>> +
>>>> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>>>> +		       void *old_addr, void *new_addr)
>>>> +{
>>>> +	int ret;
>>>> +	u32 old_insn;
>>>> +	u32 new_insn;
>>>> +	u32 replaced;
>>>> +	enum aarch64_insn_branch_type branch_type;
>>>> +
>>>> +	if (!is_bpf_text_address((long)ip))
>>>> +		/* Only poking bpf text is supported. Since kernel function
>>>> +		 * entry is set up by ftrace, we rely on ftrace to poke kernel
>>>> +		 * functions. For kernel functions, bpf_arch_text_poke() is only
>>>> +		 * called after a failed poke with ftrace. In this case, there
>>>> +		 * is probably something wrong with fentry, so there is nothing
>>>> +		 * we can do here. See register_fentry, unregister_fentry and
>>>> +		 * modify_fentry for details.
>>>> +		 */
>>>> +		return -EINVAL;
>>>
>>> If you rely on ftrace to poke functions, why do you need to patch text
>>> at all? Why does the rest of this function exist?
>>>
>>> I really don't like having another piece of code outside of ftrace
>>> patching the ftrace patch-site; this needs a much better explanation.
>>>
>>
>> Sorry for the incorrect explanation in the comment. I don't think it's
>> reasonable to patch ftrace patch-site without ftrace code either.
>>
>> The patching logic in register_fentry, unregister_fentry and
>> modify_fentry is as follows:
>>
>> if (tr->func.ftrace_managed)
>>         ret = register_ftrace_direct((long)ip, (long)new_addr);
>> else
>>         ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr,
>>                                  true);
>>
>> ftrace patch-site is patched by ftrace code. bpf_arch_text_poke() is
>> only used to patch bpf prog and bpf trampoline, which are not managed by
>> ftrace.
> 
> Sorry, I had misunderstood. Thanks for the correction!
> 
> I'll have another look with that in mind.
>>>> +
>>>> +	if (poke_type == BPF_MOD_CALL)
>>>> +		branch_type = AARCH64_INSN_BRANCH_LINK;
>>>> +	else
>>>> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
>>>> +
>>>> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
>>>> +		return -EFAULT;
>>>> +
>>>> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
>>>> +		return -EFAULT;
>>>> +
>>>> +	mutex_lock(&text_mutex);
>>>> +	if (aarch64_insn_read(ip, &replaced)) {
>>>> +		ret = -EFAULT;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	if (replaced != old_insn) {
>>>> +		ret = -EFAULT;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
>>>
>>> ... and where does the actual synchronization come from in this case?
>>
>> aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
>> no other CPUs will fetch a half-new and half-old instruction.
>>
>> The scenario here is that there is a chance that another CPU fetches the
>> old instruction after bpf_arch_text_poke() finishes, that is, different
>> CPUs may execute different versions of instructions at the same time.
>>
>> 1. When a new trampoline is attached, it doesn't seem to be an issue for
>> different CPUs to jump to different trampolines temporarily.
>>
>> 2. When an old trampoline is freed, we should wait for all other CPUs to
>> exit the trampoline and make sure the trampoline is no longer reachable.
>> IIUC, the bpf_tramp_image_put() function already uses percpu_ref and RCU
>> tasks to do this.
> 
> It would be good to have a comment for these points.
>

Will add a comment for this in v4, thanks!
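
As a rough illustration of the check-then-replace logic discussed above, here is a userspace model in C. It is only a sketch, not the kernel code: the names model_gen_branch_or_nop() and model_poke() are hypothetical, a C11 atomic word stands in for the 4-byte instruction slot, and a single writer is assumed (the role text_mutex plays in the patch). The NOP, B and BL encodings are the architectural AArch64 ones.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define INSN_NOP 0xd503201fu /* AArch64 NOP */
#define INSN_B   0x14000000u /* B  <imm26>, branch without link */
#define INSN_BL  0x94000000u /* BL <imm26>, branch with link */

/*
 * Hypothetical model of gen_branch_or_nop(): encode a NOP when target is 0,
 * otherwise a B/BL from pc to target. The immediate is a signed 26-bit word
 * offset, so the reach is +/-128 MiB and the offset must be 4-byte aligned.
 */
static int model_gen_branch_or_nop(int link, uint64_t pc, uint64_t target,
				   uint32_t *insn)
{
	int64_t off = (int64_t)(target - pc);

	if (!target) {
		*insn = INSN_NOP;
		return 0;
	}
	if ((off & 3) || off < -(INT64_C(1) << 27) || off >= (INT64_C(1) << 27))
		return -EFAULT;

	*insn = (link ? INSN_BL : INSN_B) |
		(uint32_t)(((uint64_t)off >> 2) & 0x03ffffffu);
	return 0;
}

/*
 * Hypothetical model of the poke: build the expected old instruction and the
 * new one, verify that the slot still holds the old one, then replace it
 * with a single 32-bit store. A concurrent reader of the slot sees either
 * the old or the new word, never a mix, but may still see the old word for
 * a while after this returns.
 */
static int model_poke(_Atomic uint32_t *site, uint64_t pc, int link,
		      uint64_t old_target, uint64_t new_target)
{
	uint32_t old_insn, new_insn;

	if (model_gen_branch_or_nop(link, pc, old_target, &old_insn) < 0)
		return -EFAULT;
	if (model_gen_branch_or_nop(link, pc, new_target, &new_insn) < 0)
		return -EFAULT;

	/* Assumption: the caller serializes writers, as text_mutex does. */
	if (atomic_load(site) != old_insn)
		return -EFAULT;

	atomic_store(site, new_insn);
	return 0;
}
```

With a slot that holds a NOP at address 0x1000, model_poke(&site, 0x1000, 1, 0, 0x1040) succeeds and leaves 0x94000010 (BL to 0x1040) in the slot. Repeating the same call then fails with -EFAULT, because the slot no longer holds the expected NOP. The model says nothing about freeing the old trampoline; that part still relies on the percpu_ref and RCU tasks handling in bpf_tramp_image_put().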

> Thanks,
> Mark.
> .
