Message-ID: <87ilqdobl1.fsf@cloudflare.com>
Date:   Tue, 10 May 2022 11:36:59 +0200
From:   Jakub Sitnicki <jakub@...udflare.com>
To:     Xu Kuohai <xukuohai@...wei.com>
Cc:     bpf@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        linux-kselftest@...r.kernel.org,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Zi Shen Lim <zlim.lnx@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        "David S . Miller" <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        hpa@...or.com, Shuah Khan <shuah@...nel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        Daniel Kiss <daniel.kiss@....com>,
        Steven Price <steven.price@....com>,
        Sudeep Holla <sudeep.holla@....com>,
        Marc Zyngier <maz@...nel.org>,
        Peter Collingbourne <pcc@...gle.com>,
        Mark Brown <broonie@...nel.org>,
        Delyan Kratunov <delyank@...com>,
        Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [PATCH bpf-next v3 5/7] bpf, arm64: Support to poke bpf prog

Thanks for incorporating the attach to BPF progs bits into the series.

I have a couple of minor comments. Please see below.

On Sun, Apr 24, 2022 at 11:40 AM -04, Xu Kuohai wrote:
> 1. Set up the bpf prog entry in the same way as fentry to support
>    trampoline. Now bpf prog entry looks like this:
>
>    bti c        // if BTI enabled
>    mov x9, x30  // save lr
>    nop          // to be replaced with jump instruction
>    paciasp      // if PAC enabled
>
> 2. Update bpf_arch_text_poke() to poke bpf prog. If the instruction
>    to be poked is bpf prog's first instruction, skip to the nop
>    instruction in the prog entry.
>
> Signed-off-by: Xu Kuohai <xukuohai@...wei.com>
> ---
>  arch/arm64/net/bpf_jit.h      |  1 +
>  arch/arm64/net/bpf_jit_comp.c | 41 +++++++++++++++++++++++++++--------
>  2 files changed, 33 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
> index 194c95ccc1cf..1c4b0075a3e2 100644
> --- a/arch/arm64/net/bpf_jit.h
> +++ b/arch/arm64/net/bpf_jit.h
> @@ -270,6 +270,7 @@
>  #define A64_BTI_C  A64_HINT(AARCH64_INSN_HINT_BTIC)
>  #define A64_BTI_J  A64_HINT(AARCH64_INSN_HINT_BTIJ)
>  #define A64_BTI_JC A64_HINT(AARCH64_INSN_HINT_BTIJC)
> +#define A64_NOP    A64_HINT(AARCH64_INSN_HINT_NOP)
>  
>  /* DMB */
>  #define A64_DMB_ISH aarch64_insn_gen_dmb(AARCH64_INSN_MB_ISH)
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index 3f9bdfec54c4..293bdefc5d0c 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -237,14 +237,23 @@ static bool is_lsi_offset(int offset, int scale)
>  	return true;
>  }
>  
> -/* Tail call offset to jump into */
> -#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) || \
> -	IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
> -#define PROLOGUE_OFFSET 9
> +#if IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)
> +#define BTI_INSNS	1
> +#else
> +#define BTI_INSNS	0
> +#endif
> +
> +#if IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)
> +#define PAC_INSNS	1
>  #else
> -#define PROLOGUE_OFFSET 8
> +#define PAC_INSNS	0
>  #endif

The above can be folded into:

#define BTI_INSNS (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) ? 1 : 0)
#define PAC_INSNS (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) ? 1 : 0)
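To see what the folded macros feed into, here is a minimal userspace sketch of the offset arithmetic from the patch. CFG_BTI_KERNEL and CFG_PTR_AUTH_KERNEL are hypothetical stand-ins for the IS_ENABLED(CONFIG_...) results, and prologue_offset() and poke_offset() are illustrative helpers that are not part of the patch:

```c
#include <assert.h>

/* Hypothetical stand-ins for IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) and
 * IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL); both modeled as enabled. */
#define CFG_BTI_KERNEL		1
#define CFG_PTR_AUTH_KERNEL	1

#define BTI_INSNS (CFG_BTI_KERNEL ? 1 : 0)
#define PAC_INSNS (CFG_PTR_AUTH_KERNEL ? 1 : 0)

/* Same expressions as in the patch hunk. */
#define PROLOGUE_OFFSET	(BTI_INSNS + 2 + PAC_INSNS + 8)
#define POKE_OFFSET	(BTI_INSNS + 1)

/* Illustrative helpers (not in the patch): the same arithmetic with the
 * config choices passed in at run time, to try other combinations. */
static int prologue_offset(int bti, int pac)
{
	return (bti ? 1 : 0) + 2 + (pac ? 1 : 0) + 8;
}

static int poke_offset(int bti)
{
	return (bti ? 1 : 0) + 1;
}
```

With both options modeled as enabled, the nop to be poked sits at instruction index 2 (after bti c and mov x9, x30); with BTI disabled it sits at index 1.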

>  
> +/* Tail call offset to jump into */
> +#define PROLOGUE_OFFSET	(BTI_INSNS + 2 + PAC_INSNS + 8)
> +/* Offset of nop instruction in bpf prog entry to be poked */
> +#define POKE_OFFSET	(BTI_INSNS + 1)
> +
>  static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>  {
>  	const struct bpf_prog *prog = ctx->prog;
> @@ -281,12 +290,15 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
>  	 *
>  	 */
>  
> +	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> +		emit(A64_BTI_C, ctx);

I'm no arm64 expert, but this looks like a fix for BTI.

Currently we never emit BTI because ARM64_BTI_KERNEL depends on
ARM64_PTR_AUTH_KERNEL, while BTI must be the first instruction for the
jump target [1]. Am I following correctly?

[1] https://lwn.net/Articles/804982/
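To make the question concrete, here is a rough userspace model of which instruction ends up first in the prologue before and after this patch. The helper names and the returned strings are illustrative only; this models the if/else structure in the diff and does not settle how the hardware treats each instruction:

```c
#include <assert.h>
#include <string.h>

/* Old logic (the lines removed by the patch): PAC is checked first and
 * BTI is only the else-branch. Since ARM64_BTI_KERNEL depends on
 * ARM64_PTR_AUTH_KERNEL, bti != 0 implies pac != 0 in a real config,
 * so the "bti c" branch is never reached. */
static const char *old_first_insn(int pac, int bti)
{
	if (pac)
		return "paciasp";
	else if (bti)
		return "bti c";
	return "(frame setup)";
}

/* New logic (the lines added by the patch): BTI is emitted first when
 * enabled, independent of PAC; otherwise the prologue starts with the
 * lr save. */
static const char *new_first_insn(int bti)
{
	if (bti)
		return "bti c";
	return "mov x9, x30";
}
```

In this model, a kernel with both options enabled starts the old prologue with paciasp and the new one with bti c.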

> +
> +	emit(A64_MOV(1, A64_R(9), A64_LR), ctx);
> +	emit(A64_NOP, ctx);
> +
>  	/* Sign lr */
>  	if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL))
>  		emit(A64_PACIASP, ctx);
> -	/* BTI landing pad */
> -	else if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> -		emit(A64_BTI_C, ctx);
>  
>  	/* Save FP and LR registers to stay align with ARM64 AAPCS */
>  	emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
> @@ -1552,9 +1564,11 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>  	u32 old_insn;
>  	u32 new_insn;
>  	u32 replaced;
> +	unsigned long offset = ~0UL;
>  	enum aarch64_insn_branch_type branch_type;
> +	char namebuf[KSYM_NAME_LEN];
>  
> -	if (!is_bpf_text_address((long)ip))
> +	if (!__bpf_address_lookup((unsigned long)ip, NULL, &offset, namebuf))
>  		/* Only poking bpf text is supported. Since kernel function
>  		 * entry is set up by ftrace, we reply on ftrace to poke kernel
>  		 * functions. For kernel funcitons, bpf_arch_text_poke() is only
> @@ -1565,6 +1579,15 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>  		 */
>  		return -EINVAL;
>  
> +	/* bpf entry */
> +	if (offset == 0UL)
> +		/* skip to the nop instruction in bpf prog entry:
> +		 * bti c	// if BTI enabled
> +		 * mov x9, x30
> +		 * nop
> +		 */
> +		ip = (u32 *)ip + POKE_OFFSET;

This is very much a personal preference, but I find the use of pointer
arithmetic too clever here. I would go for a more verbose:

        offset = POKE_OFFSET * AARCH64_INSN_SIZE;
        ip = (void *)((unsigned long)ip + offset);
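Both forms should compute the same address. A small userspace sketch to check that, assuming AARCH64_INSN_SIZE is 4 bytes and picking POKE_OFFSET = 2 (the BTI-enabled case) as an example value; skip_typed() and skip_bytes() are hypothetical helper names, and uintptr_t is used here where kernel code would use unsigned long:

```c
#include <assert.h>
#include <stdint.h>

#define AARCH64_INSN_SIZE	4	/* A64 instructions are 4 bytes wide */
#define POKE_OFFSET		2	/* example value: BTI enabled */

typedef uint32_t u32;

/* Form used in the patch: typed pointer arithmetic, where the compiler
 * scales the offset by sizeof(u32). */
static void *skip_typed(void *ip)
{
	return (u32 *)ip + POKE_OFFSET;
}

/* Suggested form: the byte offset is spelled out explicitly. */
static void *skip_bytes(void *ip)
{
	uintptr_t offset = POKE_OFFSET * AARCH64_INSN_SIZE;

	return (void *)((uintptr_t)ip + offset);
}
```

Given a buffer of instructions, both helpers return a pointer 8 bytes past the start, so the choice between them is about readability rather than behavior.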

> +
>  	if (poke_type == BPF_MOD_CALL)
>  		branch_type = AARCH64_INSN_BRANCH_LINK;
>  	else

I think it'd make more sense to merge this patch with patch 4 (the
preceding one).

The initial implementation of bpf_arch_text_poke() in patch 4 is not
fully functional, as it will always fail for bpf_arch_text_poke(ip,
BPF_MOD_CALL, ...) calls. At the least, I find it a bit confusing.

Other than that:

Reviewed-by: Jakub Sitnicki <jakub@...udflare.com>