Message-ID: <67d49ba7-876c-4cb6-a34d-772e4fd331a0@linux.ibm.com>
Date: Fri, 23 Jan 2026 18:15:50 +0530
From: Hari Bathini <hbathini@...ux.ibm.com>
To: adubey@...ux.ibm.com, bpf@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: sachinpb@...ux.ibm.com, venkat88@...ux.ibm.com, andrii@...nel.org,
eddyz87@...il.com, mykolal@...com, ast@...nel.org,
daniel@...earbox.net, martin.lau@...ux.dev, song@...nel.org,
yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org,
christophe.leroy@...roup.eu, naveen@...nel.org, maddy@...ux.ibm.com,
mpe@...erman.id.au, npiggin@...il.com, memxor@...il.com,
iii@...ux.ibm.com, shuah@...nel.org
Subject: Re: [PATCH v4 1/6] powerpc64/bpf: Moving tail_call_cnt to bottom of
 frame

On 23/01/26 2:48 am, adubey@...ux.ibm.com wrote:
> From: Abhishek Dubey <adubey@...ux.ibm.com>
>
> In the conventional stack frame, the position of tail_call_cnt
> is after the NVR save area (BPF_PPC_STACK_SAVE). Whereas, the
> offset of tail_call_cnt in the trampoline frame is after the
> stack alignment padding. BPF JIT logic could become complex
> when dealing with frame-sensitive offset calculation of
> tail_call_cnt. Having the same offset in both frames is the
> desired objective.
>
> The trampoline frame does not have a BPF_PPC_STACK_SAVE area.
> Introducing it leads to under-utilization of extra memory meant
> only for the offset alignment of tail_call_cnt.
> Another challenge is the variable alignment padding sitting at
> the bottom of the trampoline frame, which requires additional
> handling to compute tail_call_cnt offset.
>
> This patch addresses the above issues by moving tail_call_cnt
> to the bottom of the stack frame at offset 0 for both types
> of frames. This saves additional bytes required by BPF_PPC_STACK_SAVE
> in trampoline frame, and a common offset computation for
> tail_call_cnt serves both frames.
>
> The changes in this patch are required by the second patch in the
> series, where the 'reference to tail_call_info' of the main frame
> is copied into the trampoline frame from the previous frame.

The changelog needs to be simplified. Something like below:

To support tailcalls in subprogs, tail_call_cnt needs to be on the BPF
trampoline stack frame. In a regular BPF program or subprog stack
frame, the position of tail_call_cnt is after the NVR save area
(BPF_PPC_STACK_SAVE). To avoid complex logic in deducing the offset of
tail_call_cnt, it has to be kept at the same offset on the trampoline
frame as well. But doing that wastes nearly all of the
BPF_PPC_STACK_SAVE bytes on the BPF trampoline stack frame, as the NVR
save area is not the same for BPF trampolines and regular BPF programs.
Address this by moving tail_call_cnt to the bottom of the frame.

This avoids the need to account for BPF_PPC_STACK_SAVE bytes in the
BPF trampoline stack frame when support for tailcalls in BPF subprogs
is added later. It also makes the offset calculation for the
tail_call_cnt field simpler across the board.
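
For context, with tail_call_cnt at a fixed, small offset from the
frame's stack pointer, the offset helper can in principle collapse to
something like the sketch below. This is only an illustration of the
idea: it assumes the helper bpf_jit_stack_tailcallcnt() (referenced in
the quoted hunks) ends up returning a frame-type-independent offset,
and it is not the actual change from the patch:

	/*
	 * Sketch only: tail_call_cnt sits a fixed BPF_PPC_TAILCALL
	 * bytes from the frame's stack pointer, so the offset no
	 * longer depends on the frame type or the size of its save
	 * areas.
	 */
	static int bpf_jit_stack_tailcallcnt(struct codegen_context *ctx)
	{
		return -BPF_PPC_TAILCALL;
	}

Either frame would then access the counter with a single fixed
displacement from _R1, much like the -tailcallcnt_offset usage in the
hunks above.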
>
> Signed-off-by: Abhishek Dubey <adubey@...ux.ibm.com>
> ---
> arch/powerpc/net/bpf_jit.h | 1 +
> arch/powerpc/net/bpf_jit_comp.c | 15 ++++++++++++---
> arch/powerpc/net/bpf_jit_comp64.c | 31 ++++++++++++++++++++-----------
> 3 files changed, 33 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 8334cd667bba..9f6ec00bd02e 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -24,6 +24,7 @@
>
> #define SZL sizeof(unsigned long)
> #define BPF_INSN_SAFETY 64
> +#define BPF_PPC_TAILCALL 8
>
> #define PLANT_INSTR(d, idx, instr) \
> do { if (d) { (d)[idx] = instr; } idx++; } while (0)
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index 5e976730b2f5..d51c696221d7 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -604,8 +604,8 @@ static void bpf_trampoline_setup_tail_call_cnt(u32 *image, struct codegen_contex
> int func_frame_offset, int r4_off)
> {
> if (IS_ENABLED(CONFIG_PPC64)) {
> - /* See bpf_jit_stack_tailcallcnt() */
> - int tailcallcnt_offset = 7 * 8;
> + /* See Generated stack layout */
> + int tailcallcnt_offset = BPF_PPC_TAILCALL;
>
> EMIT(PPC_RAW_LL(_R3, _R1, func_frame_offset - tailcallcnt_offset));
> EMIT(PPC_RAW_STL(_R3, _R1, -tailcallcnt_offset));
> @@ -620,7 +620,7 @@ static void bpf_trampoline_restore_tail_call_cnt(u32 *image, struct codegen_cont
> {
> if (IS_ENABLED(CONFIG_PPC64)) {
> /* See bpf_jit_stack_tailcallcnt() */
> - int tailcallcnt_offset = 7 * 8;
> + int tailcallcnt_offset = BPF_PPC_TAILCALL;
>
> EMIT(PPC_RAW_LL(_R3, _R1, -tailcallcnt_offset));
> EMIT(PPC_RAW_STL(_R3, _R1, func_frame_offset - tailcallcnt_offset));
> @@ -714,6 +714,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> * LR save area [ r0 save (64-bit) ] | header
> * [ r0 save (32-bit) ] |
> * dummy frame for unwind [ back chain 1 ] --
> + * [ tail_call_cnt ] optional - 64-bit powerpc
> * [ padding ] align stack frame
> * r4_off [ r4 (tailcallcnt) ] optional - 32-bit powerpc
> * alt_lr_off [ real lr (ool stub)] optional - actual lr
> @@ -795,6 +796,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> }
> }
>
> + /*
> + * Save tailcall count pointer at the same offset on the
> + * stack where subprogs expect it
> + */
> + if ((flags & BPF_TRAMP_F_CALL_ORIG) &&
> + (flags & BPF_TRAMP_F_TAIL_CALL_CTX))
> + bpf_frame_size += BPF_PPC_TAILCALL;
> +

The above hunk is only relevant in the next patch, where support for
tailcalls in subprogs is added. Drop it here and move it to patch #2.

- Hari