lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ca490974-0c5c-cfe9-0c6f-3ead163e7a7b@meta.com>
Date:   Sun, 18 Jun 2023 16:10:56 -0700
From:   Yonghong Song <yhs@...a.com>
To:     menglong8.dong@...il.com, alexei.starovoitov@...il.com
Cc:     ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
        martin.lau@...ux.dev, song@...nel.org, yhs@...com,
        john.fastabend@...il.com, kpsingh@...nel.org, sdf@...gle.com,
        haoluo@...gle.com, jolsa@...nel.org, benbjiang@...cent.com,
        bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
        Menglong Dong <imagedong@...cent.com>
Subject: Re: [PATCH bpf-next v5 2/3] bpf, x86: allow function arguments up to
 12 for TRACING



On 6/12/23 7:52 PM, menglong8.dong@...il.com wrote:
> From: Menglong Dong <imagedong@...cent.com>
> 
> For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
> on the kernel functions whose arguments count less than 6. This is not
> friendly at all, as too many functions have arguments count more than 6.
> 
> According to the current kernel version, below is a statistics of the
> function arguments count:
> 
> argument count | function count
> 7              | 704
> 8              | 270
> 9              | 84
> 10             | 47
> 11             | 47
> 12             | 27
> 13             | 22
> 14             | 5
> 15             | 0
> 16             | 1
> 
> Therefore, let's enhance it by increasing the function arguments count
> allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.
> 
> For the case that we don't need to call origin function, which means
> without BPF_TRAMP_F_CALL_ORIG, we need only copy the function arguments
> that stored in the frame of the caller to current frame. The arguments
> of arg6-argN are stored in "$rbp + 0x18", we need copy them to
> "$rbp - regs_off + (6 * 8)".
> 
> For the case with BPF_TRAMP_F_CALL_ORIG, we need prepare the arguments
> in stack before call origin function, which means we need alloc extra
> "8 * (arg_count - 6)" memory in the top of the stack. Note, there should
> not be any data be pushed to the stack before call the origin function.
> Then, we have to store rbx with 'mov' instead of 'push'.

Please also mention special case related to 16-byte struct argument
in the comments of save_args().

> 
> We use EMIT3_off32() or EMIT4() for "lea" and "sub". The range of the
> imm in "lea" and "sub" is [-128, 127] if EMIT4() is used. Therefore,
> we use EMIT3_off32() instead if the imm out of the range.
> 
> It works well for the FENTRY/FEXIT/MODIFY_RETURN.
> 
> Signed-off-by: Menglong Dong <imagedong@...cent.com>
> ---
> v5:
> - consider the case of the struct in arguments can't be hold by regs
> v4:
> - make the stack 16-byte aligned if passing args on-stack is needed
> - add the function arguments statistics to the commit log
> v3:
> - use EMIT3_off32() for "lea" and "sub" only on necessary
> - make 12 as the maximum arguments count
> v2:
> - instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow
> - make MAX_BPF_FUNC_ARGS as the maximum argument count
> ---
>   arch/x86/net/bpf_jit_comp.c | 221 +++++++++++++++++++++++++++++++-----
>   1 file changed, 195 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index a407fbbffecd..47c699594dd8 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1857,37 +1857,165 @@ st:			if (is_imm8(insn->off))
>   	return proglen;
>   }
>   
> -static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> -		      int stack_size)
> +static inline void clean_stack_garbage(const struct btf_func_model *m,
> +				       u8 **pprog, int nr_args_on_stack,

nr_args_on_stack is actually nr_stack_slots, right? Maybe rename to
nr_stack_slots?

> +				       int stack_size)
>   {
> -	int i;
> +	int arg_size, off;
> +	u8 *prog;
> +
> +	if (nr_args_on_stack != 1)
> +		return;
> +
> +	/* the size of the last argument */
> +	arg_size = m->arg_size[m->nr_args - 1];
> +
> +	/* Generally speaking, the compiler will pass the arguments
> +	 * on-stack with "push" instruction, which will take 8-byte
> +	 * on the stack. On this case, there won't be garbage values

On this case -> In this case. The same for below another case.

> +	 * while we copy the arguments from origin stack frame to current
> +	 * in BPF_DW.
> +	 *
> +	 * However, sometimes the compiler will only allocate 4-byte on
> +	 * the stack for the arguments. For now, this case will only
> +	 * happen if there is only one argument on-stack and its size
> +	 * not more than 4 byte. On this case, there will be garbage
> +	 * values on the upper 4-byte where we store the argument on
> +	 * current stack frame.
> +	 *
> +	 * arguments on origin stack:
> +	 *
> +	 * stack_arg_1(4-byte) xxx(4-byte)
> +	 *
> +	 * what we copy:
> +	 *
> +	 * stack_arg_1(8-byte): stack_arg_1(origin) xxx
> +	 *
> +	 * and the xxx is the garbage values which we should clean here.
> +	 */

let us put the above comments before
 > +	if (nr_args_on_stack != 1)
 > +		return;


> +	if (arg_size <= 4) {
> +		off = -(stack_size - 4);
> +		prog = *pprog;
> +		/* mov DWORD PTR [rbp + off], 0 */
> +		if (!is_imm8(off))
> +			EMIT2_off32(0xC7, 0x85, off);
> +		else
> +			EMIT3(0xC7, 0x45, off);
> +		EMIT(0, 4);
> +		*pprog = prog;
> +	}
> +}
> +
> +static void save_args(const struct btf_func_model *m, u8 **prog,
> +		      int stack_size, bool on_stack)

Rename 'on_stack' to 'for_call_origin'? This should be more
clear about the use case.

> +{
> +	int arg_regs, first_off, nr_regs = 0, nr_stack = 0;
nr_stack -> nr_stack_slots?
> +	int i, j;
>   
>   	/* Store function arguments to stack.
>   	 * For a function that accepts two pointers the sequence will be:
>   	 * mov QWORD PTR [rbp-0x10],rdi
>   	 * mov QWORD PTR [rbp-0x8],rsi
>   	 */
> -	for (i = 0; i < min(nr_regs, 6); i++)
> -		emit_stx(prog, BPF_DW, BPF_REG_FP,
> -			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -			 -(stack_size - i * 8));
> +	for (i = 0; i < min_t(int, m->nr_args, MAX_BPF_FUNC_ARGS); i++) {
> +		arg_regs = (m->arg_size[i] + 7) / 8;
> +
> +		/* According to the research of Yonghong, struct members
> +		 * should be all in register or all on the stack.
> +		 * Meanwhile, the compiler will pass the argument on regs
> +		 * if the remained regs can hold the argument.
remained -> remaining
> +		 *
> +		 * Disorder of the args can happen. For example:
> +		 *
> +		 * struct foo_struct {
> +		 *     long a;
> +		 *     int b;
> +		 * };
> +		 * int foo(char, char, char, char, char, struct foo_struct,
> +		 *         char);
> +		 *
> +		 * the arg1-5,arg7 will be passed by regs, and arg6 will
> +		 * by stack.
> +		 *
> +		 * Therefore, we should keep the same logic as here when
> +		 * we restore the regs in restore_regs.
> +		 */
> +		if (nr_regs + arg_regs > 6) {
> +			/* copy function arguments from origin stack frame
> +			 * into current stack frame.
> +			 *
> +			 * The starting address of the arguments on-stack
> +			 * is:
> +			 *   rbp + 8(push rbp) +
> +			 *   8(return addr of origin call) +
> +			 *   8(return addr of the caller)
> +			 * which means: rbp + 24
> +			 */
> +			for (j = 0; j < arg_regs; j++) {
> +				emit_ldx(prog, BPF_DW, BPF_REG_0, BPF_REG_FP,
> +					 nr_stack * 8 + 0x18);
> +				emit_stx(prog, BPF_DW, BPF_REG_FP, BPF_REG_0,
> +					 -stack_size);
> +
> +				if (!nr_stack)
> +					first_off = stack_size;
> +				stack_size -= 8;
> +				nr_stack++;
> +			}
> +		} else {
> +			/* Only copy the arguments on-stack to current
> +			 * 'stack_size' and ignore the regs, used to
> +			 * prepare the arguments on-stack for orign call.
> +			 */
> +			if (on_stack) {
> +				nr_regs += arg_regs;
> +				continue;
> +			}
> +
> +			/* copy the arguments from regs into stack */
> +			for (j = 0; j < arg_regs; j++) {
> +				emit_stx(prog, BPF_DW, BPF_REG_FP,
> +					 nr_regs == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> +					 -stack_size);
> +				stack_size -= 8;
> +				nr_regs++;
> +			}
> +		}
> +	}
> +
> +	clean_stack_garbage(m, prog, nr_stack, first_off);
>   }
>   
[...]
>   	/* Generated trampoline stack layout:
> @@ -2147,7 +2286,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   	 *
>   	 * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
>   	 *
> +	 * RBP - rbx_off   [ rbx value       ]  always
> +	 *
>   	 * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
> +	 *
> +	 *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
> +	 *                     [ ...        ]
> +	 *                     [ stack_arg2 ]
> +	 * RBP - arg_stack_off [ stack_arg1 ]
>   	 */
>   
>   	/* room for return value of orig_call or fentry prog */
> @@ -2167,9 +2313,25 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   
>   	ip_off = stack_size;
>   
> +	stack_size += 8;
> +	rbx_off = stack_size;
> +
>   	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
>   	run_ctx_off = stack_size;
>   
> +	if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG)) {
> +		stack_size += (nr_regs - 6) * 8;

Please double check. Is this okay for the case below?
   foo(int, int, int, int, int, 16_byte_struct)
here, nr_regs is 7, yes, to-be-increased stack size should be 2.


> +		/* make sure the stack pointer is 16-byte aligned if we
> +		 * need pass arguments on stack, which means
> +		 *  [stack_size + 8(rbp) + 8(rip) + 8(origin rip)]
> +		 * should be 16-byte aligned. Following code depend on
> +		 * that stack_size is already 8-byte aligned.
> +		 */
> +		stack_size += (stack_size % 16) ? 0 : 8;
> +	}
> +
> +	arg_stack_off = stack_size;
> +
[...]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ