linux-kernel - Re: [PATCH RFC bpf-next 3/4] bpf/selftests: add tests to validate proper arguments alignment on ARM64

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <D9I6TQN2I6B1.QC4FFHEWAURZ@bootlin.com>
Date: Mon, 28 Apr 2025 12:08:32 +0200
From: Alexis Lothoré <alexis.lothore@...tlin.com>
To: "Eduard Zingerman" <eddyz87@...il.com>, "Alexei Starovoitov"
 <ast@...nel.org>, "Daniel Borkmann" <daniel@...earbox.net>, "John
 Fastabend" <john.fastabend@...il.com>, "Andrii Nakryiko"
 <andrii@...nel.org>, "Martin KaFai Lau" <martin.lau@...ux.dev>, "Song Liu"
 <song@...nel.org>, "Yonghong Song" <yonghong.song@...ux.dev>, "KP Singh"
 <kpsingh@...nel.org>, "Stanislav Fomichev" <sdf@...ichev.me>, "Hao Luo"
 <haoluo@...gle.com>, "Jiri Olsa" <jolsa@...nel.org>, "Puranjay Mohan"
 <puranjay@...nel.org>, "Xu Kuohai" <xukuohai@...weicloud.com>, "Catalin
 Marinas" <catalin.marinas@....com>, "Will Deacon" <will@...nel.org>,
 "Mykola Lysenko" <mykolal@...com>, "Shuah Khan" <shuah@...nel.org>, "Maxime
 Coquelin" <mcoquelin.stm32@...il.com>, "Alexandre Torgue"
 <alexandre.torgue@...s.st.com>, "Florent Revest" <revest@...omium.org>
Cc: "Bastien Curutchet" <bastien.curutchet@...tlin.com>,
 <ebpf@...uxfoundation.org>, "Thomas Petazzoni"
 <thomas.petazzoni@...tlin.com>, <bpf@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
 <linux-kselftest@...r.kernel.org>,
 <linux-stm32@...md-mailman.stormreply.com>
Subject: Re: [PATCH RFC bpf-next 3/4] bpf/selftests: add tests to validate
 proper arguments alignment on ARM64

Hello Eduard,

On Mon Apr 28, 2025 at 9:01 AM CEST, Eduard Zingerman wrote:
> On Fri, 2025-04-11 at 22:32 +0200, Alexis Lothoré (eBPF Foundation) wrote:
>> When dealing with large types (>8 bytes), ARM64 trampolines need to take
>> extra care about the arguments alignment to respect the calling
>> convention set by AAPCS64.
>> 
>> Add two tests ensuring that the BPF trampoline arranges arguments with
>> the relevant layout. The two new tests involve almost the same
>> arguments, except that the second one requires a more specific alignment
>> to be set by the trampoline when preparing arguments before calling the
>> the target function.
>> 
>> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@...tlin.com>
>> ---
>
> [...]
>
>> +SEC("fentry/bpf_testmod_test_struct_arg_11")
>> +int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a,
>> +	      struct bpf_testmod_struct_arg_5, b,
>> +	      struct bpf_testmod_struct_arg_5, c,
>> +	      struct bpf_testmod_struct_arg_5, d, int, e,
>> +	      struct bpf_testmod_struct_arg_5, f)
>
> Hello Alexis,
>
> I'm trying to double check the error you've seen for x86.

Thanks for taking a look at this.

> I see that tracing_struct/struct_many_args fails with assertion:
> "test_struct_many_args:FAIL:t11:f unexpected t11:f: actual 35 != expected 43".
> Could you please help me understand this test?

Sure. When we attach fentry/fexit programs to a kernel function,  a small
trampoline is generated to handle the attached programs execution. This
trampoline has to:
1. properly read and aggregate the functions arguments into a bpf tracing
context. Those arguments comes from registers, and possibly from the stack
if not all function arguments fit in registers
2. if the trampoline is in charge of calling the target function (that's
the case for example when we have both a fentry and a fexit programs
attached), it must prepare the arguments for the target function, by
filling again the registers, and possibly pushing the additional arguments
on the stack at the relevant offset.

This test is about validating the first point: ensuring that the trampoline
has correctly read the initial arguments from the target function and
forwarded them to the fentry program. So the actual test triggers the
targeted function, it triggers the attached fentry program which record the
values passed by the trampoline in t11_a, t11_b, etc, and then the
test checks back that those "spied" values are the one that it has actually
passed when executing the target function.

> The function listened to is defined as accepting 'struct bpf_testmod_struct_arg_7',
> at the same time this function uses 'struct bpf_testmod_struct_arg_5'.

That's not an accidental mistake, those are in fact the same definition.
bpf_testmod_struct_arg_7 is the kernel side definition in bpf_testmod.c:

struct bpf_testmod_struct_arg_7 {
	__int128 a;
};

and struct bpf_testmode_struct_arg_5 is the one defined in the bpf test
program:

struct bpf_testmod_struct_arg_5 {
	__int128 a;
};

I agree while being correct, this is confusing :/ I have kind of blindly
applied the logic I have observed in here for declaring test structs, and
so declared it twice (in kernel part and in bpf part), just incrementing
the index of the last defined structure. But I guess it could benefit from
some index syncing, or better, some refactoring to properly share those
declarations. I'll think about it for a new rev.

> Nevertheless, the assertion persists even with correct types.

So I digged a bit further to better share my observations here. This is the
function stack when entering the trampoline after having triggered the
target function execution:

(gdb) x/64b $rbp+0x18
0xffffc9000015fd60:     41      0       0       0       0       0       0       0
0xffffc9000015fd68:     0       0       0       0       0       0       0       0
0xffffc9000015fd70:     42      0       0       0       0       0       0       0
0xffffc9000015fd78:     35      0       0       0       0       0       0       0
0xffffc9000015fd80:     43      0       0       0       0       0       0       0
0xffffc9000015fd88:     0       0       0       0       0       0       0       0

We see the arguments that did not fit in registers, so d, e and f.

This is the ebpf context generated by the trampoline for the fentry
program, from the content of the stack above + the registers:

(gdb) x/128b $rbp-60
0xffffc9000015fce8:     38      0       0       0       0       0       0       0
0xffffc9000015fcf0:     0       0       0       0       0       0       0       0
0xffffc9000015fcf8:     39      0       0       0       0       0       0       0
0xffffc9000015fd00:     0       0       0       0       0       0       0       0
0xffffc9000015fd08:     40      0       0       0       0       0       0       0
0xffffc9000015fd10:     0       0       0       0       0       0       0       0
0xffffc9000015fd18:     41      0       0       0       0       0       0       0
0xffffc9000015fd20:     0       0       0       0       0       0       0       0
0xffffc9000015fd28:     42      0       0       0       0       0       0       0
0xffffc9000015fd30:     35      0       0       0       0       0       0       0
0xffffc9000015fd38:     43      0       0       0       0       0       0       0
0xffffc9000015fd40:     37      0       0       0       0       0       0       0

So IIUC, this is wrong because the "e" variable in the bpf program being
an int (and about to receive value 42), it occupies only 1 "tracing context
8-byte slot", so the value 43 (representing the content for variable f),
should be right after it, at 0xffffc9000015fd30. What we have instead is a
hole, very likely because we copied silently the alignment from the
original function call (and I guess this 35 value is a remnant from the
previous test, which uses values from 27 to 37)

Regardless of this issue, based on discussion from last week, I think I'll
go for the implementation suggested by Alexei: handling the nominal cases,
and detecting and blocking the non trivial cases (eg: structs passed on
stack). It sounds reasonable as there seems to be no exisiting kernel
function currently able to trigger those very specific cases, so it could
be added later if this changes.

Alexis
-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com