Message-ID: <5d776261-338b-4ebb-bb9b-1dbc91cd06c3@huawei.com>
Date: Tue, 30 Jan 2024 17:14:13 +0800
From: Pu Lehui <pulehui@...wei.com>
To: Björn Töpel <bjorn@...nel.org>, Pu Lehui
	<pulehui@...weicloud.com>, <bpf@...r.kernel.org>,
	<linux-riscv@...ts.infradead.org>, <netdev@...r.kernel.org>
CC: Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
	<daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
	<martin.lau@...ux.dev>, Song Liu <song@...nel.org>, Yonghong Song
	<yhs@...com>, John Fastabend <john.fastabend@...il.com>, KP Singh
	<kpsingh@...nel.org>, Stanislav Fomichev <sdf@...gle.com>, Hao Luo
	<haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>, Palmer Dabbelt
	<palmer@...belt.com>, Conor Dooley <conor@...nel.org>, Luke Nelson
	<luke.r.nels@...il.com>
Subject: Re: [PATCH bpf-next 4/4] riscv, bpf: Mixing bpf2bpf and tailcalls



On 2024/1/30 16:29, Björn Töpel wrote:
> Pu Lehui <pulehui@...weicloud.com> writes:
> 
>> On 2023/9/28 17:59, Björn Töpel wrote:
>>> Pu Lehui <pulehui@...weicloud.com> writes:
>>>
>>>> From: Pu Lehui <pulehui@...wei.com>
>>>>
>>>> In the current RV64 JIT, if we just don't initialize the TCC in subprog,
>>>> the TCC can be propagated from the parent process to the subprocess, but
>>>> the TCC of the parent process cannot be restored when the subprocess
>>>> exits. Since the RV64 TCC is initialized before saving the callee saved
>>>> registers into the stack, we cannot use the callee saved register to
>>>> pass the TCC, otherwise the original value of the callee saved register
>>>> will be destroyed. So we implemented mixing bpf2bpf and tailcalls
>>>> similar to x86_64, i.e. using a non-callee saved register to transfer
>>>> the TCC between functions, and saving that register to the stack to
>>>> protect the TCC value. At the same time, we also consider the scenario
>>>> of mixing trampoline.
>>>
>>> Hi!
>>>
>>> The RISC-V JIT tries to minimize the stack usage, e.g. it doesn't have a
>>> fixed pro/epilogue like some of the other JITs. I think we can do better
>>> here, so that the pass-TCC-via-register can be used, and the additional
>>> stack access can be avoided.
>>>
>>> Today, the TCC is passed via a register (a6) and can be viewed as a
>>> "state" variable/transparent argument/return value. As you point out, we
>>> lose this when we do a call. On (any) calls we move the TCC to a
>>> callee-saved register.
>>>
>>> WDYT about the following scheme:
>>>
>>> 1 Pick up the arm64 bpf2bpf/tailmix mechanism of just clearing the TCC
>>>     for the main program.
>>> 2 For BPF helper calls, move TCC to s6, perform the call, and restore
>>>     a6. Ditto for kfunc calls (BPF_PSEUDO_KFUNC_CALL).
>>> 3 For all other calls, a6 is passed transparently.
>>>
>>> For 2 bpf_jit_get_func_addr() can be used to determine if the callee is
>>> a BPF helper or not.
>>>
>>> In summary; Determine in the JIT if we're leaving BPF-land, and need to
>>> move the TCC to a callee-saved reg, or not, and save us a bunch of stack
>>> store/loads.
>>>
>>
>> Valuable scheme. But we need to consider TCC back propagation. Let me
>> show an example of calling subprog with TCC stored in A6:
>>
>> prog1(TCC==1){
>>       subprog1(TCC==1)
>>           -> tailcall1(TCC==0)
>>               -> subprog2(TCC==0)
>>       subprog3(TCC==0) <--- should be TCC==1
>>           -\-> tailcall2 <--- can't be called
>> }

Let's go back to this example. Imagine that the tailcall chain is a
list limited to 33 elements. When the list already has 32 elements, we
call subprog1 and then tailcall1, and the element count becomes 33.
Then we call subprog2 and return to prog1. At that point one element is
removed and the list is back to 32 elements, so 1 more tailcall can
still be performed.

I've attached a diagram showing that, when tailcalls and subprogs are
mixed, a tailcall behaves almost like a "call": it can return to the
caller function.
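
To make the two readings of the prog1 example concrete, here is a toy
Python model (not JIT code). It replays the call sequence from the
example under an assumed rule for what happens to the TCC when a
bpf2bpf call returns: either the caller's value is restored, or the
counter stays as the callee left it. The names replay and PROG1_TRACE
and the event encoding are made up for illustration only.

```python
# Toy model of the prog1 example; all names here are hypothetical.
# "call"     = bpf2bpf call into a subprog (caller's TCC is remembered)
# "ret"      = return to the most recent bpf2bpf caller
# "tailcall" = tail call attempt; consumes one unit of TCC if allowed

PROG1_TRACE = [
    "call",      # prog1 -> subprog1
    "tailcall",  # subprog1 -> tailcall1
    "call",      # tailcall1 -> subprog2
    "ret",       # subprog2 returns to tailcall1
    "ret",       # tailcall1 returns to prog1 (it replaced subprog1's frame)
    "call",      # prog1 -> subprog3
    "tailcall",  # subprog3 -> tailcall2
    "ret",       # back to prog1
]


def replay(events, tcc, restore_on_return):
    """Return one boolean per tail call attempt: True if it was allowed."""
    saved = []    # caller TCC values, one per active bpf2bpf call
    allowed = []
    for ev in events:
        if ev == "call":
            saved.append(tcc)
        elif ev == "ret":
            caller_tcc = saved.pop()
            if restore_on_return:
                tcc = caller_tcc  # assumption: caller gets its own TCC back
        elif ev == "tailcall":
            ok = tcc > 0
            allowed.append(ok)
            if ok:
                tcc -= 1
        else:
            raise ValueError(f"unknown event: {ev}")
    return allowed


print(replay(PROG1_TRACE, tcc=1, restore_on_return=True))   # [True, True]
print(replay(PROG1_TRACE, tcc=1, restore_on_return=False))  # [True, False]
```

With the caller's TCC restored on return, tailcall2 is permitted. If the
counter is simply left as the callee modified it, tailcall2 is refused,
which is the case marked "can't be called" in the example.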

>>
>> We call prog1 and TCC is 1. prog1 has two subprogs, subprog1 and
>> subprog3. subprog1 calls tailcall1 and TCC becomes 0. tailcall1 calls
>> subprog2 and then returns to prog1 with TCC still 0. At this time,
>> subprog3 cannot call tailcall2 because TCC is 0. But TCC should be 1 here.
> 
> Huh, I'm not following, and I don't see the issue. Help me out! You're
> only allowed to do X tail calls "globally" for a BPF context, right? So
> in the example you're outlining above, tailcall2 shouldn't be allowed to
> be called.
> 
> 
> Björn
Download attachment "bpf2bpf&tailcall.png" of type "image/png" (116596 bytes)