Date:   Thu, 08 Feb 2018 23:29:30 +0530
From:   "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>
To:     Alexei Starovoitov <ast@...com>, daniel@...earbox.net,
        Sandipan Das <sandipan@...ux.vnet.ibm.com>
Cc:     linuxppc-dev@...ts.ozlabs.org, mpe@...erman.id.au,
        netdev@...r.kernel.org
Subject: Re: [RFC][PATCH bpf 1/2] bpf: allow 64-bit offsets for bpf function
 calls

Alexei Starovoitov wrote:
> On 2/8/18 4:03 AM, Sandipan Das wrote:
>> The imm field of a bpf_insn is a signed 32-bit integer. For
>> JIT-ed bpf-to-bpf function calls, it stores the offset from
>> __bpf_call_base to the start of the callee function.
>>
>> For some architectures, such as powerpc64, it was found that
>> this offset may be as large as 64 bits because of which this
>> cannot be accommodated in the imm field without truncation.
>>
>> To resolve this, we additionally use the aux data within each
>> bpf_prog associated with the caller functions to store the
>> addresses of their respective callees.
>>
>> Signed-off-by: Sandipan Das <sandipan@...ux.vnet.ibm.com>
>> ---
>>  kernel/bpf/verifier.c | 39 ++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 5fb69a85d967..52088b4ca02f 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -5282,6 +5282,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>  	 * run last pass of JIT
>>  	 */
>>  	for (i = 0; i <= env->subprog_cnt; i++) {
>> +		u32 flen = func[i]->len, callee_cnt = 0;
>> +		struct bpf_prog **callee;
>> +
>> +		/* for now assume that the maximum number of bpf function
>> +		 * calls that can be made by a caller must be at most the
>> +		 * number of bpf instructions in that function
>> +		 */
>> +		callee = kzalloc(sizeof(func[i]) * flen, GFP_KERNEL);
>> +		if (!callee) {
>> +			err = -ENOMEM;
>> +			goto out_free;
>> +		}
>> +
>>  		insn = func[i]->insnsi;
>>  		for (j = 0; j < func[i]->len; j++, insn++) {
>>  			if (insn->code != (BPF_JMP | BPF_CALL) ||
>> @@ -5292,6 +5305,26 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>  			insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
>>  				func[subprog]->bpf_func -
>>  				__bpf_call_base;
>> +
>> +			/* the offset to the callee from __bpf_call_base
>> +			 * may be larger than what the 32 bit integer imm
>> +			 * can accommodate which will truncate the higher
>> +			 * order bits
>> +			 *
>> +			 * to avoid this, we additionally utilize the aux
>> +			 * data of each caller function for storing the
>> +			 * addresses of every callee associated with it
>> +			 */
>> +			callee[callee_cnt++] = func[subprog];
> 
> can you share typical /proc/kallsyms ?
> Are you saying that kernel and kernel modules are allocated from
> address spaces that are always more than 32-bit apart?

Yes. On ppc64, kernel text is linearly mapped from 0xc000000000000000,
while the vmalloc'ed area starts from 0xd000000000000000 (for radix, the
addresses are different, but still more than a 32-bit offset apart).
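
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code) of what happens to such an offset when it is stored in a signed
32-bit field like imm. The two constants are just the region bases mentioned
above, used as stand-ins for real symbol addresses; the helper names
call_offset(), fits_in_imm() and low32() are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only: the ppc64 region bases quoted in this mail.
 * Real callee and __bpf_call_base addresses would lie somewhere above
 * these bases, but the distance between them is of the same order. */
#define KERNEL_TEXT_BASE 0xc000000000000000ULL
#define VMALLOC_BASE     0xd000000000000000ULL

/* Full 64-bit distance from a base address to a callee address. */
static int64_t call_offset(uint64_t callee, uint64_t base)
{
	return (int64_t)(callee - base);
}

/* True if the offset can be stored in a signed 32-bit field, such as
 * the imm member of struct bpf_insn, without losing information. */
static bool fits_in_imm(int64_t off)
{
	return off >= INT32_MIN && off <= INT32_MAX;
}

/* The low 32 bits that would remain after truncation. The conversion
 * goes through unsigned types so that it is well defined in C. */
static uint32_t low32(int64_t off)
{
	return (uint32_t)(uint64_t)off;
}
```

With these stand-in values, call_offset(VMALLOC_BASE, KERNEL_TEXT_BASE) is
0x1000000000000000, fits_in_imm() rejects it, and low32() returns 0, so the
truncated value carries no usable information about the callee.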

> That would mean that all kernel calls into modules are far calls
> and the other way around from .ko into kernel?
> Performance is probably suffering because every call needs to be built
> with full 64-bit offset. No ?

Possibly, and I think Michael can give a better perspective, but I think
this is due to our ABI. For inter-module calls, we need to set up the TOC
pointer (or the address of the function being called with ABIv2), which
would require us to load a full address regardless.

- Naveen