Message-ID: <754bf668-2151-5a5c-c03d-b97c7da012a5@iogearbox.net>
Date: Sun, 28 Jan 2018 21:51:37 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>, ast@...nel.org
Cc: netdev@...r.kernel.org, Sandipan Das <sandipan@...ux.vnet.ibm.com>
Subject: Re: [PATCH bpf-next 01/13] bpf: xor of a/x in cbpf can be done in 32 bit alu

On 01/28/2018 07:58 PM, Naveen N. Rao wrote:
> Daniel Borkmann wrote:
>> Very minor optimization; saves 1 byte per program in x86_64
>> JIT in cBPF prologue.
>
> ... but increases program size by 4 bytes on ppc64 :(
> In general, this is an area I've been wanting to spend some time on. Powerpc doesn't have 32-bit sub-registers, so we need to emit an additional instruction to clear the higher 32-bits for all 32-bit operations. I need to look at the performance impact.

Right, I think one way to optimize this could be at the JIT level for
the case where the CPU doesn't have subregs. There is the
bpf_prog_was_classic() helper that can be used there in order to skip
some of the bpf_alu32_trunc goto cases, e.g. for some of the bit ops,
since we know that the upper part in cBPF must be zero here anyway.
This should definitely be low-hanging fruit, given we use alu32 in the
cBPF to eBPF conversion in a lot of places.
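
To illustrate the idea, here is a rough standalone sketch, not the actual
ppc64 JIT code. The names alu32_op, alu64() and alu32_needs_trunc() are
hypothetical, and was_classic stands in for the result of
bpf_prog_was_classic(). It assumes both operands reach the op already
zero-extended, which the JIT would have to guarantee (e.g. for
immediates) before skipping the truncation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical op classes; this is not the kernel's BPF opcode encoding. */
enum alu32_op { OP_AND, OP_OR, OP_XOR, OP_ADD, OP_SUB, OP_LSH };

/* Model of a CPU without 32-bit subregisters: every op is done in 64 bits. */
static uint64_t alu64(enum alu32_op op, uint64_t a, uint64_t b)
{
	switch (op) {
	case OP_AND: return a & b;
	case OP_OR:  return a | b;
	case OP_XOR: return a ^ b;
	case OP_ADD: return a + b;
	case OP_SUB: return a - b;
	case OP_LSH: return a << (b & 63);
	}
	return 0;
}

/*
 * Decide whether the JIT must emit an extra instruction to clear bits
 * 63..32 after a 32-bit ALU op.
 * Assumption: for a program converted from cBPF, both inputs already
 * have a zero upper half.
 */
static bool alu32_needs_trunc(bool was_classic, enum alu32_op op)
{
	if (!was_classic)
		return true;	/* native eBPF: upper bits may hold anything */

	switch (op) {
	case OP_AND:
	case OP_OR:
	case OP_XOR:
		/* bitwise ops on zero-extended inputs stay zero-extended */
		return false;
	default:
		/* add/sub/shift can carry, borrow or shift into the upper half */
		return true;
	}
}
```

With zero-extended inputs, alu64(OP_XOR, ...) never sets the upper half,
so the truncation can be skipped, whereas alu64(OP_ADD, 0xffffffff, 1)
carries into bit 32 and still needs it.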
Thanks,
Daniel