Message-ID: <24d2c526-3f9c-3101-ef60-d92777ebdcb7@fb.com>
Date: Tue, 26 Feb 2019 04:27:22 +0000
From: Alexei Starovoitov <ast@...com>
To: Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: Roman Gushchin <guro@...com>, Alexei Starovoitov <ast@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/4] bpf: enable program stats
On 2/25/19 2:36 AM, Daniel Borkmann wrote:
>
> Not through the stack, but I was more thinking of something like a
> low-overhead kprobes-style extension for a BPF prog where such a
> sequence would be added 'inline' at the beginning / exit of the BPF
> prog invocation with normal ctx access and helpers as in the native
> program (e.g. time-stamping plus read/write into mark as one example
> which kprobes couldn't do). But this might go much beyond the
> context of this stats collection.
see below.
> That's actually an interesting thought: given the original prog->bpf_func
> is a known address at that time, this could be templated, where an inner
> dummy bpf_func call to the normal BPF prog gets later replaced with the
> actual prog->bpf_func address to have no indirect call. This would also
> allow keeping the actual prog->bpf_func entry call-sites small and simple.
My first thought was that we could indeed have a template of the
stats-collecting wrapper in .text,
then on the 'switch stats on' event vmalloc an exec region, copy the
wrapper code into it and manually patch the 'call foo' x86 insn with
a direct jump.
That is still quite involved on the coding side.
It's similar to what kprobe/ftrace does, but probably
overkill for stats due to the potential security
concerns with new executable code.
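Roughly, such a template could look like this (sketch only;
__bpf_dummy_target() and __bpf_prog_update_stats() are made-up
placeholders, not existing kernel symbols):

/* Illustrative sketch: a stats wrapper kept in .text. The inner
 * 'call __bpf_dummy_target' would be rewritten into a direct call to
 * the real prog->bpf_func after copying the template into an
 * executable vmalloc region.
 */
static noinline u32 bpf_stats_wrapper_template(const void *ctx,
					       const struct bpf_insn *insn)
{
	u64 start = sched_clock();
	u32 ret;

	/* this call insn is the one that gets patched per prog */
	ret = __bpf_dummy_target(ctx, insn);

	/* second half of the stats collection */
	__bpf_prog_update_stats(sched_clock() - start);
	return ret;
}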
But it won't work due to tail_calls.
I've looked into a single-wrapper approach with an indirect call
(because it's much easier than patching x86) and realized it
won't work for the same reason.
Inline kprobe+kretprobe insertion won't work either.
All because we have tail_calls.
We cannot have an executable epilogue in progs:
a tail_call will jmp into the next prog and the second part of the
stats collection won't run. Essentially no epilogue is allowed unless
we disable tail_call,
or we somehow reimplement tail_calls so that the prog returns,
an epilogue is executed and only then it jumps into the next prog.
tail_call turned out to be the worst corner-case feature to deal with.
With the static_key approach the main prog accounts the time
of all progs called via tail_call.
We can argue whether that's ideal semantics, but it looks like
it's the only thing we can do.
I don't see a way for tail_called progs to count their own time.
Stats would somehow need to be computed right before jumping
into the next prog, which means heavy changes in all JITs
plus extra performance cost at runtime, since such an if (stats_on)
check in the tail_call handling would have to be done at runtime,
because JITed code is read-only and regenerating progs just for
stats is a non-starter.
Or the tail-called next prog would need to update the stats of the
previous prog, but there is no pointer to the previous 'aux->stats'
there.
So imo that is a dead end.
The static_key approach is the simplest by far.
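Roughly, the static_key-gated run path could look like this (sketch
only, not necessarily the exact code of this patch; the names
bpf_stats_enabled_key and bpf_prog_stats, and aux->stats being a
per-cpu pointer, are assumptions here):

/* Sketch: gate the stats collection on a static key so the
 * stats-off path costs nothing beyond the patched-out branch.
 * The time of tail_called progs is accounted to the entry prog.
 */
DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key);

struct bpf_prog_stats {
	u64 cnt;
	u64 nsecs;
	struct u64_stats_sync syncp;
};

static __always_inline u32 bpf_prog_run_stats(const struct bpf_prog *prog,
					      const void *ctx)
{
	u32 ret;

	if (static_branch_unlikely(&bpf_stats_enabled_key)) {
		struct bpf_prog_stats *stats;
		u64 start = sched_clock();

		ret = prog->bpf_func(ctx, prog->insnsi);
		stats = this_cpu_ptr(prog->aux->stats);
		u64_stats_update_begin(&stats->syncp);
		stats->cnt++;
		stats->nsecs += sched_clock() - start;
		u64_stats_update_end(&stats->syncp);
	} else {
		ret = prog->bpf_func(ctx, prog->insnsi);
	}
	return ret;
}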