linux-kernel - RE: [PATCH v3 bpf-next 1/3] bpf: Add bpf_perf_prog_read

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5e2a3f00a996a_7f9e2ab8c3f9e5c4a6@john-XPS-13-9370.notmuch>
Date:   Thu, 23 Jan 2020 16:49:04 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Daniel Xu <dxu@...uu.xyz>, bpf@...r.kernel.org, ast@...nel.org,
        daniel@...earbox.net, songliubraving@...com, yhs@...com,
        andriin@...com
Cc:     Daniel Xu <dxu@...uu.xyz>, linux-kernel@...r.kernel.org,
        kernel-team@...com, peterz@...radead.org, mingo@...hat.com,
        acme@...nel.org
Subject: RE: [PATCH v3 bpf-next 1/3] bpf: Add bpf_perf_prog_read_branches()
 helper

Daniel Xu wrote:
> Branch records are a CPU feature that can be configured to record
> certain branches that are taken during code execution. This data is
> particularly interesting for profile guided optimizations. perf has had
> branch record support for a while but the data collection can be a bit
> coarse grained.
> 
> We (Facebook) have seen in experiments that associating metadata with
> branch records can improve results (after postprocessing). We generally
> use bpf_probe_read_*() to get metadata out of userspace. That's why bpf
> support for branch records is useful.
> 
> Aside from this particular use case, having branch data available to bpf
> progs can be useful to get stack traces out of userspace applications
> that omit frame pointers.
> 
> Signed-off-by: Daniel Xu <dxu@...uu.xyz>
> ---
>  include/uapi/linux/bpf.h | 15 ++++++++++++++-
>  kernel/trace/bpf_trace.c | 31 +++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 

[...]

>   * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 19e793aa441a..24c51272a1f7 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1028,6 +1028,35 @@ static const struct bpf_func_proto bpf_perf_prog_read_value_proto = {
>           .arg3_type      = ARG_CONST_SIZE,
>  };
>  
> +BPF_CALL_3(bpf_perf_prog_read_branches, struct bpf_perf_event_data_kern *, ctx,
> +	   void *, buf, u32, size)
> +{
> +	struct perf_branch_stack *br_stack = ctx->data->br_stack;
> +	u32 to_copy = 0, to_clear = size;
> +	int err = -EINVAL;
> +
> +	if (unlikely(!br_stack))
> +		goto clear;
> +
> +	to_copy = min_t(u32, br_stack->nr * sizeof(struct perf_branch_entry), size);
> +	to_clear -= to_copy;
> +
> +	memcpy(buf, br_stack->entries, to_copy);
> +	err = to_copy;
> +clear:

There appears to be agreement to clear the extra buffer on error but what about
in the non-error case? I expect one usage pattern is to submit a fairly large
buffer, large enough to handle worse case nr, in this case we end up zero'ing
memory even in the succesful case. Can we skip the clear in this case? Maybe
its not too important either way but seems unnecessary.

> +	memset(buf + to_copy, 0, to_clear);
> +	return err;
> +}