Message-ID: <20200721191009.5khr7blivtuv3qfj@ast-mbp.dhcp.thefacebook.com>
Date:   Tue, 21 Jul 2020 12:10:09 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Song Liu <songliubraving@...com>
Cc:     linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
        netdev@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
        kernel-team@...com, john.fastabend@...il.com, kpsingh@...omium.org,
        brouer@...hat.com, peterz@...radead.org
Subject: Re: [PATCH v3 bpf-next 1/2] bpf: separate bpf_get_[stack|stackid]
 for perf events BPF

On Thu, Jul 16, 2020 at 03:59:32PM -0700, Song Liu wrote:
> +
> +BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx,
> +	   struct bpf_map *, map, u64, flags)
> +{
> +	struct perf_event *event = ctx->event;
> +	struct perf_callchain_entry *trace;
> +	bool has_kernel, has_user;
> +	bool kernel, user;
> +
> +	/* perf_sample_data doesn't have callchain, use bpf_get_stackid */
> +	if (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))

What if the event was not created with PERF_SAMPLE_CALLCHAIN?
Calling the helper will still cause crashes, no?

> +		return bpf_get_stackid((unsigned long)(ctx->regs),
> +				       (unsigned long) map, flags, 0, 0);
> +
> +	if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
> +			       BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
> +		return -EINVAL;
> +
> +	user = flags & BPF_F_USER_STACK;
> +	kernel = !user;
> +
> +	has_kernel = !event->attr.exclude_callchain_kernel;
> +	has_user = !event->attr.exclude_callchain_user;
> +
> +	if ((kernel && !has_kernel) || (user && !has_user))
> +		return -EINVAL;

This will break existing users in a way that will be very hard for them to debug.
If they happen to set exclude_callchain_* flags during perf_event_open,
the helpers will fail at run-time.
One can argue that bpf_get_stack is already broken when precise_ip=1, but
this is still a change in behavior.
It also seems to be broken when precise_ip=1 was set but
PERF_SAMPLE_CALLCHAIN was not set at event creation time.

> +
> +	trace = ctx->data->callchain;
> +	if (unlikely(!trace))
> +		return -EFAULT;
> +
> +	if (has_kernel && has_user) {

shouldn't it be || ?

> +		__u64 nr_kernel = count_kernel_ip(trace);
> +		int ret;
> +
> +		if (kernel) {
> +			__u64 nr = trace->nr;
> +
> +			trace->nr = nr_kernel;
> +			ret = __bpf_get_stackid(map, trace, flags);
> +
> +			/* restore nr */
> +			trace->nr = nr;
> +		} else { /* user */
> +			u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
> +
> +			skip += nr_kernel;
> +			if (skip > BPF_F_SKIP_FIELD_MASK)
> +				return -EFAULT;
> +
> +			flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
> +			ret = __bpf_get_stackid(map, trace, flags);
> +		}
> +		return ret;
> +	}
> +	return __bpf_get_stackid(map, trace, flags);
...
> +	if (has_kernel && has_user) {
> +		__u64 nr_kernel = count_kernel_ip(trace);
> +		int ret;
> +
> +		if (kernel) {
> +			__u64 nr = trace->nr;
> +
> +			trace->nr = nr_kernel;
> +			ret = __bpf_get_stack(ctx->regs, NULL, trace, buf,
> +					      size, flags);
> +
> +			/* restore nr */
> +			trace->nr = nr;
> +		} else { /* user */
> +			u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
> +
> +			skip += nr_kernel;
> +			if (skip > BPF_F_SKIP_FIELD_MASK)
> +				goto clear;
> +
> +			flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
> +			ret = __bpf_get_stack(ctx->regs, NULL, trace, buf,
> +					      size, flags);
> +		}

Looks like copy-paste. I think there should be a way to make it
into a common helper.
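
A rough user-space sketch of what such a common helper could look like.
Everything named model_* is hypothetical and only models the two quoted
branches. The callback stands in for __bpf_get_stackid/__bpf_get_stack, and
nr_kernel is assumed to come from count_kernel_ip(). This is not kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as BPF_F_SKIP_FIELD_MASK in the BPF UAPI header. */
#define MODEL_SKIP_MASK 0xffULL

/* Hypothetical stand-in for struct perf_callchain_entry (only nr matters here). */
struct model_trace {
	uint64_t nr;
};

/* Hypothetical callback type standing in for __bpf_get_stackid/__bpf_get_stack. */
typedef long (*model_get_fn)(struct model_trace *trace, uint64_t flags, void *arg);

/*
 * Shared logic of the two quoted blocks:
 *  - kernel stack: temporarily trim trace->nr to the kernel part, then restore it
 *  - user stack:   skip the kernel entries by adding nr_kernel to the skip field
 */
static long model_get_stack_common(struct model_trace *trace, uint64_t nr_kernel,
				   bool want_kernel, uint64_t flags,
				   model_get_fn fn, void *arg)
{
	uint64_t skip;

	if (want_kernel) {
		uint64_t nr = trace->nr;
		long ret;

		trace->nr = nr_kernel;
		ret = fn(trace, flags, arg);
		trace->nr = nr;	/* restore nr */
		return ret;
	}

	skip = (flags & MODEL_SKIP_MASK) + nr_kernel;
	if (skip > MODEL_SKIP_MASK)
		return -EFAULT;	/* caller decides how to clean up */

	return fn(trace, (flags & ~MODEL_SKIP_MASK) | skip, arg);
}

/* Example callback that just records what it was called with. */
struct model_seen {
	uint64_t nr;
	uint64_t flags;
};

static long model_record(struct model_trace *trace, uint64_t flags, void *arg)
{
	struct model_seen *seen = arg;

	seen->nr = trace->nr;
	seen->flags = flags;
	return 0;
}
```

In this sketch the bpf_get_stack variant would still need its own
"goto clear" handling on error, since the helper only returns the error code.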

I think the main issue is the wrong interaction with event attr flags.
I think the verifier should detect that bpf_get_stack/bpf_get_stackid
were used and prevent attaching to a perf_event with attr.precise_ip=1
when PERF_SAMPLE_CALLCHAIN is not specified.
I was thinking whether attaching bpf to an event could force setting of
PERF_SAMPLE_CALLCHAIN, but that would be surprising behavior,
so not a good idea.
So the only thing left is to reject attach when bpf_get_stack is used
in two cases:
if attr.precise_ip=1 and PERF_SAMPLE_CALLCHAIN is not set.
  (since it will lead to crashes)
if attr.precise_ip=1 and PERF_SAMPLE_CALLCHAIN is set,
but exclude_callchain_[user|kernel]=1 is set too.
  (since it will lead to surprising behavior of bpf_get_stack)
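
The two reject cases above could be modeled roughly like this. It is a
user-space sketch only: struct model_attr and model_check_attach() are
hypothetical names, not existing kernel interfaces, and treating any non-zero
precise_ip as the problematic case is an assumption (the text above only
mentions precise_ip=1).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as PERF_SAMPLE_CALLCHAIN in the perf_event UAPI header. */
#define MODEL_SAMPLE_CALLCHAIN (1ULL << 5)

/* Hypothetical, trimmed-down stand-in for struct perf_event_attr. */
struct model_attr {
	uint64_t sample_type;
	unsigned int precise_ip;
	bool exclude_callchain_kernel;
	bool exclude_callchain_user;
};

/*
 * Hypothetical attach-time check. Returns 0 if a program may attach to an
 * event with these attributes, -EINVAL if it uses bpf_get_stack/
 * bpf_get_stackid and the attributes fall into one of the two reject cases.
 */
static int model_check_attach(const struct model_attr *attr,
			      bool prog_uses_get_stack)
{
	/* Only programs using the stack helpers on precise events are affected. */
	if (!prog_uses_get_stack || !attr->precise_ip)
		return 0;

	/* Case 1: no callchain was requested, the helper would crash. */
	if (!(attr->sample_type & MODEL_SAMPLE_CALLCHAIN))
		return -EINVAL;

	/* Case 2: callchain requested but partially excluded, surprising results. */
	if (attr->exclude_callchain_kernel || attr->exclude_callchain_user)
		return -EINVAL;

	return 0;
}
```

Rejecting at attach time keeps the failure visible to the user right away
instead of at run-time inside the helper.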

Other ideas?
