lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180702154655.GR2494@hirez.programming.kicks-ass.net>
Date:   Mon, 2 Jul 2018 17:46:55 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Jin Yao <yao.jin@...ux.intel.com>, boris.ostrovsky@...cle.com
Subject: Re: [RFC PATCH] perf/core: don't sample kernel regs upon skid

On Mon, Jul 02, 2018 at 04:12:50PM +0100, Mark Rutland wrote:
> Users can request that general purpose registers, instruction pointer,
> etc, are sampled when a perf event counter overflows. To try to avoid
> this resulting in kernel state being leaked, unprivileged users are
> usually forbidden from opening events which count while the kernel is
> running.
> 
> Unfortunately, this is not sufficient to avoid leading kernel state.

'leaking' surely.

> 
> For various reasons, there can be a delay between the overflow occurring
> and the resulting overflow exception (e.g. an NMI) being taken. During
> this window, other instructions may be executed, resulting in skid.
> 
> This skid means that a userspace-only event overflowing may result in an
> exception being taken *after* entry to the kernel, allowing kernel
> registers to be sampled. Depending on the amount of skid, this may only
> leak the PC (breaking KASLR), or it may leak secrets which are currently
> live in GPRs.
> 
> Let's avoid this by only sampling from the user registers when an event
> is supposed to exclude the kernel, providing the illusion that the
> overflow exception is taken from userspace.
> 
> We also have similar cases when sampling a guest, where we get the host
> regs in some cases. It's not entirely clear to me how we should handle
> these.

Would not a similar:

	if ((event->attr.exclude_hv || event->attr.exclude_host) /* WTF both !? */ &&
	    perf_guest_cbs && !perf_guest_cbs->is_in_guest())
		return perf_guest_cbs->guest_pt_regs();

work there? Of course, perf_guest_info_callbacks is currently lacking
that guest_pt_regs() thingy..

> 
> Signed-off-by: Mark Rutland <mark.rutland@....com>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Jin Yao <yao.jin@...ux.intel.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> ---
>  kernel/events/core.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8f0434a9951a..2ab2548b2e66 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6361,6 +6361,32 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
>  	return callchain ?: &__empty_callchain;
>  }
>  
> +static struct pt_regs *perf_get_sample_regs(struct perf_event *event,
> +					    struct pt_regs *regs)
> +{
> +	/*
> +	 * Due to interrupt latency (AKA "skid"), we may enter the kernel
> +	 * before taking an overflow, even if the PMU is only counting user
> +	 * events.
> +	 *
> +	 * If we're not counting kernel events, always use the user regs when
> +	 * sampling.
> +	 *
> +	 * TODO: what do we do about sampling a guest's registers? The IP is
> +	 * special-cased, but for the rest of the regs they'll get the
> +	 * user/kernel regs depending on whether exclude_kernel is set, which
> +	 * is nonsensical.
> +	 *
> +	 * We can't get at the full set of regs in all cases (e.g. Xen's PV PMU
> +	 * can't provide the GPRs), so should we just zero the GPRs when in a
> +	 * guest? Or skip outputting the regs in perf_output_sample?

Seems daft Xen cannot provide registers; why is that? Boris?

> +	 */
> +	if (event->attr.exclude_kernel && !user_mode(regs))
> +		return task_pt_regs(current);
> +
> +	return regs;
> +}
> +
>  void perf_prepare_sample(struct perf_event_header *header,
>  			 struct perf_sample_data *data,
>  			 struct perf_event *event,
> @@ -6368,6 +6394,8 @@ void perf_prepare_sample(struct perf_event_header *header,
>  {
>  	u64 sample_type = event->attr.sample_type;
>  
> +	regs = perf_get_sample_regs(event, regs);
> +
>  	header->type = PERF_RECORD_SAMPLE;
>  	header->size = sizeof(*header) + event->header_size;

In any case ACK for this thing.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ