Message-ID: <20150613235250.GA25252@redhat.com>
Date:	Sun, 14 Jun 2015 01:52:50 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tycho Andersen <tycho.andersen@...onical.com>
Cc:	linux-kernel@...r.kernel.org, linux-api@...r.kernel.org,
	Kees Cook <keescook@...omium.org>,
	Andy Lutomirski <luto@...capital.net>,
	Will Drewry <wad@...omium.org>,
	Roland McGrath <roland@...k.frob.com>,
	Pavel Emelyanov <xemul@...allels.com>,
	"Serge E. Hallyn" <serge.hallyn@...ntu.com>
Subject: Re: [PATCH v5] seccomp: add ptrace options for suspend/resume

On 06/13, Tycho Andersen wrote:
>
> This patch is the first step in enabling checkpoint/restore of processes
> with seccomp enabled.

So just in case, I am fine with this version.

> One of the things CRIU does while dumping tasks is inject code into them
> via ptrace to collect information that is only available to the process
> itself. However, if we are in a seccomp mode where these processes are
> prohibited from making these syscalls, then what CRIU does kills the task.
> 
> This patch adds a new ptrace option, PTRACE_O_SUSPEND_SECCOMP, that enables
> a task from the init user namespace which has CAP_SYS_ADMIN and no seccomp
> filters to disable (and re-enable) seccomp filters for another task so that
> they can be successfully dumped (and restored). We restrict the set of
> processes that can disable seccomp through ptrace because although today
> ptrace can be used to bypass seccomp, there is some discussion of closing
> this loophole in the future and we would like this patch to not depend on
> that behavior and be future-proofed for when it is removed.
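
For readers following along, the restriction described above can be modelled
in userspace roughly as below. This is only a sketch of the check order in the
ptrace_setoptions() hunk of this patch, not kernel code: struct tracer_model,
check_setoptions() and the OPT_* names are made up for illustration, and the
booleans stand in for config_enabled(), capable() and the tracer's own
seccomp/ptrace state.

```c
#include <errno.h>
#include <stdbool.h>

/* Option bits as given in the patch's uapi hunk (local names). */
#define OPT_EXITKILL        (1UL << 20)
#define OPT_SUSPEND_SECCOMP (1UL << 21)
#define OPT_MASK            (0x000000ffUL | OPT_EXITKILL | OPT_SUSPEND_SECCOMP)

/* Hypothetical stand-in for the tracer state the kernel consults. */
struct tracer_model {
	bool config_ok;         /* CONFIG_CHECKPOINT_RESTORE and CONFIG_SECCOMP set */
	bool cap_sys_admin;     /* capable(CAP_SYS_ADMIN) would succeed */
	bool seccomp_enabled;   /* tracer's seccomp.mode != SECCOMP_MODE_DISABLED */
	bool seccomp_suspended; /* tracer itself has PT_SUSPEND_SECCOMP set */
};

/* Returns 0 or a negative errno, in the same order as the hunk. */
int check_setoptions(const struct tracer_model *t, unsigned long data)
{
	if (data & ~OPT_MASK)
		return -EINVAL;

	if (data & OPT_SUSPEND_SECCOMP) {
		if (!t->config_ok)
			return -EINVAL;
		if (!t->cap_sys_admin)
			return -EPERM;
		if (t->seccomp_enabled || t->seccomp_suspended)
			return -EPERM;
	}
	return 0;
}
```

Note that the extra checks apply only when the new bit is requested, so an
unprivileged tracer asking for other options is not affected.
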
> 
> Note that seccomp can be suspended before any filters are actually
> installed; this behavior is useful on criu restore, so that we can suspend
> seccomp, restore the filters, unmap our restore code from the restored
> process' address space, and then resume the task by detaching and have the
> filters resumed as well.
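
A minimal sketch of what the tracer side of this might look like, assuming a
kernel with this patch applied and a tracer that passes the checks in
ptrace_setoptions(). The helper names and OPT_SUSPEND_SECCOMP are hypothetical;
the bit value is the one from the uapi hunk. This is not how CRIU does it,
just the plain attach / set option / detach sequence.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Value from the patch's uapi hunk; named locally to avoid libc clashes. */
#define OPT_SUSPEND_SECCOMP (1UL << 21)

/*
 * Attach to pid, wait for it to stop, then ask for seccomp to be suspended.
 * Returns 0 on success or a negative errno. On failure to set the option
 * the tracee is detached again.
 */
int attach_suspend_seccomp(pid_t pid)
{
	int status;

	if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
		return -errno;

	if (waitpid(pid, &status, 0) == -1) {
		int err = errno;
		ptrace(PTRACE_DETACH, pid, NULL, NULL);
		return -err;
	}

	if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
		   (void *)OPT_SUSPEND_SECCOMP) == -1) {
		int err = errno;
		ptrace(PTRACE_DETACH, pid, NULL, NULL);
		return -err;
	}
	return 0;
}

/* Detaching drops the option, so the tracee's filters apply again. */
int detach_resume_seccomp(pid_t pid)
{
	if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
		return -errno;
	return 0;
}
```

On a kernel without the patch, PTRACE_SETOPTIONS should fail with EINVAL here
because the bit falls outside PTRACE_O_MASK.
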
> 
> v2 changes:
> 
> * require that the tracer have no seccomp filters installed
> * drop TIF_NOTSC manipulation from the patch
> * change from ptrace command to a ptrace option and use this ptrace option
>   as the flag to check. This means that as soon as the tracer
>   detaches/dies, seccomp is re-enabled and, as a corollary, one cannot
>   disable seccomp across PTRACE_ATTACHs.
> 
> v3 changes:
> 
> * get rid of various #ifdefs everywhere
> * report more sensible errors when PTRACE_O_SUSPEND_SECCOMP is incorrectly
>   used
> 
> v4 changes:
> 
> * get rid of may_suspend_seccomp() in favor of a capable() check in ptrace
>   directly
> 
> v5 changes:
> 
> * check that seccomp is not enabled (or suspended) on the tracer
> 
> Signed-off-by: Tycho Andersen <tycho.andersen@...onical.com>
> CC: Kees Cook <keescook@...omium.org>
> CC: Andy Lutomirski <luto@...capital.net>
> CC: Will Drewry <wad@...omium.org>
> CC: Roland McGrath <roland@...k.frob.com>
> CC: Oleg Nesterov <oleg@...hat.com>
> CC: Pavel Emelyanov <xemul@...allels.com>
> CC: Serge E. Hallyn <serge.hallyn@...ntu.com>
> ---
>  include/linux/ptrace.h      |  1 +
>  include/uapi/linux/ptrace.h |  6 ++++--
>  kernel/ptrace.c             | 13 +++++++++++++
>  kernel/seccomp.c            |  8 ++++++++
>  4 files changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 987a73a..061265f 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -34,6 +34,7 @@
>  #define PT_TRACE_SECCOMP	PT_EVENT_FLAG(PTRACE_EVENT_SECCOMP)
>  
>  #define PT_EXITKILL		(PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
> +#define PT_SUSPEND_SECCOMP	(PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
>  
>  /* single stepping state bits (used on ARM and PA-RISC) */
>  #define PT_SINGLESTEP_BIT	31
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index cf1019e..a7a6979 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -89,9 +89,11 @@ struct ptrace_peeksiginfo_args {
>  #define PTRACE_O_TRACESECCOMP	(1 << PTRACE_EVENT_SECCOMP)
>  
>  /* eventless options */
> -#define PTRACE_O_EXITKILL	(1 << 20)
> +#define PTRACE_O_EXITKILL		(1 << 20)
> +#define PTRACE_O_SUSPEND_SECCOMP	(1 << 21)
>  
> -#define PTRACE_O_MASK		(0x000000ff | PTRACE_O_EXITKILL)
> +#define PTRACE_O_MASK		(\
> +	0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
>  
>  #include <asm/ptrace.h>
>  
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index c8e0e05..496028b 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -556,6 +556,19 @@ static int ptrace_setoptions(struct task_struct *child, unsigned long data)
>  	if (data & ~(unsigned long)PTRACE_O_MASK)
>  		return -EINVAL;
>  
> +	if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
> +		if (!config_enabled(CONFIG_CHECKPOINT_RESTORE) ||
> +		    !config_enabled(CONFIG_SECCOMP))
> +			return -EINVAL;
> +
> +		if (!capable(CAP_SYS_ADMIN))
> +			return -EPERM;
> +
> +		if (current->seccomp.mode != SECCOMP_MODE_DISABLED ||
> +		    current->ptrace & PT_SUSPEND_SECCOMP)
> +			return -EPERM;
> +	}
> +
>  	/* Avoid intermediate state when all opts are cleared */
>  	flags = child->ptrace;
>  	flags &= ~(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT);
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 980fd26..645e42d 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -590,6 +590,10 @@ void secure_computing_strict(int this_syscall)
>  {
>  	int mode = current->seccomp.mode;
>  
> +	if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +	    unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +		return;
> +
>  	if (mode == 0)
>  		return;
>  	else if (mode == SECCOMP_MODE_STRICT)
> @@ -691,6 +695,10 @@ u32 seccomp_phase1(struct seccomp_data *sd)
>  	int this_syscall = sd ? sd->nr :
>  		syscall_get_nr(current, task_pt_regs(current));
>  
> +	if (config_enabled(CONFIG_CHECKPOINT_RESTORE) &&
> +	    unlikely(current->ptrace & PT_SUSPEND_SECCOMP))
> +		return SECCOMP_PHASE1_OK;
> +
>  	switch (mode) {
>  	case SECCOMP_MODE_STRICT:
>  		__secure_computing_strict(this_syscall);  /* may call do_exit */
> -- 
> 2.1.4
> 
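
To illustrate how the option bit ends up gating the seccomp hooks, here is a
small userspace model. It is a sketch under assumptions: PT_OPT_FLAG_SHIFT is
taken to be 3 (its definition is not part of this diff), the "flags |= data <<
shift" step is what I expect follows the lines shown in ptrace_setoptions(),
and set_options() / seccomp_runs() are invented names.

```c
#include <stdbool.h>

/* Assumption: value from include/linux/ptrace.h, not shown in this patch. */
#define PT_OPT_FLAG_SHIFT   3

/* Option bits as given in the patch's uapi hunk (local names). */
#define OPT_EXITKILL        (1UL << 20)
#define OPT_SUSPEND_SECCOMP (1UL << 21)
#define OPT_MASK            (0x000000ffUL | OPT_EXITKILL | OPT_SUSPEND_SECCOMP)

#define PT_SUSPEND_SECCOMP  (OPT_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)

/* Model of the flag update: clear all option bits, then set the new ones. */
unsigned long set_options(unsigned long ptrace_flags, unsigned long data)
{
	ptrace_flags &= ~(OPT_MASK << PT_OPT_FLAG_SHIFT);
	ptrace_flags |= data << PT_OPT_FLAG_SHIFT;
	return ptrace_flags;
}

/*
 * Model of the early return added to secure_computing_strict() and
 * seccomp_phase1(): a suspended task skips seccomp whatever its mode is.
 */
bool seccomp_runs(unsigned long ptrace_flags, int mode)
{
	if (ptrace_flags & PT_SUSPEND_SECCOMP)
		return false;
	return mode != 0;
}
```

Since the check reads the tracee's ptrace flags, clearing them on detach (or
setting options again without the bit) makes seccomp_runs() true again for a
task that has a mode set, which matches the v2 note about re-enabling.
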
