Date:	Thu, 13 Feb 2014 03:08:57 +0400
From:	Andrew Vagin <avagin@...allels.com>
To:	Kees Cook <keescook@...omium.org>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Andrey Vagin <avagin@...nvz.org>,
	LKML <linux-kernel@...r.kernel.org>, <criu@...nvz.org>,
	Oleg Nesterov <oleg@...hat.com>, Robin Holt <holt@....com>,
	Al Viro <viro@...iv.linux.org.uk>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	"Chen Gang" <gang.chen@...anux.com>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	"Pavel Emelyanov" <xemul@...allels.com>,
	Aditya Kali <adityakali@...gle.com>,
	"Michael Kerrisk" <mtk.manpages@...il.com>
Subject: Re: [PATCH] kernel: reduce required permission for prctl_set_mm

On Wed, Feb 12, 2014 at 01:50:35PM -0800, Kees Cook wrote:
> On Wed, Feb 12, 2014 at 1:32 PM, Andrew Morton
> <akpm@...ux-foundation.org> wrote:
> > On Wed, 12 Feb 2014 19:40:11 +0400 Andrey Vagin <avagin@...nvz.org> wrote:
> >
> >> Currently prctl_set_mm requires the global CAP_SYS_RESOURCE,
> >> this patch reduce requiremence to CAP_SYS_RESOURCE in the current
> >> namespace.
> >>
> >> When we restore a task we need to set up text, data and data heap sizes
> >> from userspace to the values a task had at checkpoint time.
> >>
> >> Currently we can not restore these parameters, if a task lives in
> >> a non-root user name space, because it has no capabilities in the
> >> parent namespace.
> >>
> >> prctl_set_mm() changes parameters of the current task and doesn't affect
> >> other tasks.
> >>
> >> This patch affects the RLIMIT_DATA limit, because a consumtiuon is
> >> calculated relatively to mm->end_data, mm->start_data, mm->start_brk.
> >
> > I can't for the life of me work out what you were trying to say here.
> > Please fix and resend this paragraph?
> >
> >> rlim = rlimit(RLIMIT_DATA);
> >> if (rlim < RLIM_INFINITY && (brk - mm->start_brk) +
> >>               (mm->end_data - mm->start_data) > rlim)
> >>       goto out;
> >>
> >> This limit affects calls to brk() and sbrk(), but it doesn't affect
> >> mmap. So I think requirement of CAP_SYS_RESOURCE in the current
> >> namespace is enough for this limit.
> >>
> >> ...
> >>
> >> Cc: security@...nel.org
> >
> > That list is for reporting kernel security bugs.
> >
> >>
> >> --- a/kernel/sys.c
> >> +++ b/kernel/sys.c
> >> @@ -1701,7 +1701,7 @@ static int prctl_set_mm(int opt, unsigned long addr,
> >>       if (arg5 || (arg4 && opt != PR_SET_MM_AUXV))
> >>               return -EINVAL;
> >>
> >> -     if (!capable(CAP_SYS_RESOURCE))
> >> +     if (!ns_capable(current_user_ns(), CAP_SYS_RESOURCE))
> >>               return -EPERM;
> >>
> >>       if (opt == PR_SET_MM_EXE_FILE)
> >
> > This looks harmless.
> 
> I want to be convinced of this, but weakening this cap check seems
> like an easy way for a process to hide itself trivially from the real
> root user. It can change it's exe file link, and dodge RLIMIT_DATA by
> changing the brk addresses. The whole reason this cap check was there
> was to stop that kind of thing. Limiting it to a namespace isn't great
> since USER_NS means unprivileged processes can enter a new NS as the
> NS root user.

Everything you describe here is exactly what we do when restoring tasks,
and we need a way to restore these parameters. One of our goals is to be
able to dump and restore Linux Containers. All processes of a container
live in a separate set of namespaces.
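
To illustrate what the restore side does, here is a minimal sketch of
setting one of these parameters back through prctl(2). restore_brk() and
saved_brk are hypothetical names made up for this example, not taken from
CRIU; only PR_SET_MM and PR_SET_MM_BRK come from the kernel headers. A
real restore sets several fields (start_brk, start_data, end_data, ...)
and has to respect the ordering checks in the kernel.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical restore helper: ask the kernel to set mm->brk of the
 * calling task to the value saved at checkpoint time.
 * Returns 0 on success or a negative errno value (for example -EPERM
 * when the caller fails the CAP_SYS_RESOURCE check).
 */
static int restore_brk(unsigned long saved_brk)
{
	if (prctl(PR_SET_MM, PR_SET_MM_BRK, saved_brk, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}
```

Inside a non-root user namespace this call is expected to fail with
-EPERM on a kernel without the patch, which is the problem described
above.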

I considered restoring these parameters before entering the userns,
but that idea failed, because a process can't enter a pidns, while the
pidns must be created in the userns...


>> It can change it's exe file link
We can already change memory contents with the help of ptrace. So if we
want to hide a process, we can execute another program and inject our
code into it.

That can be equivalent to changing the exe file link. Yes, it's a bit
harder, but we can do it even without this patch.

>> dodge RLIMIT_DATA

This limit affects calls to brk(2) and sbrk(2), but a task can use mmap()
to allocate memory instead. How is this limit used?
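
To make the question concrete, here is a rough user-space model of the
check quoted from the patch description. data_rlimit_exceeded() is a
hypothetical name used only for this illustration; the arithmetic mirrors
the quoted snippet and nothing else from the kernel.

```c
#include <stdbool.h>
#include <sys/resource.h>

/*
 * User-space model of the quoted RLIMIT_DATA test.
 * Returns true when the brk heap plus the data segment would exceed
 * rlim. Only start_brk, start_data, end_data and the requested brk
 * enter the sum; memory obtained through mmap() is not part of it.
 */
static bool data_rlimit_exceeded(rlim_t rlim,
				 unsigned long new_brk,
				 unsigned long start_brk,
				 unsigned long start_data,
				 unsigned long end_data)
{
	if (rlim >= RLIM_INFINITY)
		return false;
	return (new_brk - start_brk) + (end_data - start_data) > rlim;
}
```

With a 4096-byte limit, a brk of 0x3000 over a start_brk of 0x1000 is
rejected, while the same brk with start_brk moved up to 0x3000 passes.
That is the effect of changing the brk addresses that Kees describes.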

Sorry if I am missing something.

> 
> -Kees
> 
> -- 
> Kees Cook
> Chrome OS Security
