linux-kernel - Re: [PATCH 11/11] seccomp: Add tgid and tid into seccomp

Message-ID: <CALCETrXJY9CXoXckOpVx9fNXcT2UYPkkQdBTk4LYbhf1jq=eqA@mail.gmail.com>
Date:	Wed, 30 Jul 2014 07:52:33 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Paolo Bonzini <pbonzini@...hat.com>,
	Greg KH <gregkh@...uxfoundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	James Morris <james.l.morris@...cle.com>,
	Paul Moore <paul@...l-moore.com>,
	LSM List <linux-security-module@...r.kernel.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	David Drysdale <drysdale@...gle.com>,
	Linux API <linux-api@...r.kernel.org>,
	Kees Cook <keescook@...omium.org>,
	Meredydd Luff <meredydd@...atehouse.org>,
	Julien Tinnes <jln@...gle.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH 11/11] seccomp: Add tgid and tid into seccomp_data

On Jul 29, 2014 10:57 PM, "Eric W. Biederman" <ebiederm@...ssion.com> wrote:
>
> Andy Lutomirski <luto@...capital.net> writes:
>
> > On Tue, Jul 29, 2014 at 9:08 PM, Eric W. Biederman
> > <ebiederm@...ssion.com> wrote:
> >> Andy Lutomirski <luto@...capital.net> writes:
> >>
> >>> On Mon, Jul 28, 2014 at 2:18 PM, Eric W. Biederman
> >>> <ebiederm@...ssion.com> wrote:
> >>>> Andy Lutomirski <luto@...capital.net> writes:
> >>>>
> >>>>> [cc: Eric Biederman]
> >>>>>
> >>>>
> >>>>> Can we do one better and add a flag to prevent any non-self pid
> >>>>> lookups?  This might actually be easy on top of the pid namespace work
> >>>>> (e.g. we could change the way that find_task_by_vpid works).
> >>>>>
> >>>>> It's far from just being signals.  There's access_process_vm, ptrace,
> >>>>> all the signal functions, clock_gettime (see CPUCLOCK_PID -- yes, this
> >>>>> is ridiculous), and probably some others that I've forgotten about or
> >>>>> never noticed in the first place.
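
To illustrate the clock_gettime case: a process can obtain a clock ID that refers to another process by PID and then read that process's CPU time. This is a minimal userspace sketch, assuming Linux with glibc. The function name read_child_cpu_clock is made up for illustration, and the child here exists only to provide a PID to look up.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Fork a child, then read the child's CPU-time clock from the parent.
 * Returns 0 if the lookup by PID and the clock read both succeeded,
 * -1 otherwise.  (Hypothetical helper, not taken from the patch series.)
 */
int read_child_cpu_clock(struct timespec *out)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: stay alive until the parent kills it. */
        for (;;)
            pause();
    }

    clockid_t cid;
    int rc = -1;

    /* clock_getcpuclockid() turns a PID into a clock ID ... */
    if (clock_getcpuclockid(child, &cid) == 0 &&
        /* ... and clock_gettime() then resolves that PID again. */
        clock_gettime(cid, out) == 0)
        rc = 0;

    /* Clean up the helper child. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return rc;
}
```

No ptrace or signal call is involved, which is why a signal-only filter would not cover this path.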
> >>>>
> >>>> So here is the practical question.
> >>>>
> >>>> Are these processes, which can only send signals to their own thread
> >>>> group, allowed to call fork()?
> >>>>
> >>>>
> >>>> If fork is allowed and all pid lookups are restricted to their own
> >>>> thread group, then wait, waitpid, and all of the rest of the wait family
> >>>> will never return the pids of their children, and zombies will
> >>>> accumulate.  Aka the semantics are fundamentally broken.
> >>>
> >>> Good point.
> >>>
> >>> I can imagine at least three ways that fork() could continue working, though:
> >>>
> >>> 1. Allow lookups of immediate children, too.  (I don't love this one.)
> >>> 2. Allow non-self pids to be translated in but not out.  This way
> >>> P_ALL will continue working.
> >>> 3. Have the kernel treat any PID-restricted process as though it were NOCLDWAIT.
> >>>
> >>> I think I like #3.  Thoughts?
> >>>
> >>>>
> >>>> If fork is not allowed pid namespaces already solve this problem.
> >>>
> >>> PID namespaces are fairly heavyweight.  Julien pointed out that using
> >>> PID namespaces requires a bunch of dummy PID 1 processes.
> >>
> >> Only if you can't tolerate init exiting.  The reasoning with respect to
> >> signals and signals being ignored was wrong.  And if you only have one
> >> process you care about and no children to worry about, neither the
> >> difference in signal handling nor the fact that the world dies when init
> >> exits applies.
> >
> > Can you elaborate?  It seems entirely plausible to me that there are
> > programs that won't work right as PID 1 without considerable
> > adaptation.
>
> The only funny things about pid 1 of a pid namespace are:
> - children can't send signals to pid 1 unless a signal handler has
>   been established.
> - All children die when the parent dies.
> - Grand children become zombies of the parent when the children die.
> - The pid is 1.
>
> That is, almost everything is the same, and it takes almost no adaptation
> (really) to run as the initial pid in a pid namespace.
>
> Not being able to receive signals (which is the argument I read against
> them) is bogus.  You just have to set your signal handler to something
> besides SIG_DFL.
>
> So I have my question:  What is the use case people are trying to solve
> by filtering signals and pid lookups?  If children are not part of the
> goal, a pid namespace will work just fine.
>
> >> Therefore, given what I have read described, pid namespaces are a trivial
> >> solution to this problem space.
> >
> > pid namespaces also won't work in the context of Capsicum unless you
> > want every single Capsicum process to be its own pid namespace.
>
> For a tightly bound process I don't see why each process could not be
> its own pid namespace.

Two main reasons: You can't put yourself in a pid namespace, so you
need to fork into your sandbox, and you can't prevent yourself from
seeing your children (although, as noted, my approach has issues here,
too, but I think this is more easily solved outside the context of
namespaces).
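
To make the first point concrete, here is a minimal sketch, assuming Linux with glibc: unshare(CLONE_NEWPID) does not move the caller into the new namespace, only children forked afterwards land there. The function name first_child_pid_in_new_pidns is made up for illustration. The unshare happens in a throwaway helper process so the caller's own state is untouched, and the call normally needs CAP_SYS_ADMIN, so the sketch returns -1 when it is refused.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the first process forked after unshare(CLONE_NEWPID)
 * sees itself as PID 1, 0 if it does not, and -1 if any step fails
 * (typically EPERM without CAP_SYS_ADMIN).
 * (Hypothetical helper for illustration only.)
 */
int first_child_pid_in_new_pidns(void)
{
    pid_t helper = fork();
    if (helper < 0)
        return -1;

    if (helper == 0) {
        /* Helper: request a new pid namespace for future children. */
        if (unshare(CLONE_NEWPID) != 0)
            _exit(2);

        /* The helper itself stays in its original namespace. */
        pid_t inner = fork();
        if (inner < 0)
            _exit(2);
        if (inner == 0)
            /* First child after unshare: init of the new namespace. */
            _exit(getpid() == 1 ? 0 : 1);

        int st;
        if (waitpid(inner, &st, 0) != inner || !WIFEXITED(st))
            _exit(2);
        _exit(WEXITSTATUS(st));
    }

    int status;
    if (waitpid(helper, &status, 0) != helper || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:
        return 1;
    case 1:
        return 0;
    default:
        return -1;
    }
}
```

Note that the helper can still see and wait on the sandboxed child, which is the second point above: the parent of a namespaced process is not isolated from it.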

>
> > Also,
> > pid namespaces don't offer any way to protect children from parents.
>
> And my presumption was that there were not any children because the
> semantics suggested so far do not properly support children.
>

I'd like to try to fix that.

Another approach: let waiting for zombies that are immediate children
be an exception.

--Andy

> Eric