Message-ID: <20110428151241.GD1798@nowhere>
Date: Thu, 28 Apr 2011 17:12:44 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Will Drewry <wad@...omium.org>
Cc: linux-kernel@...r.kernel.org, kees.cook@...onical.com,
eparis@...hat.com, agl@...omium.org, mingo@...e.hu,
jmorris@...ei.org, rostedt@...dmis.org,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, Michal Marek <mmarek@...e.cz>,
Oleg Nesterov <oleg@...hat.com>,
Roland McGrath <roland@...hat.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Jiri Slaby <jslaby@...e.cz>,
David Howells <dhowells@...hat.com>,
"Serge E. Hallyn" <serge@...lyn.com>
Subject: Re: [PATCH 3/7] seccomp_filter: Enable ftrace-based system call
filtering
On Wed, Apr 27, 2011 at 10:08:47PM -0500, Will Drewry wrote:
> This change adds a new seccomp mode based on the work by
> agl@...omium.org. This mode comes with a bitmask of NR_syscalls size and
> an optional linked list of seccomp_filter objects. When in mode 2, all
> system calls are first checked against the bitmask to determine if they
> are allowed or denied. If allowed, the list of filters is checked for
> the given syscall number. If all filter predicates for the system call
> match or the system call was allowed without restriction, the process
> continues. Otherwise, it is killed and a KERN_INFO notification is
> posted.
>
> The filter language itself is provided by the ftrace filter engine.
> Related patches tweak to the perf filter trace and free allow the calls
> to be shared. Filters inherit their understanding of types and arguments
> for each system call from the CONFIG_FTRACE_SYSCALLS subsystem which
> predefines this information in syscall_metadata associated enter_event
> (and exit_event) structures.
>
> The result is that a process may reduce its available interfaces to
> the kernel through prctl() without knowing the appropriate system call
> number a priori and with the flexibility of filtering based on
> register-stored arguments. (String checks suffer from TOCTOU issues and
> should be left to LSMs to provide policy for! Don't get greedy :)
>
> A sample filterset for a process that only needs to interact over stdin
> and stdout and exit cleanly is shown below:
> sys_read: fd == 0
> sys_write: fd == 1
> sys_exit_group: 1
>
> The filters may be specified once prior to entering the reduced access
> state:
> prctl(PR_SET_SECCOMP, 2, filters);

Instead of having such a multiline filter definition with syscall
names prepended, it would be nicer to make the parsing simpler.

You could have either:

	prctl(PR_SET_SECCOMP, mode);
	/* Works only if we are in mode 2 */
	prctl(PR_SET_SECCOMP_FILTER, syscall_nr, filter);

or:

	/*
	 * If mode == 2, set the filter for syscall_nr.
	 * Call this again for each syscall that needs a filter.
	 * If a filter was previously set on the targeted syscall,
	 * it will be overwritten.
	 */
	prctl(PR_SET_SECCOMP, mode, syscall_nr, filter);

One can erase a previous filter by setting the new filter to "1".

Also, instead of having a bitmap of syscalls to accept, you could
simply set "0" as the filter for those you want to deactivate:

	prctl(PR_SET_SECCOMP, 2, 1, 0); /* deactivate syscall_nr 1 */

Hm?
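
To make the suggested semantics concrete, here is a minimal userspace sketch of the per-syscall filter table described above. It is only a model under stated assumptions, not kernel code: every name in it (MODEL_NR_SYSCALLS, model_set_filter(), model_lookup(), the model_verdict values) is hypothetical, the syscall numbers are arbitrary, and the behaviour for a syscall that never had a filter set is left as a separate "unset" state because the proposal does not specify it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: none of these names exist in the kernel. */
#define MODEL_NR_SYSCALLS 512

enum model_verdict {
	MODEL_UNSET,	/* no filter was ever set for this syscall */
	MODEL_DENY,	/* filter is "0": syscall deactivated */
	MODEL_ALLOW,	/* filter is "1": allowed without restriction */
	MODEL_EVALUATE,	/* a real predicate the filter engine must run */
};

/* One filter string per syscall number; NULL means "unset". */
static const char *model_filters[MODEL_NR_SYSCALLS];

/*
 * Models prctl(PR_SET_SECCOMP, 2, syscall_nr, filter): a later call
 * for the same syscall_nr overwrites the earlier filter.
 */
static int model_set_filter(int syscall_nr, const char *filter)
{
	if (syscall_nr < 0 || syscall_nr >= MODEL_NR_SYSCALLS || !filter)
		return -1;
	model_filters[syscall_nr] = filter;
	return 0;
}

/* Classify a syscall according to the filter currently stored for it. */
static enum model_verdict model_lookup(int syscall_nr)
{
	const char *f;

	if (syscall_nr < 0 || syscall_nr >= MODEL_NR_SYSCALLS)
		return MODEL_DENY;
	f = model_filters[syscall_nr];
	if (!f)
		return MODEL_UNSET;
	if (strcmp(f, "0") == 0)
		return MODEL_DENY;
	if (strcmp(f, "1") == 0)
		return MODEL_ALLOW;
	return MODEL_EVALUATE;
}

/* Walks through the three cases: set, overwrite with "1", deny with "0". */
static void model_demo(void)
{
	model_set_filter(3, "fd == 0");
	assert(model_lookup(3) == MODEL_EVALUATE);

	model_set_filter(3, "1");	/* erases the previous predicate */
	assert(model_lookup(3) == MODEL_ALLOW);

	model_set_filter(1, "0");	/* deactivates syscall_nr 1 */
	assert(model_lookup(1) == MODEL_DENY);
}
```

With a table like this, "0" and "1" are just ordinary filter values, so no separate bitmap is needed to record which syscalls are allowed or denied.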