linux-kernel - Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

Message-ID: <alpine.LRH.2.00.1105261034200.29690@tundra.namei.org>
Date:	Thu, 26 May 2011 11:19:34 +1000 (EST)
From:	James Morris <jmorris@...ei.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	Kees Cook <kees.cook@...onical.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	Will Drewry <wad@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, Avi Kivity <avi@...hat.com>,
	gnatapov@...hat.com, Chris Wright <chrisw@...s-sol.org>
Subject: Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call
 filtering

On Wed, 25 May 2011, Linus Torvalds wrote:

> And per-system-call permissions are very dubious. What system calls
> don't you want to succeed? That ioctl? You just made it impossible to
> do a modern graphical application. Yet the kind of thing where we
> would _want_ to help users is in making it easier to sandbox something
> like the adobe flash player. But without accelerated direct rendering,
> that's not going to fly, is it?

Going back to the initial idea proposed by Will, where seccomp is simply 
extended to filter all syscalls, there is potential benefit in being able 
to limit the attack surface of the syscall API.

This is not security mediation in terms of interaction between things 
(e.g. "allow A to read B").  It's a _hardening_ feature which prevents a 
process from invoking the potentially hundreds of syscalls it has no 
need for.  It would allow us to usefully restrict some well-established 
attack modes, e.g. triggering bugs in kernel code via unneeded syscalls.
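
As a rough illustration of the hardening idea (not the interface from 
Will's patches, which is not reproduced here), a per-process filter can be 
modelled as a bitmap with one bit per syscall number.  Everything in this 
sketch is hypothetical: the names syscall_filter, filter_allow, 
filter_permits and filter_exposed, and the FILTER_MAX_SYSCALLS bound, are 
made up for the example and are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound on syscall numbers; not taken from any real ABI. */
#define FILTER_MAX_SYSCALLS 512

/* One bit per syscall number: a set bit means the call is allowed. */
struct syscall_filter {
    uint64_t allowed[FILTER_MAX_SYSCALLS / 64];
};

/* Start from "deny everything", the default a hardening filter wants. */
static void filter_init(struct syscall_filter *f)
{
    memset(f, 0, sizeof(*f));
}

/* Mark one syscall number as allowed; returns false if out of range. */
static bool filter_allow(struct syscall_filter *f, long nr)
{
    if (nr < 0 || nr >= FILTER_MAX_SYSCALLS)
        return false;
    f->allowed[nr / 64] |= (uint64_t)1 << (nr % 64);
    return true;
}

/* The check made on syscall entry: anything not listed is refused. */
static bool filter_permits(const struct syscall_filter *f, long nr)
{
    if (nr < 0 || nr >= FILTER_MAX_SYSCALLS)
        return false;
    return (f->allowed[nr / 64] >> (nr % 64)) & 1;
}

/* Number of syscalls still reachable, i.e. the remaining attack surface. */
static size_t filter_exposed(const struct syscall_filter *f)
{
    size_t count = 0;
    for (size_t i = 0; i < FILTER_MAX_SYSCALLS / 64; i++) {
        uint64_t word = f->allowed[i];
        while (word) {
            word &= word - 1; /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}
```

A process that allows only a handful of numbers with filter_allow() leaves 
every other entry point unreachable, whatever access control policy would 
otherwise have permitted.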

This is orthogonal to access control schemes (such as SELinux), which are 
about mediating security-relevant interactions between objects.

One area of possible use is KVM/Qemu, where processes now contain entire 
operating systems, and the attack surface between them is now much 
broader: a local unprivileged vulnerability is now effectively a 'remote' 
full system compromise.

There has been some discussion of this within the KVM project.  Using the 
existing seccomp facility is problematic in that it requires significant 
reworking of Qemu to a privsep model, which would then also incur a 
likely unacceptable context switching overhead.  The generalized seccomp 
filter as proposed by Will would provide a significant reduction in 
exposed syscalls and thus in guest->host attack surface.

I've cc'd some KVM folk for more input on how this may or may not meet 
their requirements -- Avi/Gleb, there's a background writeup here: 
http://lwn.net/Articles/442569/ .  We may need a proof of concept and/or 
commitment to use this feature for it to be accepted upstream.

- James
-- 
James Morris
<jmorris@...ei.org>