Message-ID: <8737yomrwm.fsf@x220.int.ebiederm.org>
Date:	Tue, 08 Sep 2015 18:07:37 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Andy Lutomirski <luto@...capital.net>
Cc:	David Drysdale <drysdale@...gle.com>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
	"Serge E. Hallyn" <serge@...lyn.com>
Subject: Re: RFC: fsyscall

Andy Lutomirski <luto@...capital.net> writes:

> On Tue, Sep 8, 2015 at 3:35 PM, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>
>> I was thinking a bit about the problem of allowing another process to
>> perform a subset of what your process can perform, and it occurred to me
>> there might be something conceptually simple we can do.
>>
>> Have a system call fsyscall that takes a file descriptor, the system call
>> number, and the parameters to that system call as arguments.  AKA
>> long fsyscall(int fd, long number, ...); AKA syscall with a file
>> descriptor argument.
>>
>> The fd would hold a struct cred, and a filter that limits what system
>> calls and which parameters may be passed.
>>
>> The implementation of fsyscall would be something like:
>>         old = override_creds(f->f_cred);
>>         /* Perform filtered syscall */
>>         revert_creds(old);
>>
>> Then we have another system call, call it fsyscall_create(...), that takes
>> a bpf filter and returns a file descriptor that can be used with
>> fsyscall.
>>
>> I'm not certain that bpf is the best way to create such a filter but it
>> seems plausible, and we already have the infrastructure in place, so if
>> nothing else there would be synergy in syscall filtering.
>>
>> My two concerns with bpf are (a) it seems a little complex for the
>> simplest use cases, and (b) I think there are cases like inspecting the
>> data passed into write, or send, or the structure passed into ioctl that
>> it doesn't handle well yet.
>>
>> Andy, does an fsyscall system call sound like something that would not
>> be too bad to implement?  (You have just been through all of the x86
>> system call paths recently).
>
> It's not possible yet due to nasty calling convention issues.
> (Entries in the x86 syscall table aren't actually functions callable
> using the C ABI right now.)  My pending monster patchset will make it
> possible to implement for 32-bit syscalls (native and compat).  I'm
> planning on addressing 64-bit, and I want to do almost the reverse of
> what you're proposing: have a way that one task can trap into a
> special mode in which another process can do syscalls on its behalf.

Hmm.  That seems comparatively dangerous to me.

> There are some syscalls for which this simply makes no sense.
> Setresuid, capset, and similar come to mind.  Clone and friends may
> screw up impressively if you try this.  fsyscall should not be allowed
> to call itself.  If you call write(2) like this and it has any
> meaningful effect, something's wrong.

If you peek into the data that is being written, it can be meaningful on
write(2).

Hmm.  But yes, for file descriptor based system calls this is much less
interesting.  Having some kind of wrapper that embeds one file
descriptor in another and does the filtering that way seems more
interesting for the file descriptor based methods.
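
A minimal user-space sketch of that wrapper idea, assuming the filter is
just a callback that inspects the buffer before write(2) is forwarded to
the embedded descriptor.  The names filtered_fd, filtered_write and
allow_log_lines are hypothetical and only illustrate the shape; nothing
like this exists as a kernel interface.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical policy callback: return nonzero to allow the write. */
typedef int (*write_filter_fn)(const void *buf, size_t len);

/* Hypothetical wrapper: one descriptor embedded behind a filter. */
struct filtered_fd {
    int fd;                 /* the wrapped descriptor */
    write_filter_fn allow;  /* NULL means "allow everything" */
};

/* Forward to write(2) only if the filter accepts the payload. */
ssize_t filtered_write(const struct filtered_fd *ffd,
                       const void *buf, size_t len)
{
    if (ffd->allow && !ffd->allow(buf, len)) {
        errno = EPERM;
        return -1;
    }
    return write(ffd->fd, buf, len);
}

/* Example policy (an assumption): only payloads starting with "LOG ". */
int allow_log_lines(const void *buf, size_t len)
{
    return len >= 4 && memcmp(buf, "LOG ", 4) == 0;
}
```

The holder of the wrapper can only push data the policy accepts, which
is the "peek into the data" case for write(2) mentioned above.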

>   keyctl(2) does really awful
> things wrt struct cred, and I don't really want to think about what
> happens if you try calling it like this.
>
> override_creds is IMO awful.  Serge and I had an old discussion on how
> to maybe fix it.
>
> Honestly, I think the way to go might be to get Capsicum, or at least
> Capsicum's fd model, merged and to add a mode in which the *at
> operations on a specially marked fd use the passed fd's f_cred instead
> of the caller's.  (Cc: David Drysdale -- that feature might be really
> nice.)

Perhaps I missed it, but I don't recall Capsicum being able to wrap
things like reboot(2).

Which really describes what I am trying to tackle: how do we create an
object that we can pass between processes that limits what we can do in
the case of the oddball syscalls that require special privileges?

At the same time I still want the caller to be able to pass in data to
the system calls being called, such as LINUX_REBOOT_CMD_POWER_OFF versus
LINUX_REBOOT_CMD_HALT, while being able to filter it and say you may not
pass LINUX_REBOOT_CMD_CAD_OFF.
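
A rough user-space model of that filtering, assuming the passable object
records one permitted syscall number plus an argument check.  The names
fsyscall_handle, fsyscall_model, reboot_cmd_ok and record_only are
hypothetical; the model never calls reboot(2), and it collapses the real
reboot arguments (magic, magic2, cmd, arg) into the single cmd value.
The constants are the real ones from <linux/reboot.h>.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/reboot.h>
#include <sys/syscall.h>

/* Hypothetical stand-in for what the fd would hold (minus the cred). */
struct fsyscall_handle {
    long allowed_nr;                              /* only this syscall */
    int (*arg_ok)(unsigned long arg);             /* argument filter */
    long (*perform)(long nr, unsigned long arg);  /* stand-in for the call */
};

/* Policy from the text: power-off and halt are fine, nothing else is. */
int reboot_cmd_ok(unsigned long cmd)
{
    return cmd == LINUX_REBOOT_CMD_POWER_OFF ||
           cmd == LINUX_REBOOT_CMD_HALT;
}

/* Remembers the last command that passed the filter; does not reboot. */
unsigned long last_cmd;

long record_only(long nr, unsigned long arg)
{
    (void)nr;
    last_cmd = arg;
    return 0;
}

/* Model of fsyscall(fd, number, ...): filter first, then perform. */
long fsyscall_model(const struct fsyscall_handle *h,
                    long nr, unsigned long arg)
{
    if (nr != h->allowed_nr)
        return -EPERM;
    if (h->arg_ok && !h->arg_ok(arg))
        return -EPERM;
    return h->perform(nr, arg);
}
```

With a handle of { SYS_reboot, reboot_cmd_ok, record_only }, a request
carrying LINUX_REBOOT_CMD_POWER_OFF goes through, while
LINUX_REBOOT_CMD_CAD_OFF or any other syscall number gets -EPERM.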

Eric