linux-hardening - Re: [QUESTION] Full user space process isolation?


Message-ID: <17702e7f-479a-22b8-70d9-56e418c8120b@huawei.com>
Date:   Tue, 4 Jul 2023 17:18:43 +0200
From:   Petr Tesarik <petr.tesarik.ext@...wei.com>
To:     Roberto Sassu <roberto.sassu@...weicloud.com>,
        Jann Horn <jannh@...gle.com>
CC:     Oleg Nesterov <oleg@...hat.com>, Paul Moore <paul@...l-moore.com>,
        James Morris <jmorris@...ei.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        Stephen Smalley <stephen.smalley.work@...il.com>,
        Eric Paris <eparis@...isplace.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mimi Zohar <zohar@...ux.ibm.com>,
        Kees Cook <keescook@...omium.org>,
        Casey Schaufler <casey@...aufler-ca.com>,
        David Howells <dhowells@...hat.com>,
        LuisChamberlain <mcgrof@...nel.org>,
        Eric Biederman <ebiederm@...ssion.com>,
        Christoph Hellwig <hch@...radead.org>,
        Petr Mladek <pmladek@...e.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tejun Heo <tj@...nel.org>, <linux-mm@...ck.org>,
        <linux-security-module@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <keyrings@...r.kernel.org>,
        <linux-integrity@...r.kernel.org>,
        <linux-hardening@...r.kernel.org>
Subject: Re: [QUESTION] Full user space process isolation?

On 7/3/2023 5:28 PM, Roberto Sassu wrote:
> On Mon, 2023-07-03 at 17:06 +0200, Jann Horn wrote:
>> On Thu, Jun 22, 2023 at 4:45 PM Roberto Sassu
>> <roberto.sassu@...weicloud.com> wrote:
>>> I wanted to execute some kernel workloads in a fully isolated user
>>> space process, started from a binary statically linked with klibc,
>>> connected to the kernel only through a pipe.
>>
>> FWIW, the kernel has some infrastructure for this already, see
>> CONFIG_USERMODE_DRIVER and kernel/usermode_driver.c, with a usage
>> example in net/bpfilter/.
> 
> Thanks, I actually took that code to make a generic UMD management
> library, that can be used by all use cases:
> 
> https://lore.kernel.org/linux-kernel/20230317145240.363908-1-roberto.sassu@huaweicloud.com/
> 
>>> I also wanted that, for the root user, tampering with that process is
>>> as hard as if the same code runs in kernel space.
>>
>> I believe that actually making it that hard would probably mean that
>> you'd have to ensure that the process doesn't use swap (in other
>> words, it would have to run with all memory locked), because root can
>> choose where swapped pages are stored. Other than that, if you mark it
>> as a kthread so that no ptrace access is allowed, you can probably get
>> pretty close. But if you do anything like that, please leave some way
>> (like a kernel build config option or such) to enable debugging for
>> these processes.
> 
> I didn't think about the swapping part... thanks!
> 
> Ok to enable debugging with a config option.
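
To make the swap point concrete: a process can pin its own pages in RAM
with mlockall(2). The sketch below only illustrates that mechanism.
umd_lock_memory() and umd_lock_status() are hypothetical names, not part
of any posted patch, and the call can fail with EPERM or ENOMEM when the
process lacks CAP_IPC_LOCK and RLIMIT_MEMLOCK is too low.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: lock all pages currently mapped by the calling
 * process, plus all pages mapped later, so none of them can be written
 * to swap. Returns 0 on success or a negative errno value.
 */
int umd_lock_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}

/* Hypothetical helper: turn the result above into a readable string. */
const char *umd_lock_status(int rc)
{
    assert(rc <= 0); /* only 0 or a negative errno is expected */
    return rc == 0 ? "locked" : strerror(-rc);
}
```

A helper would call umd_lock_memory() before touching any sensitive data
and bail out on a non-zero result. Locking from inside the process only
shows the mechanism; for the threat model discussed here, the kernel
would presumably have to enforce it itself.
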
> 
>> But I'm not convinced that it makes sense to try to draw a security
>> boundary between fully-privileged root (with the ability to mount
>> things and configure swap and so on) and the kernel - my understanding
>> is that some kernel subsystems don't treat root-to-kernel privilege
>> escalation issues as security bugs that have to be fixed.
> 
> Yes, that is unfortunately true, and in that case the trustworthy UMD
> would not make things worse. On the other hand, on systems where that
> separation is defined, the advantage would be to run more exploitable
> code in user space, leaving the kernel safe.
> 
> I'm thinking about all the cases where the code had to be included in
> the kernel to run at the same privilege level, but would not use any of
> the kernel facilities (e.g. parsers).

Thanks for reminding me of kexec-tools. The complete image for booting a
new kernel was originally prepared in user space. With kernel lockdown,
all this code had to move into the kernel, adding a new syscall and lots
of complexity to build purgatory code, etc. Yet, this new implementation
in the kernel does not offer all features of kexec-tools, so both code
bases continue to exist and are happily diverging...
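
For reference, the syscall in question is kexec_file_load(2): user space
hands over file descriptors and the kernel does the parsing and segment
setup itself. A minimal sketch of the call follows; the wrapper name
load_kernel_from_fds() is made up for illustration, glibc provides no
wrapper, and the caller needs CAP_SYS_BOOT.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: ask the kernel to load a new kernel image and
 * an initrd from already opened file descriptors. flags is left at 0,
 * so a valid initrd_fd is expected. Returns 0 on success or a negative
 * errno value.
 */
long load_kernel_from_fds(int kernel_fd, int initrd_fd, const char *cmdline)
{
#ifdef SYS_kexec_file_load
    /* The length passed to the kernel includes the terminating NUL. */
    unsigned long cmdline_len = strlen(cmdline) + 1;

    if (syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
                cmdline_len, cmdline, 0UL) != 0)
        return -errno;
    return 0;
#else
    /* The syscall number is not defined on this architecture. */
    (void)kernel_fd; (void)initrd_fd; (void)cmdline;
    return -ENOSYS;
#endif
}
```

Compare this with the older kexec_load(2), where user space (kexec-tools)
builds the complete segment list, purgatory included, and passes it in.
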

> If the boundary is extended to user space, some of these components
> could be moved away from the kernel, and the functionality would be the
> same without decreasing the security.

All right, AFAICS your idea is limited to relatively simple cases for
now. I mean, allowing kexec-tools to run in user space is not easily
possible when UID 0 is not trusted, because kexec needs to open various
files and make various other syscalls, which would require a complex LSM
policy. It looks technically possible to write one, but then the big
question is whether it would be simpler to review and maintain than
adding more kexec-tools features to the kernel.

Anyway, I can sense a general desire to run less code in the most
privileged system environment. Roberto's proposal is one of the few that
go in this direction. What are the alternatives?

Petr T