Message-ID: <24c47463f9b469bdc03e415d953d1ca926d83680.camel@xry111.site>
Date: Sun, 25 Feb 2024 15:32:23 +0800
From: Xi Ruoyao <xry111@...111.site>
To: Icenowy Zheng <uwu@...nowy.me>, Huacai Chen <chenhuacai@...nel.org>,
WANG Xuerui <kernel@...0n.name>
Cc: linux-api@...r.kernel.org, Arnd Bergmann <arnd@...db.de>, Christian
Brauner <brauner@...nel.org>, Kees Cook <keescook@...omium.org>, Xuefeng Li
<lixuefeng@...ngson.cn>, Jianmin Lv <lvjianmin@...ngson.cn>, Xiaotian Wu
<wuxiaotian@...ngson.cn>, WANG Rui <wangrui@...ngson.cn>, Miao Wang
<shankerwangmiao@...il.com>, "loongarch@...ts.linux.dev"
<loongarch@...ts.linux.dev>, linux-arch <linux-arch@...r.kernel.org>, Linux
Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Chromium sandbox on LoongArch and statx -- seccomp deep
argument inspection again?
On Sun, 2024-02-25 at 14:51 +0800, Icenowy Zheng wrote:
> > From my point of view, I prefer to "restore fstat", because we need
> > to
> > use the Chrome sandbox everyday (even though it hasn't been upstream
> > by now). But I also hope "seccomp deep argument inspection" can be
> > solved in the future.
>
> My idea is this problem needs syscalls to be designed with deep
> argument inspection in mind; syscalls before this should be considered
> as historical error and get fixed by restoring old syscalls.
I'd not consider fstat an error as using statx for fstat has a
performance impact (severe for some workflows), and Linus has concluded
"if the user wants fstat, give them fstat" for the performance issue:
https://sourceware.org/pipermail/libc-alpha/2023-September/151365.html
However we only want fstat (actually "newfstat" in fs/stat.c), and it
seems we don't want to resurrect newstat, newlstat, newfstatat, etc. (or
am I missing any benefit - performance or "just pleasing seccomp" - of
them compared to statx?) so we don't want to just define
__ARCH_WANT_NEW_STAT. So it seems we need to add some new #if to
fs/stat.c and include/uapi/asm-generic/unistd.h.
And no, it's not a design issue of all other syscalls. It's just the
design issue of seccomp. There's no way to design a syscall allowing
seccomp to inspect a 100-character path in its argument unless
refactoring seccomp entirely because we cannot fit a 100-character path
into 8 registers.
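To make the limitation concrete, here is a minimal userspace sketch of what a filter is handed for a statx-shaped call. struct seccomp_data (from <linux/seccomp.h>) exposes the syscall number, the arch, the instruction pointer and six 64-bit argument slots, so a path shows up only as an address. The helper names filter_view() and views_identical_after_rewrite() are made up for this illustration, and the syscall number 0 is only a placeholder; nothing here installs a real filter.

```c
#include <linux/seccomp.h>  /* struct seccomp_data: what a filter is handed */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a kernel API): build the view a filter would
 * get for a call shaped like statx(dirfd, path, flags, mask, buf).
 * Only raw register-sized values are recorded. */
static struct seccomp_data filter_view(int nr, int dirfd, const char *path,
                                       int flags, unsigned int mask, void *buf)
{
    struct seccomp_data d;

    memset(&d, 0, sizeof(d));
    d.nr = nr;                               /* placeholder syscall number */
    d.args[0] = (uint64_t)(int64_t)dirfd;
    d.args[1] = (uint64_t)(uintptr_t)path;   /* the address, not the string */
    d.args[2] = (uint64_t)(int64_t)flags;
    d.args[3] = mask;
    d.args[4] = (uint64_t)(uintptr_t)buf;
    return d;
}

/* Returns 1 if the filter's view is byte-for-byte the same before and
 * after the path buffer is rewritten in place. */
static int views_identical_after_rewrite(void)
{
    char path[100] = "";                     /* empty path */
    struct seccomp_data before = filter_view(0, 3, path, 0, 0, NULL);

    strcpy(path, "/etc/passwd");             /* same address, new contents */
    struct seccomp_data after = filter_view(0, 3, path, 0, 0, NULL);

    return memcmp(&before, &after, sizeof(before)) == 0;
}
```

An empty path and "/etc/passwd" stored at the same address produce identical views, which is why a filter working only on these values cannot tell them apart.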
As of now people do use PTRACE_PEEKDATA for "deep inspection" (actually
"debugging" the target process) but it obviously has a very severe
performance impact.
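A rough sketch of why the ptrace route is expensive: PTRACE_PEEKDATA returns one machine word per call, so copying a path out of a stopped tracee costs one syscall per word, on top of the stops themselves. The helpers peek_string() and peek_from_forked_child() are names invented for this illustration, not anything Chromium or another sandbox is known to use, and the sketch ignores the case where a word read runs off the end of a mapped page.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy a NUL-terminated string out of a stopped tracee, one word per
 * ptrace() call. Returns the string length, or -1 on error or if no NUL
 * is found within cap bytes. */
static long peek_string(pid_t pid, uintptr_t addr, char *out, size_t cap)
{
    size_t n = 0;

    while (n < cap) {
        errno = 0;
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + n), NULL);
        if (word == -1 && errno != 0)
            return -1;                  /* -1 is also a valid data word */

        size_t chunk = sizeof(word);
        if (chunk > cap - n)
            chunk = cap - n;
        memcpy(out + n, &word, chunk);

        char *nul = memchr(out + n, '\0', chunk);
        if (nul)
            return nul - out;
        n += chunk;
    }
    return -1;
}

/* Demo only: fork a child that asks to be traced and stops itself, then
 * read the string at `path` from the child's copy of the address space. */
static long peek_from_forked_child(const char *path, char *out, size_t cap)
{
    pid_t pid = fork();
    int status;
    long len = -1;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                 /* wait here for the tracer */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status))
        len = peek_string(pid, (uintptr_t)path, out, cap);

    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return len;
}
```

Reading a 100-character path this way takes about 13 PTRACE_PEEKDATA calls on a 64-bit system, and a real supervisor would have to do that for every inspected syscall.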
<rant>
Today the entire software industry is saying "do things in a declarative
way" but seccomp is completely the opposite. It's auditing *how* the
sandboxed application is doing things instead of *what* will be done.
I've raised my objections to seccomp and/or syscall allowlisting several
times after seeing so many breakages like:
- https://github.com/NetworkConfiguration/dhcpcd/issues/120
- https://gitlab.gnome.org/GNOME/tracker-miners/-/issues/252
- https://blog.pintia.cn/2018/06/27/glibc-segmentation-fault/
- http://web.archive.org/web/20210126121421/http://acm.xidian.edu.cn/discuss/thread.php?tid=148&cid=# (comment 3)
but people just keep telling me "you are wrong, you don't understand
security". Some of them even complain "seccomp is broken" as well but
still keep using it.
</rant>
--
Xi Ruoyao <xry111@...111.site>
School of Aerospace Science and Technology, Xidian University