lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 21 May 2021 20:18:25 +0100
From:   "Amanieu d'Antras" <amanieu@...il.com>
To:     Steven Price <steven.price@....com>
Cc:     Arnd Bergmann <arnd@...nel.org>,
        Ryan Houdek <Houdek.Ryan@...-emu.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        David Laight <David.Laight@...lab.com>,
        Mark Brown <broonie@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RESEND PATCH v4 8/8] arm64: Allow 64-bit tasks to invoke compat syscalls

On Fri, May 21, 2021 at 9:51 AM Steven Price <steven.price@....com> wrote:
> >> In those cases to correctly emulate seccomp, isn't Tango is going to
> >> have to implement the seccomp filter in user space?
> >
> > I have not implemented user-mode seccomp emulation because it can
> > trivially be bypassed by spawning a 64-bit child process which runs
> > outside Tango. Even when spawning another translated process, the
> > user-mode filter will not be preserved across an execve.
>
> Clearly if you have user-mode seccomp emulation then you'd hook execve
> and either install the real BPF filter (if spawning a 64 bit child
> outside Tango) or ensure that the user-mode emulation is passed on to
> the child (if running within Tango).

Spawning another process is just an example. Fundamentally, Tango is
not intended or designed to be a sandbox around the 32-bit code. For
example, many of the newer ioctls use u64 instead of a pointer type to
avoid the need for a compat_ioctl handler. This means that such ioctls
could be abused to read/write any address in the process address
space, including the code that is performing the usermode seccomp
emulation.

> You already have a potential 'issue' here of a 64 bit process setting up
> a seccomp filter and then execve()ing a 32 bit (Tango'd) process. The
> set of syscalls needed for the system which supports AArch32 natively is
> going to be different from the syscalls needed for Tango. (Fundamentally
> this is a major limitation with the whole seccomp syscall filtering
> approach).

The specific example I had in mind here is Android which installs a
global seccomp filter on the zygote process from which app processes
are forked from. This filter is designed for mixed arm32/arm64 systems
and therefore has syscall whitelists for both AArch32 and AArch64.
This filter allows 32-bit processes to spawn 64-bit processes and
vice-versa: for example, many 32-bit apps will invoke another 32-bit
executable via system() which uses a 64-bit /system/bin/sh.

> >> I guess the question comes down to how big a hole is
> >> syscall_in_tango_whitelist() - if Tango only requires a small set of
> >> syscalls then there is still some security benefit, but otherwise this
> >> doesn't seem like a particularly big benefit considering you're already
> >> going to need the BPF infrastructure in user space.
> >
> > Currently Tango only whitelists ~50 syscalls, which is small enough to
> > provide security benefits and definitely better than not supporting
> > seccomp at all.
>
> Agreed, and I don't want to imply that this approach is necessarily
> wrong. But given that the approach of getting the kernel to do the
> compat syscall filtering is not perfect, I'm not sure in itself it's a
> great justification for needing the kernel to support all the compat
> syscalls.

I feel that exposing all compat syscalls to 64-bit processes is better
than the alternative of only exposing a subset of them. Of the top of
my head I can think of quite a few compat syscalls that cannot be
fully emulated in userspace and would need to be exposed in the
kernel:
- mmap/mremap/shmat/io_setup: anything that allocates VM space needs
to return a pointer in the low 4GB.
- ioctl: too many variants to reasonably maintain a separate compat
layer in userspace.
- getdents/lseek: ext4 uses 32-bit directory offsets for 32-bit processes.
- get_robust_list/set_robust_list: different in-memory ABI for
32/64-bit processes.
- open: don't force O_LARGEFILE for 32-bit processes.
- io_uring_create: different in-memory ABI for 32/64-bit processes.
- (and possibly many others)

Also consider the churn involved when adding a new syscall which
behaves differently in compat processes: rather than just using
in_compat_syscall() or wiring up a COMPAT_SYSCALL_DEFINE, a compat
variant of this syscall would also need to be added to the 64-bit
syscall table to support translation layers like Tango and FEX.

> One other thought: I suspect in practise there aren't actually many
> variations in the BPF programs used with seccomp. It may well be quite
> possible to convert the 32-bit syscall filtering programs to filter the
> equivalent 64-bit syscalls that Tango would use. Sadly this would be
> fragile if a program used a BPF program which didn't follow the "normal"
> pattern.

This might work for simple filters that only look at the syscall
number, but becomes much harder when the filter also inspects the
syscall arguments.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ