Message-ID: <c007e3e9-e915-16f3-de31-c811ad37c44c@gmail.com>
Date:   Sun, 31 May 2020 20:10:18 +0300
From:   Paul Gofman <gofmanp@...il.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Gabriel Krisman Bertazi <krisman@...labora.com>,
        Kees Cook <keescook@...omium.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel@...labora.com,
        Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...capital.net>,
        Will Drewry <wad@...omium.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        linux-security-module@...r.kernel.org,
        Zebediah Figura <zfigura@...eweavers.com>
Subject: Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory
 areas

On 5/31/20 19:49, Matthew Wilcox wrote:
> On Sun, May 31, 2020 at 03:39:33PM +0300, Paul Gofman wrote:
>>> Paul (cc'ed) is the wine expert, but my understanding is that memory
>>> allocation and initial program load of the emulated binary will go
>>> through wine.  It does the allocation and mark the vma accordingly
>>> before returning the allocated range to the windows application.
>> Yes, exactly. Pretty much any memory allocation which Wine does needs
>> syscalls (if those are ever encountered later during executing code from
>> those areas) to be trapped by Wine and passed to Wine's implementation
>> of the corresponding Windows API function. Linux native libraries
>> loading and memory allocations performed by them go outside of Wine control.
> I don't like Gabriel's approach very much.  Could we do something like
> issue a syscall before executing a Windows region and then issue another
> syscall when exiting?  If so, we could switch the syscall entry point (ie
> change MSR_LSTAR).  I'm thinking something like a personality() syscall.
> But maybe that would be too high an overhead.
>
IIRC Gabriel had a similar idea, which we discussed. We can potentially
track the boundary between Windows and native code execution. But
issuing a syscall every time we cross that boundary may have a
prohibitive performance impact, since that happens way too often. What
we could do is put a flag somewhere, but that flag has to be per
thread. E.g., we could use Linux gs:-based thread-local storage, or an
fs:-based address (that's what Windows uses for thread-local data, and
thus Wine maintains it as well). If seccomp filters could access such a
memory location (fetch a byte from there and put it into the structure
accessible by BPF_LD), we could use SECCOMP_MODE_FILTER, I think.
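
To make the flag idea concrete, here is a minimal userspace sketch in C.
All names in it (in_windows_code, enter_windows_code, leave_windows_code,
syscall_should_trap) are hypothetical and are not existing Wine or kernel
interfaces. Crossing the boundary costs only a store to thread-local
memory, not a syscall. The check function is merely a stand-in for what a
filter would do if it could fetch that byte; as far as I know, seccomp
BPF programs today can only load from struct seccomp_data (nr, arch,
instruction_pointer, args), so the kernel side of this does not exist.

```c
#include <stdint.h>

/* Hypothetical per-thread flag (an assumption for illustration):
 * nonzero while the current thread is executing Windows code. */
static _Thread_local uint8_t in_windows_code;

/* Boundary crossings are plain stores to thread-local memory,
 * so no syscall is issued when switching sides. */
static inline void enter_windows_code(void) { in_windows_code = 1; }
static inline void leave_windows_code(void) { in_windows_code = 0; }

/* Stand-in for the decision a seccomp filter would make if it could
 * read this byte: trap syscalls only while the flag is set. */
static inline int syscall_should_trap(void)
{
    return in_windows_code != 0;
}
```

Wine would call enter_windows_code() just before jumping into Windows
code and leave_windows_code() on the way back; each thread sees only its
own copy of the flag.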

