lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <X/jr8QfeolQwn39f@redhat.com>
Date:   Fri, 8 Jan 2021 18:34:09 -0500
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Jason Gunthorpe <jgg@...pe.ca>, Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, Yu Zhao <yuzhao@...gle.com>,
        Peter Xu <peterx@...hat.com>,
        Pavel Emelyanov <xemul@...nvz.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Minchan Kim <minchan@...nel.org>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        John Hubbard <jhubbard@...dia.com>,
        Leon Romanovsky <leonro@...dia.com>, Jan Kara <jack@...e.cz>,
        Kirill Tkhai <ktkhai@...tuozzo.com>
Subject: Re: [PATCH 0/2] page_count can't be used to decide when wp_page_copy

On Fri, Jan 08, 2021 at 10:31:24AM -0800, Andy Lutomirski wrote:
> Can we just remove vmsplice() support?  We could make it do a normal

The single case I've seen vmsplice used so far, that was really cool
is localhost live migration of qemu. However despite really cool, it
wasn't merged in the end, and I don't recall exactly why.

There are even more efficient (but slightly more complex) ways to do
that than vmsplice: using MAP_SHARED gigapages or MAP_SHARED tmpfs
with THP opted-in in the tmpfs mount, as guest physical memory instead
of anon memory and finding a way not having it cleared by kexec, so
you can also upgrade the host kernel and not just qemu... is a way
more optimal way to PIN and move all pages through the pipe and still
having to pay a superfluous copy on destination.

My guess why it's not popular, and I may be completely wrong on this
since I basically never used vmsplice (other than to proof of concept
DoS my phone to verify the long term GUP pin exploit works), is that
vmsplice is a more efficient, but not the most efficient option.

Exactly like in the live migration in place, it's always more
efficient to share a tmpfs THP backed region and have true zero copy,
than going through a pipe that still does one copy at the receiving
end. It may also be simpler and it's not dependent on F_SETPIPE_SIZE
obscure tunings. So in the end it's still too slow for apps that
requires maximum performance, and not worth the extra work for those
that don't.

I love vmsplice conceptually, just I'd rather prefer an luser cannot
run it.

> copy, thereby getting rid of a fair amount of nastiness and potential
> attacks.  Even ignoring issues relating to the length of time that the
> vmsplice reference is alive, we also have whatever problems could be
> caused by a malicious or misguided user vmsplice()ing some memory and
> then modifying it.

Sorry to ask but I'm curious, what also goes wrong if the user
modifies memory under GUP pin from vmsplice? That's not obvious to
see.

Thanks,
Andrea

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ