Message-ID: <20200730153450.GH23808@casper.infradead.org>
Date: Thu, 30 Jul 2020 16:34:50 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: Anthony Yznaga <anthony.yznaga@...cle.com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-arch@...r.kernel.org, mhocko@...nel.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, x86@...nel.org,
hpa@...or.com, viro@...iv.linux.org.uk, akpm@...ux-foundation.org,
arnd@...db.de, ebiederm@...ssion.com, keescook@...omium.org,
gerg@...ux-m68k.org, ktkhai@...tuozzo.com, peterz@...radead.org,
esyr@...hat.com, jgg@...pe.ca, christian@...lner.me,
areber@...hat.com, cyphar@...har.com, steven.sistare@...cle.com
Subject: Re: [RFC PATCH 0/5] madvise MADV_DOEXEC

On Thu, Jul 30, 2020 at 05:27:05PM +0200, Christian Brauner wrote:
> On Thu, Jul 30, 2020 at 04:22:50PM +0100, Matthew Wilcox wrote:
> > On Mon, Jul 27, 2020 at 10:11:22AM -0700, Anthony Yznaga wrote:
> > > This patchset adds support for preserving an anonymous memory range across
> > > exec(3) using a new madvise MADV_DOEXEC argument. The primary benefit for
> > > sharing memory in this manner, as opposed to re-attaching to a named shared
> > > memory segment, is to ensure it is mapped at the same virtual address in
> > > the new process as it was in the old one. An intended use for this is to
> > > preserve guest memory for guests using vfio while qemu exec's an updated
> > > version of itself. By ensuring the memory is preserved at a fixed address,
> > > vfio mappings and their associated kernel data structures can remain valid.
> > > In addition, for the qemu use case, qemu instances that back guest RAM with
> > > anonymous memory can be updated.
> >
> > I just realised that something else I'm working on might be a suitable
> > alternative to this. Apologies for not realising it sooner.
> >
> > http://www.wil.cx/~willy/linux/sileby.html
>
> Just skimming: make it O_CLOEXEC by default. ;)

I appreciate the suggestion, and it makes sense for many 'return an fd'
interfaces, but the point of mshare() is to, well, share. So sharing
the fd with a child is a common use case, unlike, say, sharing a timerfd.
The only other reason to use mshare() is to pass the fd over a unix
socket to a non-child, and I submit that is far less common than wanting
to share with a child.