Message-ID: <20190516134220.GB24860@angband.pl>
Date: Thu, 16 May 2019 15:42:20 +0200
From: Adam Borowski <kilobyte@...band.pl>
To: Kirill Tkhai <ktkhai@...tuozzo.com>
Cc: akpm@...ux-foundation.org, dan.j.williams@...el.com,
mhocko@...e.com, keith.busch@...el.com,
kirill.shutemov@...ux.intel.com, pasha.tatashin@...cle.com,
alexander.h.duyck@...ux.intel.com, ira.weiny@...el.com,
andreyknvl@...gle.com, arunks@...eaurora.org, vbabka@...e.cz,
cl@...ux.com, riel@...riel.com, keescook@...omium.org,
hannes@...xchg.org, npiggin@...il.com,
mathieu.desnoyers@...icios.com, shakeelb@...gle.com, guro@...com,
aarcange@...hat.com, hughd@...gle.com, jglisse@...hat.com,
mgorman@...hsingularity.net, daniel.m.jordan@...cle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH RFC 0/5] mm: process_vm_mmap() -- syscall for duplication
a process mapping
On Thu, May 16, 2019 at 04:10:07PM +0300, Kirill Tkhai wrote:
> On 15.05.2019 22:38, Adam Borowski wrote:
> > On Wed, May 15, 2019 at 06:11:15PM +0300, Kirill Tkhai wrote:
> >> This patchset adds a new syscall, which makes it possible
> >> to clone a mapping from one process to another process.
> >> The syscall supplements the functionality provided
> >> by the process_vm_writev() and process_vm_readv() syscalls,
> >> and it may be useful in many situations.
> >>
> >> For example, it allows making a zero copy of data
> >> where process_vm_writev() was previously used:
> >
> > I wonder, why not optimize the existing interfaces to do zero copy if
> > properly aligned? No need for a new syscall, and old code would immediately
> > benefit.
>
> Because this is just not possible. You can't zero-copy anonymous pages
> of a process to pages of a remote process, when they are different pages.

fork() manages that, and so does KSM. Like KSM, you want to make a page
shared -- you just skip the comparison step as you want to overwrite the old
contents.

And there's no need to touch the page, as fork() manages that fine no matter
if the page is resident, anonymous in swap, or file-backed, all without
reading from swap.

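A minimal userspace sketch of the fork() behaviour referred to above; it is
not code from the patchset. The child sees the parent's data through the
inherited private mapping, and a later write in the child stays private to
the child. The helper name cow_share_demo() is made up for illustration, and
the check only covers the visible semantics: it does not prove that the
kernel actually shared the physical page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the child read the parent's data via the inherited mapping
 * and the child's write did not leak back into the parent, 0 if either
 * check failed, -1 on a setup error.
 */
int cow_share_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* Private anonymous mapping: the kind of page the thread is about. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    strcpy(buf, "parent");

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: the data is visible without any explicit copy. */
        int ok = strcmp(buf, "parent") == 0;
        /* Writing triggers copy-on-write; the parent keeps its page. */
        strcpy(buf, "child");
        _exit(ok ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, len);
        return -1;
    }

    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_intact = strcmp(buf, "parent") == 0;

    munmap(buf, len);
    return child_ok && parent_intact;
}
```

Calling cow_share_demo() from a main() and checking for a return value of 1
is enough to try it out.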
> >> There are several problems with process_vm_writev() in this example:
> >>
> >> 1) it causes a page fault on remote process memory, and it forces
> >> allocation of a new page (if one was not preallocated);
> >>
> >> 2) the amount of memory for this example is doubled for a moment --
> >> n pages in the current task and n pages in the remote task are
> >> occupied at the same time;
> >>
> >> 3) received data has no chance of being properly swapped for
> >> a long time.
> >
> > That'll handle all of your above problems, except for making pages
> > subject to CoW if written to. But if making pages writeably shared is
> > desired, the old functions have a "flags" argument that doesn't yet have a
> > single bit defined.
Meow!
--
⢀⣴⠾⠻⢶⣦⠀ Latin: meow 4 characters, 4 columns, 4 bytes
⣾⠁⢠⠒⠀⣿⡁ Greek: μεου 4 characters, 4 columns, 8 bytes
⢿⡄⠘⠷⠚⠋ Runes: ᛗᛖᛟᚹ 4 characters, 4 columns, 12 bytes
⠈⠳⣄⠀⠀⠀⠀ Chinese: 喵 1 character, 2 columns, 3 bytes <-- best!