Message-ID: <87r29jaoov.fsf@oldenburg2.str.redhat.com>
Date: Tue, 30 Apr 2019 10:21:20 +0200
From: Florian Weimer <fweimer@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Jann Horn <jannh@...gle.com>, Kevin Easton <kevin@...rana.org>,
Andy Lutomirski <luto@...nel.org>,
Christian Brauner <christian@...uner.io>,
Aleksa Sarai <cyphar@...har.com>,
"Enrico Weigelt\, metux IT consult" <lkml@...ux.net>,
Al Viro <viro@...iv.linux.org.uk>,
David Howells <dhowells@...hat.com>,
Linux API <linux-api@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
"Serge E. Hallyn" <serge@...lyn.com>,
Arnd Bergmann <arnd@...db.de>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Kees Cook <keescook@...omium.org>,
Thomas Gleixner <tglx@...utronix.de>,
Michael Kerrisk <mtk.manpages@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Joel Fernandes <joel@...lfernandes.org>,
Daniel Colascione <dancol@...gle.com>
Subject: Re: RFC: on adding new CLONE_* flags [WAS Re: [PATCH 0/4] clone: add CLONE_PIDFD]
* Linus Torvalds:
> Note that vfork() is "exciting" for the compiler in much the same way
> "setjmp/longjmp()" is, because of the shared stack use in the child
> and the parent. It is *very* easy to get this wrong and cause massive
> and subtle memory corruption issues because the parent returns to
> something that has been messed up by the child.
Just using a wrapper around vfork is enough for that, if the return
address is saved on the stack. It's surprisingly hard to write a test
case for that, but the corruption is definitely there.
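To make the wrapper problem concrete, here is a minimal sketch in C. The names my_vfork and spawn_and_wait are hypothetical and exist only for this illustration, and the exit code 127 is just the usual shell convention for "could not exec". The unsafe wrapper is shown only in a comment: once the child returns from it, the child keeps running on the shared stack and may overwrite the wrapper's frame, including a return address stored there, before the parent resumes. The safer shape calls vfork in the same frame that goes on to call execve or _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * UNSAFE, shown only as a comment (hypothetical wrapper):
 *
 *     static pid_t my_vfork(void) { return vfork(); }
 *
 * The child returns from my_vfork() first and continues on the shared
 * stack. Anything it does afterwards may overwrite the slot holding
 * my_vfork()'s return address, which the parent later returns through.
 */

/*
 * Safer shape: vfork() is called directly in the frame that stays live
 * until the child has called execve() or _exit().
 * Returns the wait status of the child, or -1 on failure.
 */
static int spawn_and_wait(const char *path, char *const argv[],
                          char *const envp[])
{
    pid_t pid = vfork();

    if (pid == 0) {
        /* Child: only exec or _exit, nothing else. */
        execve(path, argv, envp);
        _exit(127);
    }
    if (pid < 0)
        return -1;

    /* Parent: resumes here after the child has exec'ed or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Whether the unsafe variant actually misbehaves depends on the architecture, the compiler and inlining decisions, which is part of why a reliable test case is hard to write.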
> (In fact, if I recall correctly, the _reason_ we have an explicit
> 'vfork()' entry point rather than using clone() with magic parameters
> was that the lack of arguments meant that you didn't have to
> save/restore any registers in user space, which made the whole stack
> issue simpler. But it's been two decades, so my memory is bitrotting).
That's an interesting point. Using a callback-style interface avoids
that because you never need to restore the registers in the new
subprocess. It's still appropriate to use an assembler implementation,
I think, because it will be more obviously correct.
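As a rough illustration of what a callback-style interface looks like from the caller's side, the sketch below uses the existing glibc clone() wrapper with CLONE_VM | CLONE_VFORK and a separate child stack. This is only one possible shape, not a proposal for the new interface; struct exec_args, child_fn, spawn_callback and the 64 KiB stack size are assumptions made up for the example. The child starts in child_fn on its own stack and never returns into the caller's frame, so nothing has to be restored for it there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* for clone() and the CLONE_* flags */
#endif
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical argument bundle handed to the callback. */
struct exec_args {
    const char *path;
    char *const *argv;
    char *const *envp;
};

/* Runs in the child, on the child's own stack. Never returns. */
static int child_fn(void *p)
{
    struct exec_args *a = p;

    execve(a->path, a->argv, a->envp);
    _exit(127);
}

/* Returns the wait status of the child, or -1 on failure. */
static int spawn_callback(const char *path, char *const argv[],
                          char *const envp[])
{
    enum { STACK_SIZE = 64 * 1024 };   /* assumed to be enough for execve */
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    struct exec_args a = { path, argv, envp };

    /* The stack grows down on the common architectures, so pass the top.
     * CLONE_VFORK suspends the parent until the child execs or exits. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, &a);

    int status = -1;
    if (pid > 0 && waitpid(pid, &status, 0) < 0)
        status = -1;

    /* Safe to free: the child no longer runs on this stack. */
    free(stack);
    return status;
}
```

The register handling that the quoted text worries about is hidden inside the clone() wrapper, which glibc implements in assembler.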
> Also, particularly if you have a big address space, vfork()+execve()
> can be quite a bit faster than fork()+execve(). Linux fork() is pretty
> efficient, but if you have gigabytes of VM space to copy, it's going
> to take time even if you do it fairly well.
vfork is also more benign from a memory accounting perspective. In some
environments, it's not possible to call fork from a large process
because the accounting assumes (conservatively) that the new process
will dirty a lot of its private memory.
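For a large process that only wants to start a helper program, one way to avoid a full fork without writing vfork code by hand is posix_spawn. The sketch below is illustrative only; spawn_posix is a hypothetical helper name. To my knowledge, recent glibc versions on Linux implement posix_spawn on top of clone with CLONE_VM | CLONE_VFORK, but that is an implementation detail and not something POSIX guarantees.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Start `path` with the given argv/envp and wait for it.
 * Returns the wait status of the child, or -1 on failure.
 */
static int spawn_posix(const char *path, char *const argv[],
                       char *const envp[])
{
    pid_t pid;

    /* posix_spawn() returns an error number directly, not via errno. */
    int err = posix_spawn(&pid, path, NULL, NULL, argv, envp);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Because no copy of the parent's address space is created, the accounting concern about the child dirtying private memory should not arise in the same way as with fork.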
Thanks,
Florian