Message-ID: <20170807134648.GI32434@dhcp22.suse.cz>
Date: Mon, 7 Aug 2017 15:46:48 +0200
From: Michal Hocko <mhocko@...nel.org>
To: riel@...hat.com
Cc: linux-kernel@...r.kernel.org, mike.kravetz@...cle.com,
linux-mm@...ck.org, fweimer@...hat.com, colm@...costs.net,
akpm@...ux-foundation.org, keescook@...omium.org,
luto@...capital.net, wad@...omium.org, mingo@...nel.org,
kirill@...temov.name, dave.hansen@...el.com,
linux-api@...r.kernel.org
Subject: Re: [PATCH v2 0/2] mm,fork,security: introduce MADV_WIPEONFORK
On Mon 07-08-17 15:22:57, Michal Hocko wrote:
> This is a user-visible API, so make sure you CC linux-api (added)
>
> On Sun 06-08-17 10:04:23, Rik van Riel wrote:
> > v2: fix MAP_SHARED case and kbuild warnings
> >
> > Introduce MADV_WIPEONFORK semantics, which result in a VMA being
> > empty in the child process after fork. This differs from MADV_DONTFORK
> > in one important way.
> >
> > If a child process accesses memory that was MADV_WIPEONFORK, it
> > will get zeroes. The address ranges are still valid, they are just empty.
> >
> > If a child process accesses memory that was MADV_DONTFORK, it will
> > get a segmentation fault, since those address ranges are no longer
> > valid in the child after fork.
> >
> > Since MADV_DONTFORK also seems to be used to allow very large
> > programs to fork in systems with strict memory overcommit restrictions,
> > changing the semantics of MADV_DONTFORK might break existing programs.
> >
> > The use case is libraries that store or cache information, and
> > want to know that they need to regenerate it in the child process
> > after fork.

How do they know that they need to regenerate if they do not get a SEGV?
Are they going to assume that a read of zeros means "must init again"? Isn't
that too fragile? Or do they play other tricks, like parsing /proc/self/smaps
and reading the flag?

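To make the "read of zeros means must init again" pattern concrete, here is a
minimal sketch of how a library might use it. It assumes the flag value that
was eventually merged (MADV_WIPEONFORK, Linux 4.14 and later); struct
lib_state and the helper names are hypothetical and do not come from the patch
series. The state sits in its own private anonymous page, and a nonzero
"initialized" field doubles as the fork detector.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18 /* value in the Linux uapi headers since 4.14 */
#endif

/* Hypothetical library state kept in its own wipe-on-fork page. */
struct lib_state {
    int initialized;  /* nonzero once the state has been set up */
    unsigned seed;    /* stand-in for PRNG or cache contents */
};

/* Map one private anonymous page and mark it wipe-on-fork.
 * Returns NULL if mmap fails or the kernel rejects the advice. */
static struct lib_state *state_alloc(size_t *len_out)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (madvise(p, len, MADV_WIPEONFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    *len_out = len;
    return p;
}

/* The check under discussion: a zeroed flag is read as "set up again". */
static int state_needs_reinit(const struct lib_state *s)
{
    assert(s != NULL);
    return s->initialized == 0;
}

/* Returns 1 if the child saw wiped state while the parent kept its copy,
 * 0 if the child inherited the contents, -1 on error or lack of support. */
static int child_sees_wiped_state(void)
{
    size_t len = 0;
    struct lib_state *s = state_alloc(&len);
    if (s == NULL)
        return -1;

    s->initialized = 1;
    s->seed = 12345;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, len);
        return -1;
    }
    if (pid == 0)  /* child: report through the exit status */
        _exit(state_needs_reinit(s) ? 0 : 1);

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        !state_needs_reinit(s))
        result = (WEXITSTATUS(status) == 0);

    munmap(s, len);
    return result;
}
```

Note that the sketch treats a failed madvise() as "unsupported" and gives up.
A real library would still need a fallback on such kernels, which is exactly
the fragility being questioned here.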
> > Examples of this would be:
> > - systemd/pulseaudio API checks (fail after fork)
> > (replacing a getpid check, which is too slow without a PID cache)
> > - PKCS#11 API reinitialization check (mandated by specification)
> > - glibc's upcoming PRNG (reseed after fork)
> > - OpenSSL PRNG (reseed after fork)
> >
> > The security benefits of a forking server having a re-initialized
> > PRNG in every child process are pretty obvious. However, due to
> > libraries having all kinds of internal state, and programs getting
> > compiled with many different versions of each library, it is
> > unreasonable to expect calling programs to re-initialize everything
> > manually after fork.
> >
> > A further complication is the proliferation of clone flags,
> > programs bypassing glibc's functions to call clone directly,
> > and programs calling unshare, causing the glibc pthread_atfork
> > hook to not get called.
> >
> > It would be better to have the kernel take care of this automatically.
> >
> > This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
> >
> > https://man.openbsd.org/minherit.2

I would argue that a MAP_$FOO flag would be more appropriate. Or do you
see any cases where such a special mapping would need to change its
semantics and inherit the content over the fork again?

I do not like using madvise for this because it is advice, and as such it
can be ignored or left unimplemented; that should not have any correctness
effects on the child process.
--
Michal Hocko
SUSE Labs