Message-ID: <20160711182654.GA19160@redhat.com>
Date: Mon, 11 Jul 2016 20:26:54 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Dmitry Safonov <dsafonov@...tuozzo.com>,
Michal Hocko <mhocko@...e.com>,
Vladimir Davydov <vdavydov@...tuozzo.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Ingo Molnar <mingo@...hat.com>,
Cyrill Gorcunov <gorcunov@...nvz.org>, xemul@...tuozzo.com,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>
Subject: Re: [PATCHv2 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
On 07/10, Andy Lutomirski wrote:
>
> On Thu, Jul 7, 2016 at 4:11 AM, Dmitry Safonov <dsafonov@...tuozzo.com> wrote:
> > On 07/06/2016 05:30 PM, Andy Lutomirski wrote:
> >>
> >> On Wed, Jun 29, 2016 at 3:57 AM, Dmitry Safonov <dsafonov@...tuozzo.com>
> >> wrote:
> >>>
> >>> Add API to change vdso blob type with arch_prctl.
> >>> As this is useful only for the needs of CRIU, expose
> >>> this interface under CONFIG_CHECKPOINT_RESTORE.
> >>
> >>
> >>> +#ifdef CONFIG_CHECKPOINT_RESTORE
> >>> + case ARCH_MAP_VDSO_X32:
> >>> + return do_map_vdso(VDSO_X32, addr, false);
> >>> + case ARCH_MAP_VDSO_32:
> >>> + return do_map_vdso(VDSO_32, addr, false);
> >>> + case ARCH_MAP_VDSO_64:
> >>> + return do_map_vdso(VDSO_64, addr, false);
> >>> +#endif
> >>> +
> >>
> >>
> >> This will have an odd side effect: if the old mapping is still around,
> >> its .fault will start behaving erratically.
Yes, but I am not sure I fully understand your concerns, so let me ask...
Do we really care? I mean, the kernel can't crash or anything like that;
the old vdso mapping can just fault in the "wrong" page from the new
vdso_image, right?
The user of prctl(ARCH_MAP_VDSO) should understand what it does and unmap
the old vdso anyway.
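To make "unmap the old vdso" concrete, here is a minimal userspace sketch of how a caller might locate the existing vdso range first. It assumes the mapping appears under the "[vdso]" label in /proc/self/maps; find_vdso_range() is a hypothetical helper name, not an existing API.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find the [vdso] mapping of the calling process in /proc/self/maps.
 * Returns 0 and fills *start and *end on success, -1 if it is not found.
 * find_vdso_range() is a hypothetical helper, not an existing API.
 */
int find_vdso_range(unsigned long *start, unsigned long *end)
{
	char line[512];
	int ret = -1;
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* The kernel labels the vdso vma with this pseudo-path. */
		if (!strstr(line, "[vdso]"))
			continue;
		/* Each line starts with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", start, end) == 2)
			ret = 0;
		break;
	}
	fclose(f);
	return ret;
}
```

A caller would then pass (void *)start and end - start to munmap() once the new blob is in place. The sketch deliberately stops short of that, since unmapping a vdso that libc is still using can crash the process.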
> >> I wonder if we can either
> >> reliably zap the old vma (or check that it's not there any more)
> >> before mapping a new one
However, I think this is right anyway, please see below...
> >> or whether we can associate the vdso image
> >> with the vma (possibly by having a separate vm_special_mapping for
> >> each vdso_image.
Yes, I too thought it would be nice to do this, regardless.
But as you said, we probably want to limit the number of special mappings
an application can create:
> >> I'm also a bit concerned that __install_special_mapping might not get
> >> all the cgroup and rlimit stuff right. If we ensure that any old
> >> mappings are gone, then the damage is bounded, but otherwise someone
> >> might call this in a loop and fill their address space with arbitrary
> >> numbers of special mappings.
I think you are right, we should not allow user-space to abuse the special
mappings. Even if, iiuc, only RLIMIT_AS matters in this case...
> Oleg, want to sanity-check us? Do you believe that if .mremap ensures
> that only entire vma can be remapped
Yes I think this makes sense. And damn we should kill arch_remap() ;)
> and .close ensures that only the
> whole vma can be unmapped,
How? ->close() can't return an error.
And do_munmap() doesn't necessarily call ->close().
> Or will we have issues with
> mprotect?
Yes, __split_vma() doesn't call ->close() either. ->open() can't help...
So it seems that we should do this by hand somehow. But in fact, what
I actually think right now is that I am totally confused and got lost ;)
Oleg.