[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080129113646.GA76@tv-sign.ru>
Date: Tue, 29 Jan 2008 14:36:46 +0300
From: Oleg Nesterov <oleg@...sign.ru>
To: Matt Helsley <matthltc@...ibm.com>
Cc: akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
dhowells@...hat.com, ebiederm@...ssion.com, hch@....de,
viro@...iv.linux.org.uk
Subject: Re: + fix-procfs-task-exe-symlink.patch added to -mm tree
s/mm-commits/lkml/
On 01/28, Matt Helsley wrote:
>
> On Sun, 2008-01-27 at 19:25 +0300, Oleg Nesterov wrote:
> >
> > Why? Linux doesn't allow sys_mmap(MAP_EXECUTABLE), so any VM_EXECUTABLE vma
> > should refer to the same bprm->file which was mapped by elf_map(), no?
>
> You're right. So in this case the kernel wouldn't find any VMA marked
> VM_EXECUTABLE and it would return with -ENOENT. The same would happen
> with this patch since we drop the extra file reference once the
> VM_EXECTUABLE VMAs disappear.
OK, thanks. This leads to another question which I forgot to ask.
This patch has a lot of complications because it tries to preserve the
current behaviour: we release the bprm->file when all VM_EXECTUABLE vmas
are unmapped. Q: is this so important/useful? I don't think this is very
common case, and I don't quite understand why it is critical to release
the file. To unmount fs after starting the app? One can always copy
the file before execing, or do "/lib/ld-linus.so application" and then
unmap the vmas.
(I am not arguing, just curious).
> > I don't understand why do we need ->exe_file_lock. Afaics, all callers of
> > added_exe_file_vma/removed_exe_file_vma must hold ->mmap_sem, yes? But this
> > means get_mm_exe_file() can use down_read(mm->map_sem). No?
>
> Yes, I could get the task's ->mmap_sem there too and reuse the mmap_sem
> rather than add a lock. That allows nearly any task to grab another
> tasks mmap_sem simply by doing a readlink on /proc/pid/exe. So I thought
> avoiding reuse of the mmap_sem might be best.
>
> Do you still think it would be better to reuse mmap_sem?
Well, we only need down_read(mmap_sem) for the very short time. /proc/pid/maps
is much "worse" in this sense.
> > > @@ -409,6 +410,7 @@ void mmput(struct mm_struct *mm)
> > > if (atomic_dec_and_test(&mm->mm_users)) {
> > > exit_aio(mm);
> > > exit_mmap(mm);
> > > + set_mm_exe_file(mm, NULL);
> >
> > This change looks unneeded. exit_mmap() removes all vmas. The last VM_EXECUTABLE
> > vma should clear ->exe_file via removed_exe_file_vma() ?
>
> You're right -- it's redundant. I'll fix that.
Sorry, I was wrong.
mmput() has to release ->exe_file if it is called when exec fails before the
first do_mmmap(MAP_EXECUTABLE). This also means that it is not completely
trivial to set ->exe_file before exec_mmap(), it can fail. This is solvable,
but I'm not sure we should do this.
Still, the accounting looks a little bit fragile to me. flush_old_exec()
increments ->f_count but sets ->num_exe_file_vmas = 0 because we know that
the next elf_map() will bump ->num_exe_file_vmas and thus "sync" 2 counters.
But I don't see how to do better if we really want to release the file when
VM_EXECUTABLE disappears.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists