Message-ID: <20110916183150.GA26226@shutemov.name>
Date: Fri, 16 Sep 2011 21:31:50 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Cyrill Gorcunov <gorcunov@...il.com>
Cc: Vasiliy Kulikov <segoon@...nwall.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Pavel Machek <pavel@....cz>, Andrew Morton <akpm00@...il.com>,
linux-kernel@...r.kernel.org, containers@...ts.osdl.org,
linux-fsdevel@...r.kernel.org,
Pavel Emelyanov <xemul@...allels.com>,
James Bottomley <jbottomley@...allels.com>,
Nathan Lynch <ntl@...ox.com>, Zan Lynx <zlynx@....org>,
Daniel Lezcano <dlezcano@...ibm.com>,
Tejun Heo <tj@...nel.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Al Viro <viro@...IV.linux.org.uk>
Subject: Re: [patch 2/2] fs, proc: Introduce the /proc/<pid>/map_files/
directory v12
On Fri, Sep 16, 2011 at 10:26:54PM +0400, Cyrill Gorcunov wrote:
> On Fri, Sep 16, 2011 at 10:11:46PM +0400, Vasiliy Kulikov wrote:
> ...
> >
> > Yep, with CAP_SYS_ADMIN check there should be no issues here.
> >
> > Reviewed-by: Vasiliy Kulikov <segoon@...nwall.com>
> >
>
> Here we go. Andrew, should I resend the whole series (ie two patches?)
>
> Cyrill
> ---
> From: Pavel Emelyanov <xemul@...allels.com>
> Subject: [PATCH] fs, proc: Introduce the /proc/<pid>/map_files/ directory v14
>
> This directory behaves similarly to /proc/<pid>/fd/: it contains one symlink
> for each file-backed mapping. The name of a symlink is "vma->vm_start-vma->vm_end"
> and the target is the mapped file. Opening a symlink yields a file that points
> to exactly the same inode as the vma's file.
>
> For example, ls -l of some arbitrary /proc/<pid>/map_files/ shows:
>
> | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
> | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
> | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
> | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
> | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
>
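As a rough illustration of the naming scheme in the listing above, a user-space
tool could split an entry name back into its address range. This is only a
sketch: parse_map_files_name() is a hypothetical helper, not something from the
patch, and it assumes the "<start>-<end>" lowercase-hex form shown in the listing.

```c
#include <stdio.h>

/* Hypothetical helper (not from the patch): split a map_files entry
 * name of the form "<start>-<end>" into its two addresses.
 * Returns 0 on success, -1 if the name does not parse as two hex
 * numbers, has trailing characters, or describes an empty range. */
int parse_map_files_name(const char *name,
                         unsigned long long *start,
                         unsigned long long *end)
{
    int consumed = 0;

    /* %n records how many characters were consumed, so anything left
     * over after the end address can be rejected. */
    if (sscanf(name, "%llx-%llx%n", start, end, &consumed) != 2)
        return -1;
    if (name[consumed] != '\0')
        return -1;
    if (*start >= *end)
        return -1;
    return 0;
}
```

unsigned long long is used here so the sketch also copes with 64-bit addresses
when built as a 32-bit program.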
> This *helps* the checkpointing process in three ways:
>
> 1. When dumping a task's mappings we know exactly which file is mapped by a
> particular region. We find this out by opening the /proc/$pid/map_files/$address
> symlink, the same way we do with file descriptors.
>
> 2. It also helps in determining which anonymous shared mappings are shared with
> each other, by comparing their inodes.
>
> 3. When restoring a set of processes where two of them have a shared mapping, we
> map the memory in the 1st one, then open its /proc/$pid/map_files/$address file
> and map it in the 2nd task.
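A minimal user-space sketch of points (1) and (2), under stated assumptions:
map_files_path() and same_file() are hypothetical helper names, and the path
layout is taken from the description above. A dumper would stat() the
resulting path (stat() follows the symlink) and compare the results for two
mappings.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: format the map_files path for one mapping.
 * Returns 0 on success, -1 if the buffer is too small. */
int map_files_path(char *buf, size_t len, pid_t pid,
                   unsigned long long start, unsigned long long end)
{
    int n = snprintf(buf, len, "/proc/%ld/map_files/%llx-%llx",
                     (long)pid, start, end);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Two stat results refer to the same file when both the device and
 * the inode number match. */
int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}
```

Comparing st_dev as well as st_ino matters because inode numbers are only
unique within one filesystem.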
>
> Using /proc/$pid/maps for this is quite inconvenient, since it requires
> repeatedly re-reading and re-parsing this text file, which slows down the restore
> procedure significantly. Also, as pointed out in (3), it is much easier to use a
> top-level shared mapping in children as /proc/$pid/map_files/$address when needed.
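For contrast, this is roughly the per-line text parsing that the
/proc/$pid/maps route involves. It is a sketch only: struct maps_entry and
parse_maps_line() are hypothetical names, and the field layout assumed is the
usual "start-end perms offset major:minor inode [path]" form of a maps line.

```c
#include <stdio.h>

/* Hypothetical container for the fields of one maps line. */
struct maps_entry {
    unsigned long long start, end, offset;
    unsigned long inode;
    char perms[5];
    char path[256];
};

/* Parse one line of /proc/<pid>/maps. Returns 0 on success, -1 if the
 * seven mandatory fields are not all present. path is left empty for
 * mappings without a name; a path containing whitespace is cut at the
 * first blank, which is one more reason text parsing is awkward. */
int parse_maps_line(const char *line, struct maps_entry *e)
{
    unsigned int major, minor;
    int n;

    e->path[0] = '\0';
    n = sscanf(line, "%llx-%llx %4s %llx %x:%x %lu %255s",
               &e->start, &e->end, e->perms, &e->offset,
               &major, &minor, &e->inode, e->path);
    return n >= 7 ? 0 : -1;
}
```

A restorer going this way has to run such a parser over the whole file each
time it needs one mapping, whereas a map_files/ entry can be opened directly
by name.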
>
> v2: (spotted by Tejun Heo)
> - /proc/<pid>/mfd changed to /proc/<pid>/map_files
> - find_vma helper is used instead of linear search
> - routines are re-grouped
> - d_revalidate is set now
>
> v3:
> - d_revalidate reworked, now it should drop no longer valid dentries (Tejun Heo)
> - ptrace_may_access added into proc_map_files_lookup (Vasiliy Kulikov)
> - because of filldir (which eventually might need to lock mmap_sem)
> the proc_map_files_readdir() was reworked to call proc_fill_cache()
> with unlocked mmap_sem
>
> v4: (feedback by Tejun Heo and Vasiliy Kulikov)
> - instead of saving data in proc_inode, the dentry name now
> carries both vm_start and vm_end
> - d_revalidate now honors task credentials
>
> v5: (feedback by Kirill A. Shutemov)
> - don't forget to release mmap_sem on error path
>
> v6:
> - sizeof is now used in map_files_info, which shrinks the member a bit on
> x86-32 (by Kirill A. Shutemov)
> - map_name_to_addr returns -EINVAL instead of -1
> which is more appropriate (by Tejun Heo)
>
> v7:
> - add [get/set]attr handlers for
> proc_map_files_inode_operations (by Vasiliy Kulikov)
>
> v8:
> - Kirill A. Shutemov spotted a parasite semicolon
> which ruined the ptrace_check call, fixed.
>
> v9: (feedback by Andrew Morton)
> - find_exact_vma moved into include/linux/mm.h as an inline helper
> - proc_map_files_setattr uses either kmalloc or vmalloc depending
> on how many objects are to be allocated
> - no more map_name_to_addr; dname_to_vma_addr is introduced instead
> and it uses sscanf; because in one case find_exact_vma() is used
> only to confirm the existence of a vma area, a boolean flag is used
> - fancy justification dropped
> - the proc_map_files_get/setattr handlers are still left untouched
> until the additional fd/ patches are applied first.
>
> v10: (feedback by Andrew Morton)
> - flex_arrays are used instead of kmalloc/vmalloc calls
> - map_files_d_revalidate uses ptrace_may_access for
> security reasons (by Vasiliy Kulikov)
>
> v11:
> - should use fput and drop !ret test from a loop code
> (feedback by Andrew Morton)
> - no need for 'used' variable, use existing
> nr_files with file->pos predicate
> - if preallocation fails no need to go further,
> simply release mmap semaphore and jump out
>
> v12:
> - rework map_files_d_revalidate to make sure
> the task gets released on return (by Vasiliy Kulikov)
>
> v13:
> - proc_map_files_inode_operations are set to be the same
> as proc_fd_inode_operations, ie to include .permission
> pointing to proc_fd_permission
>
> v14: (by Vasiliy Kulikov)
> - for security reasons map_files/ entries are accessible only to
> readers with CAP_SYS_ADMIN credentials
>
> Signed-off-by: Pavel Emelyanov <xemul@...allels.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov@...nvz.org>
> Reviewed-by: Vasiliy Kulikov <segoon@...nwall.com>
> CC: Tejun Heo <tj@...nel.org>
> CC: Vasiliy Kulikov <segoon@...nwall.com>
> CC: "Kirill A. Shutemov" <kirill@...temov.name>
> CC: Alexey Dobriyan <adobriyan@...il.com>
> CC: Al Viro <viro@...IV.linux.org.uk>
> CC: Andrew Morton <akpm@...ux-foundation.org>
> CC: Pavel Machek <pavel@....cz>
My Reviewed-by is in force.
--
Kirill A. Shutemov