Message-ID: <2f11576a0912070938s44172cb9mda6b49e997ac1d74@mail.gmail.com>
Date:	Tue, 8 Dec 2009 02:38:39 +0900
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	linux-kernel@...r.kernel.org, Trond.Myklebust@...app.com,
	linux-fsdevel@...r.kernel.org
Subject: Re: NFS lockdep lock misordering mmap_sem<->i_mutex_key with 
	2.6.32-git1

(Cc'ing linux-fsdevel)

2009/12/7 Andi Kleen <andi@...stfloor.org>:
> On Mon, Dec 07, 2009 at 09:19:28PM +0900, KOSAKI Motohiro wrote:
>> >
>> > While booting 2.6.32-git1 on a NFS root box I got the following
>> > lockdep warning early at boot. I haven't looked at details.
>>
>> It looks like a typical ABBA deadlock.
>>
>>  vfs_readdir                          [grab i_mutex]
>>    nfs_readdir
>>      nfs_do_filldir
>>        filldir
>>          copy_to_user
>>            [page_fault]                       [grab mmap_sem]
>>
>>  sys_mmap                             [grab mmap_sem]
>>    do_mmap_pgoff
>>      mmap_region
>>        nfs_file_mmap
>>          nfs_revalidate_mapping
>>            nfs_invalidate_mapping     [grab i_mutex]
>>
>> I guess the recent lockdep improvements exposed an old bug.
>
> Thanks for the analysis.
>
> I guess we should never do copy_*_user while holding i_mutex? There might
> be lots of cases like that.
>
> -Andi

I'm not sure what the exact VFS rule is, but at least mm/rmap.c
documents the correct order as i_mutex -> mmap_sem.

rmap.c
---------------------------------------------------------------------
 * Lock ordering in mm:
 *
 * inode->i_mutex       (while writing or truncating, not reading or faulting)
 *   inode->i_alloc_sem (vmtruncate_range)
 *   mm->mmap_sem
 *     page->flags PG_locked (lock_page)
 *       mapping->i_mmap_lock
 *         anon_vma->lock
 *           mm->page_table_lock or pte_lock
 *             zone->lru_lock (in mark_page_accessed, isolate_lru_page)
 *             swap_lock (in swap_duplicate, swap_info_get)
 *               mmlist_lock (in mmput, drain_mmlist and others)
 *               mapping->private_lock (in __set_page_dirty_buffers)
 *               inode_lock (in set_page_dirty's __mark_inode_dirty)
 *                 sb_lock (within inode_lock in fs/fs-writeback.c)
 *                 mapping->tree_lock (widely used, in set_page_dirty,
 *                           in arch-dependent flush_dcache_mmap_lock,
 *                           within inode_lock in __sync_single_inode)
-------------------------------------------------------------------------------------------------
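
To make the inversion concrete, here is a minimal userspace sketch of the
kind of order check that flags it. This is not lockdep's actual
implementation, which tracks far more state. The names toy_acquire(),
toy_release() and replay_nfs_paths() are hypothetical and exist only for
this illustration. It records which lock class was taken while another
was held, and reports when the opposite order shows up later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock classes, named after the two locks in this thread. */
enum lock_class { I_MUTEX, MMAP_SEM, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = { "i_mutex", "mmap_sem" };

/* seen[a][b] is true once b has been taken while a was held. */
static bool seen[NR_CLASSES][NR_CLASSES];

/* Lock classes currently held by the simulated call path, oldest first. */
static enum lock_class held[NR_CLASSES];
static int nr_held;

/* Record a new acquisition; return the number of inversions it exposes. */
static int toy_acquire(enum lock_class c)
{
        int inversions = 0;

        assert(nr_held < NR_CLASSES);
        for (int i = 0; i < nr_held; i++) {
                enum lock_class h = held[i];

                /* The opposite order (c held, then h taken) was seen before. */
                if (seen[c][h]) {
                        printf("inversion: %s -> %s, but %s -> %s seen earlier\n",
                               class_name[h], class_name[c],
                               class_name[c], class_name[h]);
                        inversions++;
                }
                seen[h][c] = true;
        }
        held[nr_held++] = c;
        return inversions;
}

/* Release the most recently acquired lock. */
static void toy_release(void)
{
        assert(nr_held > 0);
        nr_held--;
}

/* Replay the two call paths from the trace; return the inversions found. */
int replay_nfs_paths(void)
{
        int inversions = 0;

        /* vfs_readdir path: i_mutex, then mmap_sem via the page fault. */
        inversions += toy_acquire(I_MUTEX);
        inversions += toy_acquire(MMAP_SEM);
        toy_release();
        toy_release();

        /* sys_mmap path: mmap_sem, then i_mutex in nfs_invalidate_mapping. */
        inversions += toy_acquire(MMAP_SEM);
        inversions += toy_acquire(I_MUTEX);
        toy_release();
        toy_release();

        return inversions;
}
```

Calling replay_nfs_paths() once from a main() reports the second path,
because mmap_sem -> i_mutex contradicts the i_mutex -> mmap_sem order
recorded by the first path. Note that no real deadlock has to occur for
the check to fire; seeing both orders is enough.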


Plus, ext4 has the following comment. It implies that the NFS mmap implementation is wrong...

--------------------------------------------------------------------------------------
int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        struct page *page = vmf->page;
        loff_t size;
        unsigned long len;
        int ret = -EINVAL;
        void *fsdata;
        struct file *file = vma->vm_file;
        struct inode *inode = file->f_path.dentry->d_inode;
        struct address_space *mapping = inode->i_mapping;

        /*
         * Get i_alloc_sem to stop truncates messing with the inode. We cannot
         * get i_mutex because we are already holding mmap_sem.
         */