Message-ID: <2f11576a0912070938s44172cb9mda6b49e997ac1d74@mail.gmail.com>
Date: Tue, 8 Dec 2009 02:38:39 +0900
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: linux-kernel@...r.kernel.org, Trond.Myklebust@...app.com,
linux-fsdevel@...r.kernel.org
Subject: Re: NFS lockdep lock misordering mmap_sem<->i_mutex_key with
2.6.32-git1
(cc to linux-fsdevel)
2009/12/7 Andi Kleen <andi@...stfloor.org>:
> On Mon, Dec 07, 2009 at 09:19:28PM +0900, KOSAKI Motohiro wrote:
>> >
>> > While booting 2.6.32-git1 on a NFS root box I got the following
>> > lockdep warning early at boot. I haven't looked at details.
>>
>> It seems typical ABBA deadlock.
>>
>> vfs_readdir [grab i_mutex]
>> nfs_readdir
>> nfs_do_filldir
>> filldir
>> copy_to_user
>> [page_fault] [grab mmap_sem]
>>
>> sys_mmap [grab mmap_sem]
>> do_mmap_pgoff
>> mmap_region
>> nfs_file_mmap
>> nfs_revalidate_mapping
>> nfs_invalidate_mapping [grab i_mutex]
>>
>> I guess recent lockdep improvement find old bug.
>
> Thanks for the analysis.
>
> I guess should never do copy_*_user while holding i_mutex? There might
> be lots of cases like that.
>
> -Andi
I'm not sure of the exact VFS rule, but at least mm/rmap.c documents
the correct order as i_mutex -> mmap_sem:
rmap.c
---------------------------------------------------------------------
 * Lock ordering in mm:
 *
 * inode->i_mutex	(while writing or truncating, not reading or faulting)
 *   inode->i_alloc_sem (vmtruncate_range)
 *   mm->mmap_sem
 *     page->flags PG_locked (lock_page)
 *       mapping->i_mmap_lock
 *         anon_vma->lock
 *           mm->page_table_lock or pte_lock
 *             zone->lru_lock (in mark_page_accessed, isolate_lru_page)
 *             swap_lock (in swap_duplicate, swap_info_get)
 *               mmlist_lock (in mmput, drain_mmlist and others)
 *               mapping->private_lock (in __set_page_dirty_buffers)
 *               inode_lock (in set_page_dirty's __mark_inode_dirty)
 *                 sb_lock (within inode_lock in fs/fs-writeback.c)
 *                 mapping->tree_lock (widely used, in set_page_dirty,
 *                           in arch-dependent flush_dcache_mmap_lock,
 *                           within inode_lock in __sync_single_inode)
---------------------------------------------------------------------
Plus, ext4 has the following comment. It implies that the NFS mmap implementation is wrong...
--------------------------------------------------------------------------------------
int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	struct page *page = vmf->page;
	loff_t size;
	unsigned long len;
	int ret = -EINVAL;
	void *fsdata;
	struct file *file = vma->vm_file;
	struct inode *inode = file->f_path.dentry->d_inode;
	struct address_space *mapping = inode->i_mapping;

	/*
	 * Get i_alloc_sem to stop truncates messing with the inode. We cannot
	 * get i_mutex because we are already holding mmap_sem.
	 */
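
To make the inversion concrete, here is a small user-space sketch of the check that lockdep effectively performed on the two call chains quoted above. It is only a model under stated assumptions: the enum, the `seen` table, and the `acquire`/`replay_*` helpers are hypothetical names invented for this illustration, not kernel or lockdep APIs, and it only detects a direct two-lock inversion, whereas real lockdep also follows longer dependency chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock classes for this sketch; not the kernel's lockdep classes. */
enum lock_class { I_MUTEX, MMAP_SEM, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = { "i_mutex", "mmap_sem" };

/* seen[a][b] becomes true once some path has taken b while holding a. */
static bool seen[NR_CLASSES][NR_CLASSES];

/* Classes held by the path currently being replayed. */
static enum lock_class held[NR_CLASSES];
static int nheld;

/*
 * Record an acquisition of class c. Returns true if taking c now inverts
 * an order that an earlier path already recorded (a direct ABBA pair).
 */
static bool acquire(enum lock_class c)
{
	bool inverted = false;

	assert(nheld < NR_CLASSES);
	for (int i = 0; i < nheld; i++) {
		if (seen[c][held[i]]) {
			printf("inversion: %s -> %s, but %s -> %s was seen before\n",
			       class_name[held[i]], class_name[c],
			       class_name[c], class_name[held[i]]);
			inverted = true;
		}
		seen[held[i]][c] = true;
	}
	held[nheld++] = c;
	return inverted;
}

/* Drop everything held by the current path. */
static void release_all(void)
{
	nheld = 0;
}

/* Models vfs_readdir -> ... -> filldir -> copy_to_user -> page fault. */
static bool replay_readdir_path(void)
{
	bool bad = acquire(I_MUTEX);	/* taken in vfs_readdir */

	bad |= acquire(MMAP_SEM);	/* taken by the page fault */
	release_all();
	return bad;
}

/* Models sys_mmap -> ... -> nfs_file_mmap -> nfs_invalidate_mapping. */
static bool replay_mmap_path(void)
{
	bool bad = acquire(MMAP_SEM);	/* taken in sys_mmap */

	bad |= acquire(I_MUTEX);	/* taken in nfs_invalidate_mapping */
	release_all();
	return bad;
}
```

Calling replay_readdir_path() and then replay_mmap_path() from a main() records i_mutex -> mmap_sem first, so the second call reports the inversion. Neither path has to actually deadlock for the problem to be visible, which is presumably why the warning shows up at boot on a box that otherwise runs fine.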