[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <15eddad0-e73b-2686-b5ba-eaacc57b8947@themaw.net>
Date: Thu, 27 Jul 2023 08:38:48 +0800
From: Ian Kent <raven@...maw.net>
To: Anders Roxell <anders.roxell@...aro.org>
Cc: Arnd Bergmann <arnd@...db.de>, Tejun Heo <tj@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Minchan Kim <minchan@...nel.org>,
Eric Sandeen <sandeen@...deen.net>,
Alexander Viro <viro@...iv.linux.org.uk>,
Rick Lindsley <ricklind@...ux.vnet.ibm.com>,
David Howells <dhowells@...hat.com>,
Miklos Szeredi <miklos@...redi.hu>,
Carlos Maiolino <cmaiolino@...hat.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Kernel Mailing List <linux-kernel@...r.kernel.org>,
elver@...gle.com, imran.f.khan@...cle.com
Subject: Re: [PATCH 1/2] kernfs: dont take i_lock on inode attr read
On 20/7/23 10:03, Ian Kent wrote:
> On Wed, 2023-07-19 at 12:23 +0800, Ian Kent wrote:
>> On 19/7/23 03:00, Anders Roxell wrote:
>>> On 2023-01-23 11:11, Ian Kent wrote:
>>>> On 29/12/22 21:07, Ian Kent wrote:
>>>>> On 29/12/22 17:20, Arnd Bergmann wrote:
>>>>>> On Fri, Dec 23, 2022, at 00:11, Ian Kent wrote:
>>>>>>> On 21/12/22 21:34, Anders Roxell wrote:
>>>>>>>> On 2022-10-31 12:30, Tejun Heo wrote:
>>>>>>>>> On Tue, Oct 18, 2022 at 10:32:42AM +0800, Ian Kent
>>>>>>>>> wrote:
>>>>>>>>>> The kernfs write lock is held when the kernfs node
>>>>>>>>>> inode attributes
>>>>>>>>>> are updated. Therefore, when either
>>>>>>>>>> kernfs_iop_getattr() or
>>>>>>>>>> kernfs_iop_permission() are called the kernfs node
>>>>>>>>>> inode attributes
>>>>>>>>>> won't change.
>>>>>>>>>>
>>>>>>>>>> Consequently concurrent kernfs_refresh_inode() calls
>>>>>>>>>> always copy the
>>>>>>>>>> same values from the kernfs node.
>>>>>>>>>>
>>>>>>>>>> So there's no need to take the inode i_lock to get
>>>>>>>>>> consistent values
>>>>>>>>>> for generic_fillattr() and generic_permission(), the
>>>>>>>>>> kernfs read lock
>>>>>>>>>> is sufficient.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ian Kent <raven@...maw.net>
>>>>>>>>> Acked-by: Tejun Heo <tj@...nel.org>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Building an allmodconfig arm64 kernel on yesterdays next-
>>>>>>>> 20221220 and
>>>>>>>> booting that in qemu I see the following "BUG: KCSAN:
>>>>>>>> data-race in
>>>>>>>> set_nlink / set_nlink".
>>>>>>> I'll check if I missed any places where set_link() could be
>>>>>>> called where the link count could be different.
>>>>>>>
>>>>>>>
>>>>>>> If there aren't any the question will then be can writing
>>>>>>> the
>>>>>>> same value to this location in multiple concurrent threads
>>>>>>> corrupt it?
>>>>>> I think the race that is getting reported for set_nlink()
>>>>>> is about this bit getting called simulatenously on multiple
>>>>>> CPUs with only the read lock held for the inode:
>>>>>>
>>>>>> /* Yes, some filesystems do change nlink from zero to
>>>>>> one */
>>>>>> if (inode->i_nlink == 0)
>>>>>> atomic_long_dec(&inode->i_sb->s_remove_count);
>>>>>> inode->__i_nlink = nlink;
>>>>>>
>>>>>> Since i_nlink and __i_nlink refer to the same memory
>>>>>> location,
>>>>>> the 'inode->i_nlink == 0' check can be true for all of them
>>>>>> before the nonzero nlink value gets set, and this results in
>>>>>> s_remove_count being decremented more than once.
>>>>> Thanks for the comment Arnd.
>>>> Hello all,
>>>>
>>>>
>>>> I've been looking at this and after consulting Miklos and his
>>>> pointing
>>>>
>>>> out that it looks like a false positive the urgency dropped off a
>>>> bit. So
>>>>
>>>> apologies for taking so long to report back.
>>>>
>>>>
>>>> Anyway it needs some description of conclusions reached so far.
>>>>
>>>>
>>>> I'm still looking around but in short, kernfs will set
>>>> directories to <# of
>>>>
>>>> directory entries> + 2 unconditionally for directories. I can't
>>>> yet find
>>>>
>>>> any other places where i_nlink is set or changed and if there are
>>>> none
>>>>
>>>> then i_nlink will never be set to zero so the race should not
>>>> occur.
>>>>
>>>>
>>>> Consequently my claim is this is a real false positive.
>>>>
>>>>
>>>> There are the file system operations that may be passed at mount
>>>> time
>>>>
>>>> but given the way kernfs sets i_nlink it pretty much dictates
>>>> those
>>>> operations
>>>>
>>>> (if there were any that modify it and there don't appear to be
>>>> any) leave it
>>>>
>>>> alone.
>>>>
>>>>
>>>> So it just doesn't make sense for users of kernfs to fiddle with
>>>> i_nlink ...
>>> On todays next tag, next-20230718 this KCSAN BUG poped up again.
>>> When I
>>> built an allmodconfig arm64 kernel and booted it in QEMU. Full log
>>> can
>>> be found http://ix.io/4AUd
>>>
>>> [ 1694.987789][ T137] BUG: KCSAN: data-race in inode_permission /
>>> kernfs_refresh_inode
>>> [ 1694.992912][ T137]
>>> [ 1694.994532][ T137] write to 0xffff00000bab6070 of 2 bytes by
>>> task 104 on cpu 0:
>>> [ 1694.999269][ T137] kernfs_refresh_inode
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:171)
>>> [ 1695.002707][ T137] kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:289)
>>> [ 1695.006148][ T137] inode_permission
>>> (/home/anders/src/kernel/next/fs/namei.c:461
>>> /home/anders/src/kernel/next/fs/namei.c:528)
>>> [ 1695.009420][ T137] link_path_walk
>>> (/home/anders/src/kernel/next/fs/namei.c:1720
>>> /home/anders/src/kernel/next/fs/namei.c:2267)
>>> [ 1695.012643][ T137] path_lookupat
>>> (/home/anders/src/kernel/next/fs/namei.c:2478 (discriminator 2))
>>> [ 1695.015781][ T137] filename_lookup
>>> (/home/anders/src/kernel/next/fs/namei.c:2508)
>>> [ 1695.019059][ T137] vfs_statx
>>> (/home/anders/src/kernel/next/fs/stat.c:238)
>>> [ 1695.022024][ T137] vfs_fstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:276)
>>> [ 1695.025067][ T137] __do_sys_newfstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:446)
>>> [ 1695.028497][ T137] __arm64_sys_newfstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:440
>>> /home/anders/src/kernel/next/fs/stat.c:440)
>>> [ 1695.032080][ T137] el0_svc_common.constprop.0
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:139)
>>> [ 1695.035916][ T137] do_el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:188)
>>> [ 1695.038796][ T137] el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:144
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:648)
>>> [ 1695.041468][ T137] el0t_64_sync_handler
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:666)
>>> [ 1695.044889][ T137] el0t_64_sync
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:591)
>>> [ 1695.047904][ T137]
>>> [ 1695.049511][ T137] 1 lock held by systemd-udevd/104:
>>> [ 1695.052837][ T137] #0: ffff000006681e08 (&root-
>>>> kernfs_iattr_rwsem){++++}-{3:3}, at: kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:288)
>>> [ 1695.060241][ T137] irq event stamp: 82902
>>> [ 1695.063006][ T137] hardirqs last enabled at (82901):
>>> _raw_spin_unlock_irqrestore
>>> (/home/anders/src/kernel/next/arch/arm64/include/asm/alternative-
>>> macros.h:250
>>> /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:27
>>> /home/anders/src/kernel/next/arch/arm64/include/asm/irqflags.h:140
>>> /home/anders/src/kernel/next/include/linux/spinlock_api_smp.h:151
>>> /home/anders/src/kernel/next/kernel/locking/spinlock.c:194)
>>> [ 1695.069673][ T137] hardirqs last disabled at (82902):
>>> el1_interrupt
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:472
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:488)
>>> [ 1695.075474][ T137] softirqs last enabled at (82792):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1791)
>>> [ 1695.082319][ T137] softirqs last disabled at (82790):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/include/linux/bottom_half.h:20
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1784)
>>> [ 1695.089049][ T137]
>>> [ 1695.090659][ T137] read to 0xffff00000bab6070 of 2 bytes by
>>> task 137 on cpu 0:
>>> [ 1695.095374][ T137] inode_permission
>>> (/home/anders/src/kernel/next/fs/namei.c:532)
>>> [ 1695.098655][ T137] link_path_walk
>>> (/home/anders/src/kernel/next/fs/namei.c:1720
>>> /home/anders/src/kernel/next/fs/namei.c:2267)
>>> [ 1695.101857][ T137] path_openat
>>> (/home/anders/src/kernel/next/fs/namei.c:3789 (discriminator 2))
>>> [ 1695.104885][ T137] do_filp_open
>>> (/home/anders/src/kernel/next/fs/namei.c:3820)
>>> [ 1695.108006][ T137] do_sys_openat2
>>> (/home/anders/src/kernel/next/fs/open.c:1418)
>>> [ 1695.111290][ T137] __arm64_sys_openat
>>> (/home/anders/src/kernel/next/fs/open.c:1433)
>>> [ 1695.114825][ T137] el0_svc_common.constprop.0
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:139)
>>> [ 1695.118662][ T137] do_el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:188)
>>> [ 1695.121555][ T137] el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:144
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:648)
>>> [ 1695.124207][ T137] el0t_64_sync_handler
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:666)
>>> [ 1695.127590][ T137] el0t_64_sync
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:591)
>>> [ 1695.130641][ T137]
>>> [ 1695.132241][ T137] no locks held by systemd-udevd/137.
>>> [ 1695.135618][ T137] irq event stamp: 3246
>>> [ 1695.138519][ T137] hardirqs last enabled at (3245):
>>> seqcount_lockdep_reader_access
>>> (/home/anders/src/kernel/next/include/linux/seqlock.h:105)
>>> [ 1695.145825][ T137] hardirqs last disabled at (3246):
>>> el1_interrupt
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:472
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:488)
>>> [ 1695.151942][ T137] softirqs last enabled at (3208):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1791)
>>> [ 1695.158950][ T137] softirqs last disabled at (3206):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/include/linux/bottom_half.h:20
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1784)
>>> [ 1695.166036][ T137]
>>> [ 1695.167621][ T137] Reported by Kernel Concurrency Sanitizer on:
>>> [ 1695.179990][ T137] Hardware name: linux,dummy-virt (DT)
>>> [ 1695.183687][ T137]
>>> ==================================================================
>>
>> This one is different to the original.
>>
>>
>> I can't see where the problem is here, can someone help me out
>>
>> please.
>>
>>
>> I don't see any shared data values used by the call
>>
>> devcgroup_inode_permission(inode, mask) in
>> fs/namei.c:inode_permission()
>>
>> that might be a problem during the read except possibly inode-
>>> i_mode.
>>
>> I'll check on that ... maybe something's been missed when
>> kernfs_rwsem
>>
>> was changed to a separate lock (kernfs_iattr_rwsem).
>>
>>
>>> [...]
>>>
>>> [ 1738.053819][ T104] BUG: KCSAN: data-race in set_nlink /
>>> set_nlink
>>> [ 1738.058223][ T104]
>>> [ 1738.059865][ T104] read to 0xffff00000bab6918 of 4 bytes by
>>> task 108 on cpu 0:
>>> [ 1738.064916][ T104] set_nlink
>>> (/home/anders/src/kernel/next/fs/inode.c:369)
>>> [ 1738.067845][ T104] kernfs_refresh_inode
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:180)
>>> [ 1738.071607][ T104] kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:289)
>>> [ 1738.075467][ T104] inode_permission
>>> (/home/anders/src/kernel/next/fs/namei.c:461
>>> /home/anders/src/kernel/next/fs/namei.c:528)
>>> [ 1738.078868][ T104] link_path_walk
>>> (/home/anders/src/kernel/next/fs/namei.c:1720
>>> /home/anders/src/kernel/next/fs/namei.c:2267)
>>> [ 1738.082270][ T104] path_lookupat
>>> (/home/anders/src/kernel/next/fs/namei.c:2478 (discriminator 2))
>>> [ 1738.085488][ T104] filename_lookup
>>> (/home/anders/src/kernel/next/fs/namei.c:2508)
>>> [ 1738.089101][ T104] user_path_at_empty
>>> (/home/anders/src/kernel/next/fs/namei.c:2907)
>>> [ 1738.092469][ T104] do_readlinkat
>>> (/home/anders/src/kernel/next/fs/stat.c:477)
>>> [ 1738.095970][ T104] __arm64_sys_readlinkat
>>> (/home/anders/src/kernel/next/fs/stat.c:504
>>> /home/anders/src/kernel/next/fs/stat.c:501
>>> /home/anders/src/kernel/next/fs/stat.c:501)
>>> [ 1738.099529][ T104] el0_svc_common.constprop.0
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:139)
>>> [ 1738.103696][ T104] do_el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:188)
>>> [ 1738.106560][ T104] el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:144
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:648)
>>> [ 1738.109613][ T104] el0t_64_sync_handler
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:666)
>>> [ 1738.113035][ T104] el0t_64_sync
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:591)
>>> [ 1738.116346][ T104]
>>> [ 1738.117924][ T104] 1 lock held by systemd-udevd/108:
>>> [ 1738.121580][ T104] #0: ffff000006681e08 (&root-
>>>> kernfs_iattr_rwsem){++++}-{3:3}, at: kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:288)
>>> [ 1738.129355][ T104] irq event stamp: 31000
>>> [ 1738.132088][ T104] hardirqs last enabled at (31000):
>>> seqcount_lockdep_reader_access
>>> (/home/anders/src/kernel/next/include/linux/seqlock.h:105)
>>> [ 1738.139417][ T104] hardirqs last disabled at (30999):
>>> seqcount_lockdep_reader_access
>>> (/home/anders/src/kernel/next/include/linux/seqlock.h:104)
>>> [ 1738.146781][ T104] softirqs last enabled at (30973):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1791)
>>> [ 1738.153891][ T104] softirqs last disabled at (30971):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/include/linux/bottom_half.h:20
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1784)
>>> [ 1738.161012][ T104]
>>> [ 1738.162663][ T104] write to 0xffff00000bab6918 of 4 bytes by
>>> task 104 on cpu 0:
>>> [ 1738.167730][ T104] set_nlink
>>> (/home/anders/src/kernel/next/fs/inode.c:372)
>>> [ 1738.170559][ T104] kernfs_refresh_inode
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:180)
>>> [ 1738.174355][ T104] kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:289)
>>> [ 1738.177829][ T104] inode_permission
>>> (/home/anders/src/kernel/next/fs/namei.c:461
>>> /home/anders/src/kernel/next/fs/namei.c:528)
>>> [ 1738.181403][ T104] link_path_walk
>>> (/home/anders/src/kernel/next/fs/namei.c:1720
>>> /home/anders/src/kernel/next/fs/namei.c:2267)
>>> [ 1738.184738][ T104] path_lookupat
>>> (/home/anders/src/kernel/next/fs/namei.c:2478 (discriminator 2))
>>> [ 1738.188268][ T104] filename_lookup
>>> (/home/anders/src/kernel/next/fs/namei.c:2508)
>>> [ 1738.191865][ T104] vfs_statx
>>> (/home/anders/src/kernel/next/fs/stat.c:238)
>>> [ 1738.196236][ T104] vfs_fstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:276)
>>> [ 1738.200120][ T104] __do_sys_newfstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:446)
>>> [ 1738.204095][ T104] __arm64_sys_newfstatat
>>> (/home/anders/src/kernel/next/fs/stat.c:440
>>> /home/anders/src/kernel/next/fs/stat.c:440)
>>> [ 1738.207676][ T104] el0_svc_common.constprop.0
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:38
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:52
>>> /home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:139)
>>> [ 1738.211820][ T104] do_el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/syscall.c:188)
>>> [ 1738.214815][ T104] el0_svc
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:133
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:144
>>> /home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:648)
>>> [ 1738.217709][ T104] el0t_64_sync_handler
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry-common.c:666)
>>> [ 1738.221239][ T104] el0t_64_sync
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/entry.S:591)
>>> [ 1738.224502][ T104]
>>> [ 1738.226090][ T104] 1 lock held by systemd-udevd/104:
>>> [ 1738.229747][ T104] #0: ffff000006681e08 (&root-
>>>> kernfs_iattr_rwsem){++++}-{3:3}, at: kernfs_iop_permission
>>> (/home/anders/src/kernel/next/fs/kernfs/inode.c:288)
>>> [ 1738.237504][ T104] irq event stamp: 108353
>>> [ 1738.240262][ T104] hardirqs last enabled at (108353):
>>> seqcount_lockdep_reader_access
>>> (/home/anders/src/kernel/next/include/linux/seqlock.h:105)
>>> [ 1738.247443][ T104] hardirqs last disabled at (108352):
>>> seqcount_lockdep_reader_access
>>> (/home/anders/src/kernel/next/include/linux/seqlock.h:104)
>>> [ 1738.254510][ T104] softirqs last enabled at (108326):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:264
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1791)
>>> [ 1738.262187][ T104] softirqs last disabled at (108324):
>>> fpsimd_restore_current_state
>>> (/home/anders/src/kernel/next/include/linux/bottom_half.h:20
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:242
>>> /home/anders/src/kernel/next/arch/arm64/kernel/fpsimd.c:1784)
>>> [ 1738.270239][ T104]
>>> [ 1738.272140][ T104] Reported by Kernel Concurrency Sanitizer on:
>>> [ 1738.285185][ T104] Hardware name: linux,dummy-virt (DT)
>>> [ 1738.288703][ T104]
>>> ==================================================================
>> This looks just like the original except a different lock is being
>>
>> used now.
>>
>>
>> The link count can't be less than two if set_nlink() is called.
>>
>>
>> Maybe I am missing something but the directory count is changed only
>>
>> while holding the root->kernfs_iattr_rwsem so the value used by
>>
>> set_nlink() will not change during concurrent calls to
>> refresh_inode().
>>
>> Still looks like a false positive, I'll check the write operations
>> again just to be sure.
> I do see a problem with recent changes.
>
> I'll send this off to Greg after I've done some testing (primarily just
> compile and function).
>
> Here's a patch which describes what I found.
>
> Comments are, of course, welcome, ;)
Anders I was hoping you would check if/what lockdep trace
you get with this patch.
Imran, I was hoping you would comment on my change as it
relates to the kernfs_iattr_rwsem changes.
Ian
>
> kernfs: fix missing kernfs_iattr_rwsem locking
>
> From: Ian Kent <raven@...maw.net>
>
> When the kernfs_iattr_rwsem was introduced a case was missed.
>
> The update of the kernfs directory node child count was also protected
> by the kernfs_rwsem and needs to be included in the change so that the
> child count (and so the inode n_link attribute) does not change while
> holding the rwsem for read.
>
> Fixes: 9caf696142 (kernfs: Introduce separate rwsem to protect inode
> attributes)
>
> Signed-off-by: Ian Kent <raven@...maw.net>
> Cc: Anders Roxell <anders.roxell@...aro.org>
> Cc: Imran Khan <imran.f.khan@...cle.com>
> Cc: Arnd Bergmann <arnd@...db.de>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: Eric Sandeen <sandeen@...deen.net>
> ---
> fs/kernfs/dir.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 45b6919903e6..6e84bb69602e 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -383,9 +383,11 @@ static int kernfs_link_sibling(struct kernfs_node
> *kn)
> rb_insert_color(&kn->rb, &kn->parent->dir.children);
>
> /* successfully added, account subdir number */
> + down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
> if (kernfs_type(kn) == KERNFS_DIR)
> kn->parent->dir.subdirs++;
> kernfs_inc_rev(kn->parent);
> + up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>
> return 0;
> }
> @@ -408,9 +410,11 @@ static bool kernfs_unlink_sibling(struct
> kernfs_node *kn)
> if (RB_EMPTY_NODE(&kn->rb))
> return false;
>
> + down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
> if (kernfs_type(kn) == KERNFS_DIR)
> kn->parent->dir.subdirs--;
> kernfs_inc_rev(kn->parent);
> + up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>
> rb_erase(&kn->rb, &kn->parent->dir.children);
> RB_CLEAR_NODE(&kn->rb);
>
Powered by blists - more mailing lists